Dashboard Tools
Assumptions
This guide assumes that you are familiar and comfortable with the following:
- the command line interface, including pipes, file redirection, and standard utilities like sed and grep
- Git
- markdown authoring
- the basics of how HTML, CSS, and JavaScript interact to create a web page
On this page, you will find definitions of tools that are used to build the hub dashboard. They are organized by tools used generally across the dashboard, tools for specific dashboard elements, and finally tools for operations. This page is not meant to be a comprehensive description of the dashboard operation. Details for each of the dashboard elements and the overall workflow are provided in subsequent pages.
General tools
We use a wide range of tools to build the dashboards. Ultimately, the only tools you need to build a dashboard locally are Git, Python, and docker. Everything else is encapsulated within docker images.
Definitions of general tools
- Python
Python is the backbone of hub-dashboard-predtimechart and is used in the control room to get a list of repositories that have the app installed.
- docker
We use docker to containerize the tools needed to build the website and the data for the evaluations.
- BASH
We use BASH to orchestrate building the dashboard website and within the GitHub workflows. In particular, we make use of conditional expressions and parameter expansion (e.g. ${HUB%/} takes the variable $HUB and removes any trailing slash).
- R
R is the backbone of hubPredEvalsData, which generates the evaluations dashboard.
- JavaScript
Both of the visualizations are built as JavaScript modules. In turn, these modules are called from scripts loaded in a page of the dashboard, each of which replaces a specific <div> element of the webpage with the content. This paradigm is described in “How state management works? Dead simple SM in vanilla JavaScript”.
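The parameter expansion and conditional expressions mentioned in the BASH entry above can be sketched as follows. This is a minimal illustration, not code from the dashboard itself: the helper names `strip_trailing_slash` and `repo_name` and the example URL are hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical helper: drop a single trailing slash, if there is one.
strip_trailing_slash() {
  # ${1%/} removes the shortest match of "/" from the END of the value.
  printf '%s\n' "${1%/}"
}

# Hypothetical helper: return the last path component of a hub path or URL.
repo_name() {
  local hub
  hub="${1%/}"
  # ${hub##*/} removes the longest match of "*/" from the START of the value.
  printf '%s\n' "${hub##*/}"
}

HUB="https://github.com/example-org/example-hub/"  # illustrative value only

# Conditional expression: true when stripping the slash changed the value.
if [ "${HUB%/}" != "$HUB" ]; then
  echo "trailing slash removed: ${HUB%/}"
fi

repo_name "$HUB"
```

Both expansions leave `$HUB` itself unchanged; they only affect the value that is substituted at that point in the script.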
Website
The website is orchestrated with the docker image hub-dash-site-builder. The image bundles BASH and yq to join the user data with the template, and quarto to render the markdown to HTML.
Definitions of website tools
- quarto
This is the website engine. It is responsible for converting markdown to HTML and applying styling.
- yq
YAML Query. This tool is similar to the command line JSON processor, jq, except it works with YAML. It is responsible for joining the dashboard’s site-config.yml to static/_quarto.yml. The rationale behind this is that a user does not have to learn how to use quarto in order to generate a site.
Forecast visualization
The forecasts visualization is built with PredTimeChart, a JavaScript module that displays forecast visualizations.
The data for PredTimeChart is converted from hub format with hub-dashboard-predtimechart, a command-line Python app, which uses polars to read and convert the data to JSON format.
Definitions of forecast tools
- polars
A data manipulation library that provides lazy data frame utilities in Python. It gives our tool the ability to read and slice hub data. Eventually, this will be superseded by the hub-data package.
Evaluations visualization
The evaluations visualization is built with PredEvals, a JavaScript module that displays the evaluation visualization. Its code is heavily based on predtimechart.
The evaluations visualization data are built with the hubPredEvalsData-docker docker image. This bundles the R package hubPredEvalsData, which uses hubEvals and scoringutils to evaluate model performance if the hub has oracle output available.
Why use Docker?
We use docker to build this image because, while we use R heavily in this lab, maintaining workflows that use R can be challenging. A docker image is a portable way to take care of that.
Definitions of evaluations tools
- hubPredEvalsData
Generates nested folders of scores disaggregated by task ID.
- hubEvals
Does the heavy lifting to score model output against oracle output.
- scoringutils
Used to determine which metrics are available for a given output type. It also provides the backend for hubEvals, so it’s a “free” dependency.
Operations
This is platform-specific
Note that the operations part of the dashboard is explicitly designed to work with GitHub. If GitHub were to disappear or become unusable, the above tools would still work for local operations.
The takeaway: operations implement a workflow on a specific platform. If you understand the overall workflow (next chapter), then you can configure operations on any platform.
The operation of all the tools above is performed with GitHub Actions through a series of workflows in the hub-dashboard-control-room repository.
The entirety of the operations happens on the ubuntu-latest runner of GitHub Actions. This is a virtual machine that has a bunch of useful software installed (you can find a list of the available tools for the Ubuntu Runner in the actions/runner-images repository). For our purposes, it has docker, BASH, yq, jq, gh, and curl pre-installed.
Important references
Broadly important references (i.e. you will be reaching back to these often):
- Choosing how to run a workflow. For example, we use a few scenarios: schedule, push, workflow_dispatch (human triggered), and workflow_call (workflows triggered from a separate repository).
- Workflow commands (especially “setting an output parameter”), which allow you to use values between job steps.
- Evaluating expressions. You will see things like ${{ fromJSON(steps.status.outputs.forecast-ok) || fromJSON(steps.status.outputs.eval-ok) }}. This guide tells you what they mean and how to evaluate them.
- Reusing workflows. In order to get the workflows in the hands of hub administrators without requiring them to understand and maintain complex bits of machinery, we create reusable workflows that can be run as individual jobs.
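As a minimal sketch of “setting an output parameter”, a step's shell script can append `name=value` lines to the file named by the `GITHUB_OUTPUT` environment variable, which GitHub Actions provides to each step. The output names below mirror the expression quoted above; the values, the `set_output` helper, and the temporary-file fallback are assumptions made so that the sketch also runs outside GitHub Actions.

```shell
#!/usr/bin/env bash

# GitHub Actions sets GITHUB_OUTPUT for each step. The fallback to a
# temporary file is an assumption so the sketch can be tried locally.
: "${GITHUB_OUTPUT:=$(mktemp)}"

# Hypothetical helper: record one output parameter for the current step.
set_output() {
  # Each output is a single name=value line appended to the file.
  printf '%s=%s\n' "$1" "$2" >> "$GITHUB_OUTPUT"
}

# Illustrative values only; a real step would compute these.
set_output "forecast-ok" "true"
set_output "eval-ok" "false"

# Show what later steps would be able to read.
cat "$GITHUB_OUTPUT"
```

If the step that runs this script has the id `status`, later steps can read the values as `steps.status.outputs.forecast-ok` and `steps.status.outputs.eval-ok`. Step outputs are strings, which is presumably why the expression above passes each one through `fromJSON` before combining them with `||`.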
Broad table of tasks and tools used for each
| task/component | tools | GitHub workflow documentation |
|---|---|---|
| Fetching dashboard repository | Git | |
| Determining what to build | Setting an output parameter, Passing information between jobs | |
| Checking for resources | | |
| Fetching hub repository | Git | actions/checkout (linked above) |
| docker | Running jobs in a container (linked above) | |
| Passing data between jobs | actions/upload-artifact | |
| Pushing content to GitHub | actions/download-artifact, BASH, Git | |
Definition of operations tools
- gh
GitHub’s CLI is useful for performing operations that involve the GitHub API as well as operations like creating pull requests. This comes pre-installed on all GitHub Actions runners.
- curl
Command line tool for interacting with URLs. This comes standard with macOS and on GitHub Actions runners. It is particularly useful to check if a URL is valid (e.g. for checking if a resource is available):
curl -o /dev/null --silent -Iw '%{http_code}' "https://hubverse.io" # 200
- jq
Command line tool for manipulating JSON. This is very useful for parsing responses from the GitHub API. It’s also really useful for parsing the tasks.json file of a hub. For example, here’s a quick way to get all of the known output types across rounds and model tasks:
jq '[.rounds[].model_tasks[].output_type | keys] | flatten' < hub-config/tasks.json
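The curl status check shown in the curl entry above can be wrapped in a small function so that a script can branch on the result. This is a sketch under stated assumptions: the function name `resource_available` is hypothetical, and treating only HTTP 200 as “available” is a choice made for this example (redirects are not followed, so a 301 would count as unavailable).

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeed only if the URL answers a HEAD request with 200.
resource_available() {
  local code
  # -I sends a HEAD request, -w prints the HTTP status code after the
  # transfer, and -o /dev/null --silent discards everything else.
  code="$(curl -o /dev/null --silent -Iw '%{http_code}' "$1")"
  [ "$code" = "200" ]
}

# Usage (requires network access):
#   if resource_available "https://hubverse.io"; then
#     echo "resource is available"
#   fi
```

Returning an exit status rather than printing the code lets the function be used directly in `if` statements and with `&&` or `||`.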