Dashboard Tools

Assumptions

This guide assumes that you are familiar and comfortable with the following:

  • the command line interface including pipes, file redirection, and standard utilities like sed and grep

  • Git

  • the concept of a modelling hub

  • YAML syntax

  • markdown authoring

  • the basics of how HTML, CSS, and JavaScript interact to create a web page.

On this page, you will find definitions of tools that are used to build the hub dashboard. They are organized by tools used generally across the dashboard, tools for specific dashboard elements, and finally tools for operations. This page is not meant to be a comprehensive description of the dashboard operation. Details for each of the dashboard elements and the overall workflow are provided in subsequent pages.

General tools

We use a wide range of tools to build the dashboards. Ultimately, the only tools you will need to build a dashboard locally are Git, Python, and docker. Everything else is encapsulated within docker images.

Definitions of general tools

Python

Python is the backbone of hub-dashboard-predtimechart and is used in the control room to get a list of repositories that have the app installed.

docker

We use docker to containerize the tools needed to build the website and the data for the evaluations.

BASH

We use BASH to orchestrate building of the dashboard website and within the GitHub workflows. In particular, we make use of conditional expressions and parameter expansion (e.g. ${HUB%/} takes the variable $HUB and removes any trailing slash).
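As a minimal sketch of these two features, the snippet below strips a trailing slash with parameter expansion and then branches on a conditional expression. The value of HUB and the hub_name variable are made up for illustration and are not taken from the dashboard scripts.

```bash
# Hypothetical value for illustration; not taken from the dashboard scripts.
HUB="hubverse-org/example-hub/"

# ${HUB%/} removes one trailing slash, if present.
HUB="${HUB%/}"

# ${HUB##*/} removes the longest prefix that ends in a slash,
# which leaves only the repository name.
hub_name="${HUB##*/}"

# [ ... ] evaluates a conditional expression; -n is true for a non-empty string.
if [ -n "$hub_name" ]; then
  echo "hub: $HUB (repository name: $hub_name)"
else
  echo "could not determine a repository name from: $HUB" >&2
fi
```

With the value above, HUB becomes hubverse-org/example-hub and hub_name becomes example-hub.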

R

R is the backbone of hubPredEvalsData, which generates the data for the evaluations dashboard.

JavaScript

Both visualizations are built as JavaScript modules. These modules are called from scripts loaded in a page of the dashboard, and each script replaces a specific <div> element of the webpage with the visualization content. This paradigm is described in “How state management works? Dead simple SM in vanilla JavaScript”.

Website

The website is orchestrated with the docker image hub-dash-site-builder. The image bundles BASH and yq to join the user data with the template, and quarto to render the markdown to HTML.

Definitions of website tools

quarto

This is the website engine. It is responsible for converting markdown to HTML and applying styling.

yq

YAML Query. This tool is similar to the command line JSON processor, jq, except it works with YAML. It is responsible for joining the dashboard’s site-config.yml to static/_quarto.yml. The rationale behind this is that a user does not have to learn how to use quarto in order to generate a site.
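To give a feel for what joining two YAML files looks like, here is a hedged sketch of a generic deep merge. It assumes the Go implementation of yq (mikefarah/yq, version 4) and assumes that both files share the same key layout, which may not hold for the real site-config.yml. The merge_yaml function is hypothetical and is not the script that hub-dash-site-builder runs.

```bash
# Sketch only: a generic deep merge with mikefarah/yq v4,
# not the site builder's actual script.
# merge_yaml BASE OVERRIDES prints BASE with the values from OVERRIDES
# merged on top of it.
merge_yaml() {
  yq eval-all 'select(fileIndex == 0) * select(fileIndex == 1)' "$1" "$2"
}

# Example call using the file names from the text above
# (the output file name is an assumption):
# merge_yaml static/_quarto.yml site-config.yml > _quarto.yml
```

In this sketch, keys that appear in both files take their value from the second file, and keys that appear only in the first file are kept.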

Forecast visualization

The forecast visualization is built with PredTimeChart, a JavaScript module that displays forecast visualizations.

The data for PredTimeChart is converted from hub format with hub-dashboard-predtimechart, a command-line Python app that uses polars to read the data and convert it to JSON format.

Definitions of forecast tools

polars

A data manipulation library that provides lazy data frame utilities in Python. It gives our tool the ability to read and slice hub data. Eventually, this will be superseded by the hub-data package.

Evaluations visualization

The evaluations visualization is built with PredEvals, a JavaScript module that displays the evaluation visualization. Its code is heavily based on predtimechart.

The evaluations visualization data are built with the hubPredEvalsData-docker docker image. This bundles the R package hubPredEvalsData, which uses hubEvals and scoringutils to evaluate model performance if the hub has oracle output available.

Why use Docker?

We use docker to build this image because, while we use R heavily in this lab, maintaining workflows that use R can be challenging. A docker image is a portable way to take care of that.

Definitions of evaluations tools

hubPredEvalsData

Generates nested folders of scores disaggregated by task ID.

hubEvals

Does the heavy lifting to score model output against oracle output.

scoringutils

Used for determining which metrics are available for a given output type. It also provides the backend for hubEvals, so it is a “free” dependency.

Operations

This is platform-specific

Note that the operations part of the dashboard is explicitly designed to work with GitHub. If GitHub were to disappear or become unusable, the tools above would still work for local operations.

The takeaway: operations implement a workflow on a specific platform. If you understand the overall workflow (next chapter), then you can configure operations on any platform.

The operation of all the tools above is performed with GitHub Actions through a series of workflows in the hub-dashboard-control-room repository.

The entirety of the operations happens on the ubuntu-latest runner of GitHub Actions. This is a virtual machine that has a bunch of useful software installed (you can find a list of the available tools for the Ubuntu Runner in the actions/runner-images repository). For our purposes, it has docker, BASH, yq, jq, gh, and curl pre-installed.
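When reproducing the operations outside of the runner, it can help to confirm first that the same tools are available. The following is a minimal sketch under that assumption; the missing_tools function and the required_tools variable are illustrative names and are not part of the control room.

```bash
# Tools the text above lists as pre-installed on the runner.
required_tools="docker bash yq jq gh curl"

# missing_tools prints the name of every tool in its arguments
# that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Word splitting of $required_tools is intentional here.
missing=$(missing_tools $required_tools)
if [ -n "$missing" ]; then
  echo "missing tools:" $missing >&2
else
  echo "all required tools found"
fi
```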

Important references

Broadly important references (e.g. you will be reaching back to these often):

  • Choosing how to run a workflow. For example, we use a few scenarios: schedule, push, workflow_dispatch (human triggered) and workflow_call (workflows triggered from a separate repository).

  • Workflow syntax

  • Workflow commands (especially “setting an output parameter”), which allow you to pass values between job steps

  • Evaluating expressions. You will see things like ${{ fromJSON(steps.status.outputs.forecast-ok) || fromJSON(steps.status.outputs.eval-ok) }}. This guide tells you what they mean and how to evaluate them.

  • Reusing workflows. In order to get the workflows in the hands of hub administrators without requiring them to understand and maintain complex bits of machinery, we create reusable workflows that can be run as individual jobs.
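As a rough illustration of “setting an output parameter”, the sketch below writes two outputs from a shell step. The output names forecast-ok and eval-ok come from the expression quoted above; the values and the local fallback for GITHUB_OUTPUT are assumptions made so the sketch can run outside of GitHub Actions.

```bash
# On GitHub Actions, GITHUB_OUTPUT points to a file provided by the runner.
# The fallback to a temporary file is an assumption that only exists so
# this sketch can also run locally.
GITHUB_OUTPUT="${GITHUB_OUTPUT:-$(mktemp)}"

# Hypothetical results; a real step would compute these values.
forecast_ok=true
eval_ok=false

# Each name=value line appended to the file becomes an output of the step.
{
  echo "forecast-ok=$forecast_ok"
  echo "eval-ok=$eval_ok"
} >> "$GITHUB_OUTPUT"
```

If this ran in a step with the id status, later steps in the same job could read steps.status.outputs.forecast-ok. Step outputs are strings, which is why the expression above wraps each one in fromJSON() to turn "true" or "false" into a boolean before combining them with ||.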

Broad table of tasks and tools used for each

| Task/component | Tools | GitHub workflow documentation |
|---|---|---|
| Fetching dashboard repository | Git | actions/checkout |
| Determining what to build | BASH, yq | Setting an output parameter, Passing information between jobs |
| Checking for resources | BASH, gh, curl, jq | branches API endpoint |
| Fetching hub repository | Git | actions/checkout (linked above) |
| Website | docker | Running jobs in a container |
| Evaluation visualization | docker | Running jobs in a container (linked above) |
| Forecast visualization | python | actions/setup-python |
| Passing data between jobs | actions/upload-artifact | actions/upload-artifact |
| Pushing content to GitHub | actions/download-artifact, BASH, Git | actions/download-artifact |

Definition of operations tools

gh

GitHub’s command line interface (CLI) is useful for performing operations that involve the GitHub API, as well as operations like creating pull requests. It comes pre-installed on all GitHub Actions runners.

curl

Command line tool for interacting with URLs. It comes standard on macOS and on GitHub Actions runners. It is particularly useful for checking whether a URL is valid (e.g. for checking if a resource is available):

curl -o /dev/null --silent -Iw '%{http_code}' "https://hubverse.io"
# 200

jq

Command line tool that allows you to manipulate JSON from the command line. This is very useful for parsing queries to the GitHub API. It’s also really useful for parsing the tasks.json file of a hub. For example, here’s a quick way to get all of the known output types across rounds and model tasks:

jq '[.rounds[].model_tasks[].output_type | keys] | flatten' < hub-config/tasks.json
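To see what the query above returns, here is a small sketch that runs a variant of it against a made-up, heavily trimmed tasks.json. The sample file and the output_types function are illustrative assumptions. The variant adds unique, because flatten on its own keeps an output type once for every model task that uses it.

```bash
# Made-up, heavily trimmed tasks.json; a real hub config has many more fields.
tasks_json=$(mktemp)
cat > "$tasks_json" <<'EOF'
{
  "rounds": [
    {"model_tasks": [
      {"output_type": {"quantile": {}, "mean": {}}},
      {"output_type": {"pmf": {}}}
    ]},
    {"model_tasks": [
      {"output_type": {"quantile": {}}}
    ]}
  ]
}
EOF

# Collect the output type names of every model task, then drop duplicates.
output_types() {
  jq -c '[.rounds[].model_tasks[].output_type | keys] | flatten | unique' < "$1"
}

# Only run the example when jq is installed.
if command -v jq >/dev/null 2>&1; then
  output_types "$tasks_json"
fi
```

For the sample file, this prints ["mean","pmf","quantile"].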