Target (Observed) Data#
Definitions#
Target data are the observed data being modeled as the prediction target in a collaborative modeling exercise. These data come in two forms:
- time series data, which are the observed counts or rates partitioned for each unique combination of task ID values. (Time series data is sometimes referred to as “ground truth” data, but we no longer use this term in the hubverse.)
- oracle output data, which are derived from the time series data and represent model output that would have been generated if the target data values had been known ahead of time.
Hubverse tools like hubVis make use of the time series data for visualizations while other hubverse tools like hubEvals and hubEnsembles make use of the oracle output data for model evaluations. We describe these formats briefly here, and give more detail about the oracle outputs in the remainder of this document.
Uses of target time series data and oracle output#
Each data format is useful for different purposes (see table below). Modelers will most often estimate model parameters by fitting to the raw data in time series format. Both data formats may be useful for different kinds of data visualizations; for example, a plot of time series predictions in quantile format may use the raw time series data, while a plot of pmf predictions for a categorical target may use the oracle output. The primary use case of oracle output is for evaluation.
| Data Format | Model Estimation | Plotting | Evaluation |
|---|---|---|---|
| Time series | ✅ | ✅ | |
| Oracle output | | ✅ | ✅ |
Common uses for target time series and oracle output data. A ✅ indicates which data formats are most commonly used for each purpose.
File formats#
Both the time series and oracle output data are found in the `target-data/` directory of a hub with the following conventions:
1. time series data MUST be named `time-series`
2. oracle output data MUST be named `oracle-output`
3. files MUST be either `*.csv` or `*.parquet`
4. CSV files MUST be a single continuous file named either `time-series.csv` or `oracle-output.csv`
5. parquet files MAY be partitioned (see partitioning target data for details)
For example, this represents a valid time series data set because (1) it is named “time-series”, (3) its files end with `.parquet`, and (5) it is partitioned.
target-data/
└── time-series/
    ├── as_of=2023-06-03
    │   └── part-0.parquet
    ├── as_of=2023-06-10
    │   └── part-0.parquet
    └── as_of=2023-06-17
        └── part-0.parquet
However, if the files above were “csv” files, this would violate (4). For a CSV time series target file, this is valid:
target-data/
└── time-series.csv
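As a rough illustration, the naming conventions above can be written as a small path check. This is a minimal Python sketch under stated assumptions, not a hubverse tool: the function name `check_target_path` is hypothetical, and it inspects only the path string, never the file contents.

```python
from pathlib import PurePosixPath

# Names allowed for target data sets (conventions 1 and 2 above).
ALLOWED_NAMES = {"time-series", "oracle-output"}


def check_target_path(path: str) -> bool:
    """Return True if a file path (relative to the hub root) follows the conventions.

    Hypothetical helper for illustration only.
    """
    parts = PurePosixPath(path).parts
    if len(parts) < 2 or parts[0] != "target-data":
        return False
    leaf = PurePosixPath(parts[-1])
    if len(parts) == 2:
        # Single file, e.g. target-data/time-series.csv
        return leaf.stem in ALLOWED_NAMES and leaf.suffix in {".csv", ".parquet"}
    # Partitioned data set: only parquet files may live under
    # target-data/time-series/ or target-data/oracle-output/.
    return parts[1] in ALLOWED_NAMES and leaf.suffix == ".parquet"
```

With this sketch, `target-data/time-series/as_of=2023-06-03/part-0.parquet` and `target-data/time-series.csv` pass, while a partitioned CSV file does not.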
Time series#
The first format is time series data. This is often the native or “raw” format for data. Each row of the data set is a unit of observation and the columns consist of:
- task ID variables that uniquely define the unit of observation. These must include at least one column representing the date.
- an `observation` column that records the observed value
Here is an example of this form of data, showing selected dates for Massachusetts (FIPS code 25), drawn from the forecasting example in hubExamples:
| date | location | observation |
|---|---|---|
| 2022-11-19 | 25 | 79 |
| 2022-11-26 | 25 | 221 |
| 2022-12-03 | 25 | 446 |
| 2022-12-10 | 25 | 578 |
Here, the unit of observation is a date and location pair. That is, for each date and location there is a single observed value.
In settings where a hub is working with multiple observed targets at each time point (e.g., cases, hospitalizations, and deaths), the values of those targets will be part of the unit of observation, with a column such as `target` indicating what quantity is reported in each row.
| date | target | location | observation |
|---|---|---|---|
| 2022-11-19 | cases | 25 | 79 |
| 2022-11-26 | cases | 25 | 221 |
| 2022-12-03 | cases | 25 | 446 |
| 2022-12-10 | cases | 25 | 578 |
| 2022-11-19 | deaths | 25 | 9 |
| 2022-11-26 | deaths | 25 | 21 |
| 2022-12-03 | deaths | 25 | 46 |
| 2022-12-10 | deaths | 25 | 78 |
Optional `as_of` column to record data versions#
Time series data are expected to be compiled from an authoritative upstream data source after each target date. Because of reporting delays, the value first reported for a given date may be revised in one or more subsequent versions of the data. In the two versions shown here, the observation for 2022-12-03 is first reported as 420 and later updated to 446.
| as_of | date | location | observation |
|---|---|---|---|
| 2022-12-03 | 2022-11-19 | 25 | 79 |
| 2022-12-03 | 2022-11-26 | 25 | 221 |
| 2022-12-03 | 2022-12-03 | 25 | 420 |

| as_of | date | location | observation |
|---|---|---|---|
| 2022-12-10 | 2022-11-19 | 25 | 79 |
| 2022-12-10 | 2022-11-26 | 25 | 221 |
| 2022-12-10 | 2022-12-03 | 25 | 446 |
| 2022-12-10 | 2022-12-10 | 25 | 578 |
If the source data have this pattern of being subsequently updated, the hubverse recommends recording the date target data were reported in a column called `as_of`. This will then accurately represent what data were available at a given point in time, and will allow tools like our dashboards to automatically extract the data that were available for any given model round.
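To make the role of a version column concrete, here is a minimal Python sketch of how the data available at a given point in time could be extracted from versioned time series rows. The function name `snapshot_as_of` and the list-of-dicts representation are assumptions made for illustration; this is not how any hubverse tool is implemented.

```python
def snapshot_as_of(rows, cutoff):
    """Return the most recent version of each observation reported on or before `cutoff`.

    Dates are ISO-formatted strings, so plain string comparison orders them correctly.
    """
    latest = {}
    for row in rows:
        if row["as_of"] > cutoff:
            continue  # this version had not been reported yet at the cutoff
        key = (row["location"], row["date"])
        if key not in latest or row["as_of"] > latest[key]["as_of"]:
            latest[key] = row
    return [latest[key] for key in sorted(latest)]


# The two versions of the Massachusetts data shown above.
versions = [
    ("2022-12-03", "2022-11-19", "25", 79),
    ("2022-12-03", "2022-11-26", "25", 221),
    ("2022-12-03", "2022-12-03", "25", 420),
    ("2022-12-10", "2022-11-19", "25", 79),
    ("2022-12-10", "2022-11-26", "25", 221),
    ("2022-12-10", "2022-12-03", "25", 446),
    ("2022-12-10", "2022-12-10", "25", 578),
]
columns = ("as_of", "date", "location", "observation")
rows = [dict(zip(columns, v)) for v in versions]

early = snapshot_as_of(rows, "2022-12-03")  # what a modeler could see on 2022-12-03
late = snapshot_as_of(rows, "2022-12-10")   # includes the revised 2022-12-03 value
```

Here `early` contains the initially reported value of 420 for 2022-12-03, while `late` contains the revised value of 446.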
Additional columns#
Hubverse tools will only validate columns that make up the unit of observation that match model task IDs. You may also include additional columns that have a 1:1 correspondence with the data, for example a transformation of counts to rates or a human-readable translation of codes.
Oracle output#
Oracle output follows a format that is similar to a hubverse model output file, with three main differences:
1. Predictions correspond to a distribution that places probability 1 on the observed target outcome (see figure below).
2. Predictions (e.g., means, quantile values, or pmf category probabilities, etc.) are stored in a column named `oracle_value` rather than `value`.
3. Generally, the columns of the oracle output will be a subset of the columns of valid model output for the hub, with just those columns that are needed to correctly align `oracle_value`s with the corresponding predicted `value`s produced by modelers. We introduce some conventions to avoid duplication of data, described in more detail below.

Model and Oracle distributions#
Just like model outputs are derived from a model distribution, oracle output values are derived from distributions with a probability of 1 on the observed target.
The oracle output is designed to align with model output task ID and model output representation columns. This allows the two to be merged so that `value` can be compared and evaluated against the corresponding `oracle_value`. The important difference between the outputs is that the oracle output will necessarily contain only a subset of the task ID columns found in the model output data and, depending on the hub, may not have either of the model output representation columns.
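The merge described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `join_with_oracle` is hypothetical, rows are represented as plain dicts, and the oracle output is assumed to carry only task ID columns plus `oracle_value` (that is, a hub without pmf or cdf outputs, where the model output representation columns are omitted).

```python
def join_with_oracle(model_rows, oracle_rows):
    """Attach `oracle_value` to each model output row via the columns both share.

    Assumes every oracle row has the same columns and that each combination
    of those columns appears at most once in the oracle output.
    """
    if not oracle_rows:
        return []
    # Every oracle column except oracle_value is used as a join key.
    join_cols = [c for c in oracle_rows[0] if c != "oracle_value"]

    def key(row):
        return tuple(row[c] for c in join_cols)

    lookup = {key(r): r["oracle_value"] for r in oracle_rows}
    # Model rows with no matching observation (e.g. future dates) are dropped.
    return [
        {**m, "oracle_value": lookup[key(m)]}
        for m in model_rows
        if key(m) in lookup
    ]


model_rows = [
    {"horizon": 0, "location": "25", "target_end_date": "2022-11-19",
     "target": "wk inc flu hosp", "output_type": "quantile",
     "output_type_id": 0.5, "value": 51},
    {"horizon": 1, "location": "25", "target_end_date": "2022-11-26",
     "target": "wk inc flu hosp", "output_type": "quantile",
     "output_type_id": 0.5, "value": 51},
]
oracle_rows = [
    {"location": "25", "target_end_date": "2022-11-19",
     "target": "wk inc flu hosp", "oracle_value": 79},
    {"location": "25", "target_end_date": "2022-11-26",
     "target": "wk inc flu hosp", "oracle_value": 221},
]
merged = join_with_oracle(model_rows, oracle_rows)
```

Because `horizon` is absent from the oracle rows, it plays no part in the join; each merged row keeps its model columns and gains the matching `oracle_value`.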
Example#
Here is an example of this form of data, based on the forecasting example in hubExamples:
| location | target_end_date | target | output_type | output_type_id | oracle_value |
|---|---|---|---|---|---|
| 25 | 2022-11-19 | wk inc flu hosp | quantile | | 79 |
| 25 | 2022-11-26 | wk inc flu hosp | quantile | | 221 |
| 25 | 2022-12-03 | wk inc flu hosp | quantile | | 446 |
| 25 | 2022-12-10 | wk inc flu hosp | quantile | | 578 |
In this example, the observed weekly influenza hospitalization count in Massachusetts for the week ending 2022-11-19 was 79. A probability distribution that places probability 1 on that outcome will have all quantiles equal to that observed value, so 79 appears as the `oracle_value` for quantile outputs for that `location` and `target_end_date`. The use of `<NA>` for the `output_type_id` represents the fact that this `oracle_value` is relevant for all quantile levels; this convention will be described in more detail below.
For comparison, here is the corresponding model output showing two horizons from the `Flusight-baseline` model for the 2022-11-19 reference date (the columns `model_id` and `reference_date` are omitted for compactness):
| horizon | location | target_end_date | target | output_type | output_type_id | value |
|---|---|---|---|---|---|---|
| 0 | 25 | 2022-11-19 | wk inc flu hosp | quantile | 0.05 | 22 |
| 0 | 25 | 2022-11-19 | wk inc flu hosp | quantile | 0.1 | 31 |
| 0 | 25 | 2022-11-19 | wk inc flu hosp | quantile | 0.25 | 45 |
| 0 | 25 | 2022-11-19 | wk inc flu hosp | quantile | 0.5 | 51 |
| 0 | 25 | 2022-11-19 | wk inc flu hosp | quantile | 0.75 | 57 |
| 0 | 25 | 2022-11-19 | wk inc flu hosp | quantile | 0.9 | 71 |
| 0 | 25 | 2022-11-19 | wk inc flu hosp | quantile | 0.95 | 80 |
| 1 | 25 | 2022-11-26 | wk inc flu hosp | quantile | 0.05 | 5 |
| 1 | 25 | 2022-11-26 | wk inc flu hosp | quantile | 0.1 | 21 |
| 1 | 25 | 2022-11-26 | wk inc flu hosp | quantile | 0.25 | 38 |
| 1 | 25 | 2022-11-26 | wk inc flu hosp | quantile | 0.5 | 51 |
| 1 | 25 | 2022-11-26 | wk inc flu hosp | quantile | 0.75 | 64 |
| 1 | 25 | 2022-11-26 | wk inc flu hosp | quantile | 0.9 | 81 |
| 1 | 25 | 2022-11-26 | wk inc flu hosp | quantile | 0.95 | 97 |
Generating oracle output data#
A hub will typically have access to data in time series format, and will need to convert it to the oracle output format for use with any tools that require it in that format (see the next section). In hubs that collect mean, median, quantile, or sample predictions for the reported signal values in the raw time series data, the two formats may be essentially the same, perhaps with some renaming of columns. However, these data formats will differ more in hubs that form predictions for quantities that are derived from the raw time series data, such as the peak time or peak incidence, and in hubs that collect pmf or cdf predictions.
Task ID columns#
The oracle output should include enough of the task ID variables to uniquely identify which `oracle_value`s correspond to which predicted values. In the above oracle output example, the `location`, `target_end_date`, and `target` columns are included because they are necessary to identify where and when a given target was measured as the `oracle_value`.
Similarly, any task ID variables that are not necessary to match observations with predictions can be omitted from the oracle output. In the above oracle output example, the `horizon`, `model_id`, and `reference_date` columns are not included. Both `horizon` and `reference_date` are related to the `target_end_date` and thus would be redundant. Importantly, these task ID variables are not applicable to observed data: they describe model-specific parameters about unknown events. Likewise, in a scenario projection setting, the `scenario_id` can be omitted as there is only one scenario for an observed event (just don’t tell the quantum physicists).
Model output representation columns#
The `output_type` and `output_type_id` columns only need to be included if the hub collects `pmf` or `cdf` outputs. For those two output types the `oracle_value` depends on the `output_type_id` (see the next section for more detail). On the other hand, the `oracle_value` is not specific to the quantile level for quantile forecasts or the sample index for sample forecasts, and so for these output types (as well as mean and median), the `output_type_id` is not needed to align observations with predictions.
The `oracle_value` column#
Oracle output follows a similar format as model outputs, but the `value` column is named `oracle_value`, and it contains the value of the prediction that would be reported if the observed value of the target was known with certainty. The implications of this vary depending on the `output_type`:
- For the `mean`, `median`, `quantile`, and `sample` output types, the `oracle_value` is the observed value of the prediction target. This `oracle_value` is the same for all quantile levels and all sample indices, since a predictive distribution that places all of its probability on the observed outcome will have all quantiles equal to that value and all samples from that distribution will be equal to the observed value.
- For `pmf` and `cdf` output types, the `oracle_value` is either `1` or `0`:
  - For the `pmf` output type, the `oracle_value` is `1` when the `output_type_id` corresponds to the observed category (indicating a probability of 1 for that category) and `0` for other categories.
  - For the `cdf` output type, the `oracle_value` is `0` for any `output_type_id` levels that are less than the observed value, and `1` for any `output_type_id` levels that are greater than or equal to the observed value, corresponding to the step function cdf of a probability distribution that places all of its probability at the observed value.
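The rules above can be condensed into a single function. The following is a minimal Python sketch for illustration; the function name `oracle_value_for` is hypothetical, and the observed rate of 1.1 used in the example is an assumed value chosen only to demonstrate the cdf rule.

```python
def oracle_value_for(output_type, observed, output_type_id=None):
    """Compute the oracle value implied by an observed target value."""
    if output_type in {"mean", "median", "quantile", "sample"}:
        # All quantiles and samples of a point mass equal the observed value,
        # so output_type_id is irrelevant for these types.
        return observed
    if output_type == "pmf":
        # Probability 1 on the observed category, 0 elsewhere.
        return 1 if output_type_id == observed else 0
    if output_type == "cdf":
        # Step function: 0 below the observed value, 1 at or above it.
        return 1 if output_type_id >= observed else 0
    raise ValueError(f"unsupported output_type: {output_type!r}")


categories = ["low", "moderate", "high", "very high"]
pmf_values = [oracle_value_for("pmf", "low", c) for c in categories]

thresholds = [0.25, 0.5, 0.75, 1, 1.25, 1.5]
# 1.1 is an assumed observed rate, used only for illustration.
cdf_values = [oracle_value_for("cdf", 1.1, t) for t in thresholds]
```

For an observed category of `low`, only the first pmf entry is 1; for the assumed rate of 1.1, the cdf entries switch from 0 to 1 at the first threshold that is at least 1.1.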
Examples of the oracle output format#
We will illustrate the above concepts using the example forecast data from hubExamples that was discussed briefly in the overview section; please see the `forecast_data` vignette in hubExamples for more detail about these data.
Briefly, this example is for a hub with five task ID variables:
- The `location` column contains a FIPS code identifying the location being predicted.
- The `reference_date` is a date in ISO format that gives the Saturday ending the week the predictions were generated.
- The `horizon` gives the difference between the `reference_date` and the target date of the forecasts (`target_end_date`, see next item) in units of weeks. Informally, this describes “how far ahead” the predictions are targeting.
- The `target_end_date` is a date in ISO format that gives the Saturday ending the week being predicted. For example, if the `target_end_date` is `"2022-12-17"`, predictions are for a quantity relating to influenza activity in the week from Sunday, December 11, 2022 through Saturday, December 17, 2022.
- The `target` describes the target quantity for the prediction.
There are three targets, all based on measures of weekly influenza hospitalizations, with forecasts collected in different `output_type`s for each target, as is summarized in the following table:
| target | output_type | description |
|---|---|---|
| wk inc flu hosp | quantile, median, mean, sample | weekly count of hospital admissions with flu |
| wk flu hosp rate | cdf | weekly rate of hospital admissions with flu per 100,000 population |
| wk flu hosp rate category | pmf | categorical severity level of the hospital admissions rate, with levels ‘low’, ‘moderate’, ‘high’, and ‘very high’ |
Below, we show snippets of the contents of a `model_out_tbl` with example forecast submissions and the corresponding oracle output for each `output_type`. We highlight two points about these objects:
- The `reference_date` and `horizon` columns are included in the model outputs, but they are not included in the oracle output.
- In this example, the oracle output is the same for the `mean`, `median`, `quantile`, and `sample` output types, and it contains `<NA>` values for the `output_type_id`. In a hub without `pmf` or `cdf` output types, the `output_type` and `output_type_id` columns could be omitted and this duplication could be eliminated.
Note
These examples are all collected and filtered from the hubExamples package. The model output data set contains over 10,000 rows and the oracle output data has over 200,000 rows. To make comparisons easier, we have subset the data to Massachusetts (FIPS code 25) with one `reference_date` of 2022-11-19 and four target end dates between 2022-11-19 and 2022-12-10. In addition, for the model output data, we are only showing the `Flusight-baseline` model for the 2022-11-19 reference date and removing the `model_id` and `reference_date` columns.
Output type `mean`#
| horizon | location | target_end_date | target | output_type | output_type_id | value |
|---|---|---|---|---|---|---|
| 0 | 25 | 2022-11-19 | wk inc flu hosp | mean | | 51.18476 |
| 1 | 25 | 2022-11-26 | wk inc flu hosp | mean | | 51.39129 |
| 2 | 25 | 2022-12-03 | wk inc flu hosp | mean | | 51.89889 |
| 3 | 25 | 2022-12-10 | wk inc flu hosp | mean | | 52.54409 |

| location | target_end_date | target | output_type | output_type_id | oracle_value |
|---|---|---|---|---|---|
| 25 | 2022-11-19 | wk inc flu hosp | mean | | 79 |
| 25 | 2022-11-26 | wk inc flu hosp | mean | | 221 |
| 25 | 2022-12-03 | wk inc flu hosp | mean | | 446 |
| 25 | 2022-12-10 | wk inc flu hosp | mean | | 578 |
For the `mean` output type, the `oracle_value` is the numeric value of the prediction target. Here, the first row of the oracle output indicates that 79 flu hospitalizations were reported in Massachusetts for the week ending on 2022-11-19. This can be viewed as the mean of a “predictive distribution” that is entirely concentrated on that observed value. The use of `<NA>` for the `output_type_id` matches the convention for model output with the mean output type.
Output type `median`#
| horizon | location | target_end_date | target | output_type | output_type_id | value |
|---|---|---|---|---|---|---|
| 0 | 25 | 2022-11-19 | wk inc flu hosp | median | | 51 |
| 1 | 25 | 2022-11-26 | wk inc flu hosp | median | | 51 |
| 2 | 25 | 2022-12-03 | wk inc flu hosp | median | | 51 |
| 3 | 25 | 2022-12-10 | wk inc flu hosp | median | | 51 |

| location | target_end_date | target | output_type | output_type_id | oracle_value |
|---|---|---|---|---|---|
| 25 | 2022-11-19 | wk inc flu hosp | median | | 79 |
| 25 | 2022-11-26 | wk inc flu hosp | median | | 221 |
| 25 | 2022-12-03 | wk inc flu hosp | median | | 446 |
| 25 | 2022-12-10 | wk inc flu hosp | median | | 578 |
The `oracle_value` for the `median` output type is the same as for the `mean` output type: the numeric value of the prediction target. This is the median of a distribution that is entirely concentrated on that observed value. Again, the use of `<NA>` for the `output_type_id` matches the convention for model output with the median output type.
Output type `quantile`#
| horizon | location | target_end_date | target | output_type | output_type_id | value |
|---|---|---|---|---|---|---|
| 0 | 25 | 2022-11-19 | wk inc flu hosp | quantile | 0.05 | 22 |
| 0 | 25 | 2022-11-19 | wk inc flu hosp | quantile | 0.1 | 31 |
| 0 | 25 | 2022-11-19 | wk inc flu hosp | quantile | 0.25 | 45 |
| 0 | 25 | 2022-11-19 | wk inc flu hosp | quantile | 0.5 | 51 |
| 0 | 25 | 2022-11-19 | wk inc flu hosp | quantile | 0.75 | 57 |
| 0 | 25 | 2022-11-19 | wk inc flu hosp | quantile | 0.9 | 71 |
| 0 | 25 | 2022-11-19 | wk inc flu hosp | quantile | 0.95 | 80 |
| 1 | 25 | 2022-11-26 | wk inc flu hosp | quantile | 0.05 | 5 |
| 1 | 25 | 2022-11-26 | wk inc flu hosp | quantile | 0.1 | 21 |
| 1 | 25 | 2022-11-26 | wk inc flu hosp | quantile | 0.25 | 38 |
| 1 | 25 | 2022-11-26 | wk inc flu hosp | quantile | 0.5 | 51 |
| 1 | 25 | 2022-11-26 | wk inc flu hosp | quantile | 0.75 | 64 |
| 1 | 25 | 2022-11-26 | wk inc flu hosp | quantile | 0.9 | 81 |
| 1 | 25 | 2022-11-26 | wk inc flu hosp | quantile | 0.95 | 97 |

| location | target_end_date | target | output_type | output_type_id | oracle_value |
|---|---|---|---|---|---|
| 25 | 2022-11-19 | wk inc flu hosp | quantile | | 79 |
| 25 | 2022-11-26 | wk inc flu hosp | quantile | | 221 |
| 25 | 2022-12-03 | wk inc flu hosp | quantile | | 446 |
| 25 | 2022-12-10 | wk inc flu hosp | quantile | | 578 |
As with the `mean` and `median` output types, the `oracle_value` for a quantile type is the observed numeric value of the prediction target, which is the quantile of a predictive distribution that assigns probability 1 to that observed value at any quantile probability level. A model output file would need to have a separate row for each quantile level reported in the `output_type_id` column. As a space-saving convention, we use `output_type_id = <NA>` to indicate that this `oracle_value` applies to all quantile levels.
Output type `sample`#
| horizon | location | target_end_date | target | output_type | output_type_id | value |
|---|---|---|---|---|---|---|
| 0 | 25 | 2022-11-19 | wk inc flu hosp | sample | 2101 | -2 |
| 0 | 25 | 2022-11-19 | wk inc flu hosp | sample | 2102 | 2 |
| 0 | 25 | 2022-11-19 | wk inc flu hosp | sample | 2103 | 52 |
| 0 | 25 | 2022-11-19 | wk inc flu hosp | sample | 2104 | 47 |
| 0 | 25 | 2022-11-19 | wk inc flu hosp | sample | 2105 | 56 |
| 0 | 25 | 2022-11-19 | wk inc flu hosp | sample | 2106 | 46 |

| location | target_end_date | target | output_type | output_type_id | oracle_value |
|---|---|---|---|---|---|
| 25 | 2022-11-19 | wk inc flu hosp | sample | | 79 |
| 25 | 2022-11-26 | wk inc flu hosp | sample | | 221 |
| 25 | 2022-12-03 | wk inc flu hosp | sample | | 446 |
| 25 | 2022-12-10 | wk inc flu hosp | sample | | 578 |
As with the above output types, the `oracle_value` for a sample type is the observed numeric value of the prediction target since all samples from a predictive distribution that assigns probability 1 to the observed value will be equal to that value. A model output file would need to have a separate row for each sample, with the sample index recorded in the `output_type_id` column. We use `output_type_id = <NA>` to indicate that this `oracle_value` applies to all predictive samples.
Output type `pmf`#
| horizon | location | target_end_date | target | output_type | output_type_id | value |
|---|---|---|---|---|---|---|
| 0 | 25 | 2022-11-19 | wk flu hosp rate category | pmf | low | 0.9999997 |
| 0 | 25 | 2022-11-19 | wk flu hosp rate category | pmf | moderate | 0.0000003 |
| 0 | 25 | 2022-11-19 | wk flu hosp rate category | pmf | high | 0.0000000 |
| 0 | 25 | 2022-11-19 | wk flu hosp rate category | pmf | very high | 0.0000000 |
| 1 | 25 | 2022-11-26 | wk flu hosp rate category | pmf | low | 0.9999983 |
| 1 | 25 | 2022-11-26 | wk flu hosp rate category | pmf | moderate | 0.0000017 |
| 1 | 25 | 2022-11-26 | wk flu hosp rate category | pmf | high | 0.0000000 |
| 1 | 25 | 2022-11-26 | wk flu hosp rate category | pmf | very high | 0.0000000 |
| 2 | 25 | 2022-12-03 | wk flu hosp rate category | pmf | low | 0.9997501 |
| 2 | 25 | 2022-12-03 | wk flu hosp rate category | pmf | moderate | 0.0002499 |
| 2 | 25 | 2022-12-03 | wk flu hosp rate category | pmf | high | 0.0000000 |
| 2 | 25 | 2022-12-03 | wk flu hosp rate category | pmf | very high | 0.0000000 |

| location | target_end_date | target | output_type | output_type_id | oracle_value |
|---|---|---|---|---|---|
| 25 | 2022-11-19 | wk flu hosp rate category | pmf | low | 1 |
| 25 | 2022-11-19 | wk flu hosp rate category | pmf | moderate | 0 |
| 25 | 2022-11-19 | wk flu hosp rate category | pmf | high | 0 |
| 25 | 2022-11-19 | wk flu hosp rate category | pmf | very high | 0 |
| 25 | 2022-11-26 | wk flu hosp rate category | pmf | low | 0 |
| 25 | 2022-11-26 | wk flu hosp rate category | pmf | moderate | 1 |
| 25 | 2022-11-26 | wk flu hosp rate category | pmf | high | 0 |
| 25 | 2022-11-26 | wk flu hosp rate category | pmf | very high | 0 |
| 25 | 2022-12-03 | wk flu hosp rate category | pmf | low | 0 |
| 25 | 2022-12-03 | wk flu hosp rate category | pmf | moderate | 0 |
| 25 | 2022-12-03 | wk flu hosp rate category | pmf | high | 1 |
| 25 | 2022-12-03 | wk flu hosp rate category | pmf | very high | 0 |
The presence of a `1` for the `oracle_value` in the first row and `0` in the subsequent three rows indicates that the observed rate category in Massachusetts in the week of 2022-11-19 was `"low"`. Similarly, the observed rate category for the week of 2022-11-26 was `"moderate"`.
Output type `cdf`#
| horizon | location | target_end_date | target | output_type | output_type_id | value |
|---|---|---|---|---|---|---|
| 0 | 25 | 2022-11-19 | wk flu hosp rate | cdf | 0.25 | 0.0409498 |
| 0 | 25 | 2022-11-19 | wk flu hosp rate | cdf | 0.5 | 0.1310412 |
| 0 | 25 | 2022-11-19 | wk flu hosp rate | cdf | 0.75 | 0.5679516 |
| 0 | 25 | 2022-11-19 | wk flu hosp rate | cdf | 1 | 0.8911202 |
| 0 | 25 | 2022-11-19 | wk flu hosp rate | cdf | 1.25 | 0.9650988 |
| 0 | 25 | 2022-11-19 | wk flu hosp rate | cdf | 1.5 | 0.9850981 |

| location | target_end_date | target | output_type | output_type_id | oracle_value |
|---|---|---|---|---|---|
| 25 | 2022-11-19 | wk flu hosp rate | cdf | 0.25 | 0 |
| 25 | 2022-11-19 | wk flu hosp rate | cdf | 0.5 | 0 |
| 25 | 2022-11-19 | wk flu hosp rate | cdf | 0.75 | 0 |
| 25 | 2022-11-19 | wk flu hosp rate | cdf | 1 | 0 |
| 25 | 2022-11-19 | wk flu hosp rate | cdf | 1.25 | 1 |
| 25 | 2022-11-19 | wk flu hosp rate | cdf | 1.5 | 1 |
The presence of a `0` for the `oracle_value` in the first four rows and a `1` for the `oracle_value` in subsequent rows indicates that the observed hospitalization rate in Massachusetts (location 25) in the week of 2022-11-19 was greater than 1 but less than or equal to 1.25. These `oracle_value`s encode a step function CDF that is equal to 0 when the `output_type_id` is less than the observed rate and jumps to 1 at the observed rate.
How hubs should provide access to target time series data and oracle output#
Hubs should ensure that standardized procedures for accessing target data are available. The data formats that a hub provides may depend on the needs of the specific hub, and which hubverse tools the hub wants to use. For example, a hub that will not be conducting evaluations by comparing predictions to observed target values may not need to provide data in the oracle output format.
Access to target time series data and oracle output can be provided in either of two ways:
- by providing example code for accessing target time series data and/or oracle output programmatically
- by storing snapshots of the target time series data and/or oracle output in the hub repository in the `target-data` folder.
Following general conventions for storage of code related to modeling hubs, we recommend that any code for data access be provided in a separate repository following standard language-specific packaging guidelines, or if the code is small in scope it can be placed within the `src` folder of the hub’s repository.