Given a data set with forecasts, count the number of available forecasts for an arbitrary grouping (e.g. the number of forecasts per model, or the number of forecasts per model and location). This is useful for determining whether any forecasts are missing.

Usage

avail_forecasts(data, by = NULL, collapse = c("quantile", "sample"))

Arguments

data

A data.frame or data.table with the predictions and observations. For scoring using score(), the following columns need to be present:

  • true_value - the true observed values

  • prediction - predictions or predictive samples for one true value. (The prediction column can be omitted only if you want to score quantile forecasts in a wide range format.)

For scoring integer and continuous forecasts a sample column is needed:

  • sample - an index to identify the predictive samples in the prediction column generated by one model for one true value. Only necessary for continuous and integer forecasts, not for binary predictions.

For scoring predictions in a quantile-format forecast you should provide a column called quantile:

  • quantile - the quantile to which the prediction corresponds

In addition, a model column is suggested. If it is not present, this will be flagged and a model column will be added to the input data, with all forecasts assigned to an "unspecified model".

You can check the format of your data using check_forecasts() and there are examples for each format (example_quantile, example_continuous, example_integer, and example_binary).

by

character vector or NULL (the default) naming the categories over which the number of forecasts should be counted. By default (by = NULL) this is the unit of a single forecast, i.e. all available columns apart from a few "protected" columns such as "prediction" and "true_value", plus "quantile" or "sample" where present.

collapse

character vector (default is c("quantile", "sample")) with the names of categories for which the number of rows should be collapsed to one when counting. For example, a single forecast is usually represented by a set of several quantiles or samples, and collapsing these to one ensures that a single forecast is only counted once.
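The effect of collapsing can be illustrated with a small base R sketch. This is not how avail_forecasts() is implemented; it only mimics the counting idea on made-up data. The data frame toy, its values, and the location column are assumptions introduced for illustration.

```r
# Toy quantile-format data (hypothetical values): 2 models, 2 locations,
# 3 quantiles per forecast.
toy <- expand.grid(
  quantile = c(0.25, 0.5, 0.75),
  location = c("DE", "FR"),
  model = c("model_a", "model_b"),
  stringsAsFactors = FALSE
)
toy$prediction <- seq_len(nrow(toy))
toy$true_value <- 10

# Simulate a missing forecast: model_b provides nothing for FR.
toy <- toy[!(toy$model == "model_b" & toy$location == "FR"), ]

# Without collapsing, every quantile row is counted separately.
rows_per_model <- table(toy$model)

# "Collapsing" the quantile column: keep one row per forecast unit
# (here model and location), then count per model.
forecast_unit <- unique(toy[, c("model", "location")])
forecasts_per_model <- table(forecast_unit$model)

print(rows_per_model)
print(forecasts_per_model)
```

In this sketch the raw row counts differ by a factor of three from the collapsed counts, because each forecast consists of three quantile rows. The collapsed counts are the ones that reveal the missing forecast for model_b.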

Value

A data.table with columns as specified in by and an additional column with the number of forecasts.

Examples

avail_forecasts(example_quantile,
  collapse = c("quantile"),
  by = c("model", "target_type")
)
#> The following messages were produced when checking inputs:
#> 1.  144 values for `prediction` are NA in the data provided and the corresponding rows were removed. This may indicate a problem if unexpected.
#>                    model target_type Number forecasts
#> 1: EuroCOVIDhub-ensemble       Cases              128
#> 2: EuroCOVIDhub-baseline       Cases              128
#> 3:  epiforecasts-EpiNow2       Cases              128
#> 4: EuroCOVIDhub-ensemble      Deaths              128
#> 5: EuroCOVIDhub-baseline      Deaths              128
#> 6:       UMass-MechBayes      Deaths              128
#> 7:  epiforecasts-EpiNow2      Deaths              119
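One possible way to use such counts to spot gaps is sketched below in base R. The counts are copied by hand from the output above. The column names n_forecasts, expected and missing are hypothetical, and treating the largest count within each target type as the expected number is an assumption, not something the package prescribes.

```r
# Counts copied from the example output above; n_forecasts is a
# hypothetical stand-in for the "Number forecasts" column.
counts <- data.frame(
  model = c(
    "EuroCOVIDhub-ensemble", "EuroCOVIDhub-baseline", "epiforecasts-EpiNow2",
    "EuroCOVIDhub-ensemble", "EuroCOVIDhub-baseline", "UMass-MechBayes",
    "epiforecasts-EpiNow2"
  ),
  target_type = c("Cases", "Cases", "Cases",
                  "Deaths", "Deaths", "Deaths", "Deaths"),
  n_forecasts = c(128, 128, 128, 128, 128, 128, 119),
  stringsAsFactors = FALSE
)

# Assumption: the largest count within a target type is the number of
# forecasts a complete submission would contain.
counts$expected <- ave(counts$n_forecasts, counts$target_type, FUN = max)
counts$missing <- counts$expected - counts$n_forecasts

# Keep only the model / target type combinations with fewer forecasts
# than the assumed reference.
incomplete <- counts[counts$missing > 0, ]
print(incomplete)
```

Note that this only flags combinations that appear in the output with a reduced count. A model that submitted nothing at all for a target type (UMass-MechBayes has no row for Cases above) does not show up and would need a separate check.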