Evaluate forecasts — score.forecast

score() applies a selection of scoring metrics to a forecast object. score() is a generic that dispatches to different methods depending on the class of the input data.

See as_forecast_binary(), as_forecast_quantile() etc. for information on how to create a forecast object.

See get_forecast_unit() for more information on the concept of a forecast unit.

For additional help and examples, check out the paper Evaluating Forecasts with scoringutils in R.

Usage

# S3 method for class 'forecast_binary'
score(forecast, metrics = get_metrics(forecast), ...)

# S3 method for class 'forecast_sample_multivariate'
score(forecast, metrics = get_metrics(forecast), ...)

# S3 method for class 'forecast_nominal'
score(forecast, metrics = get_metrics(forecast), ...)

# S3 method for class 'forecast_ordinal'
score(forecast, metrics = get_metrics(forecast), ...)

# S3 method for class 'forecast_point'
score(forecast, metrics = get_metrics(forecast), ...)

# S3 method for class 'forecast_quantile'
score(forecast, metrics = get_metrics(forecast), ...)

# S3 method for class 'forecast_sample'
score(forecast, metrics = get_metrics(forecast), ...)

score(forecast, metrics, ...)
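
The method signatures above rely on R's S3 dispatch. As a rough illustration only, the following base-R sketch mimics that mechanism with a toy generic. The names toy_score, binary_obj, and quantile_obj are hypothetical and are not part of scoringutils; the class labels merely mirror those in the usage above.

```r
# Toy generic: UseMethod() picks a method based on class(x).
toy_score <- function(x, ...) UseMethod("toy_score")

# One method per (hypothetical) forecast class.
toy_score.forecast_binary <- function(x, ...) "binary method"
toy_score.forecast_quantile <- function(x, ...) "quantile method"

# Fallback when no class-specific method exists.
toy_score.default <- function(x, ...) stop("no method for this class")

# Objects whose first class entry determines the method that runs.
binary_obj <- structure(list(), class = c("forecast_binary", "forecast"))
quantile_obj <- structure(list(), class = c("forecast_quantile", "forecast"))

toy_score(binary_obj)
toy_score(quantile_obj)
```

The same function name therefore runs different code for different inputs, which is why each forecast class can come with its own set of default metrics.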

Arguments

forecast: A forecast object (a validated data.table with predicted and observed values).
metrics: A named list of scoring functions. Names will be used as column names in the output. See get_metrics() for more information on the default metrics used. See the Customising metrics section below for information on how to pass custom arguments to scoring functions.
...: Currently unused. Additional arguments cannot be passed to scoring functions through the dots. See the Customising metrics section below for details on how to use purrr::partial() to pass arguments to individual metrics.

Value

An object of class scores. This object is a data.table with unsummarised scores (one score per forecast) and has an additional attribute metrics with the names of the metrics used for scoring. See summarise_scores() for information on how to summarise scores.
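
As a minimal sketch of inspecting the returned object, assuming scoringutils is installed and using its bundled example_quantile data, the class and the metrics attribute described above can be checked as follows:

```r
library(scoringutils)

# Score the bundled example data with the default metrics.
scores <- score(as_forecast_quantile(example_quantile))

# The result carries the class "scores" on top of data.table.
class(scores)

# Names of the metrics that were used, stored as an attribute.
attr(scores, "metrics")

# Unsummarised scores can then be aggregated, here per model and target type.
summarise_scores(scores, by = c("model", "target_type"))
```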

Details

Customising metrics

If you want to pass arguments to a scoring function, you need to change the scoring function itself, e.g. via purrr::partial(), and pass an updated list of functions containing your custom metric to the metrics argument of score(). For example, to use interval_coverage() with interval_range = 90, define a new function, e.g. interval_coverage_90 <- purrr::partial(interval_coverage, interval_range = 90), and pass this new function to metrics in score().
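
The following sketch shows one way this could look end to end. It assumes scoringutils and purrr are installed and that the example data contain the 0.1 and 0.9 quantile levels needed for an 80% interval; the metric name interval_coverage_80 is chosen here purely for illustration.

```r
library(scoringutils)

forecast <- as_forecast_quantile(example_quantile)

# Fix interval_range at 80 (assumes quantile levels 0.1 and 0.9 are present).
interval_coverage_80 <- purrr::partial(interval_coverage, interval_range = 80)

# Extend the default metrics with the custom one; the list names
# become column names in the output.
metrics <- c(
  get_metrics(forecast),
  list(interval_coverage_80 = interval_coverage_80)
)

scores <- score(forecast, metrics = metrics)
names(scores)
```

Starting from get_metrics(forecast) keeps the defaults intact, so the custom metric appears as one extra column rather than replacing the others.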

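Note that if you want to pass a variable as an argument, you can unquote it with !! so that its value is evaluated only once, when the function is created. Without !!, the variable is evaluated each time the function is called, so later changes to it affect the result. Consider the following example: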

custom_arg <- "foo"
print1 <- purrr::partial(print, x = custom_arg)
print2 <- purrr::partial(print, x = !!custom_arg)

custom_arg <- "bar"
print1() # prints 'bar'
print2() # prints 'foo'

References

Bosse NI, Gruson H, Cori A, van Leeuwen E, Funk S, Abbott S (2022) Evaluating Forecasts with scoringutils in R. doi:10.48550/arXiv.2205.07090

Author

Nikos Bosse nikosbosse@gmail.com

Examples

library(magrittr) # pipe operator

validated <- as_forecast_quantile(example_quantile)
#> ℹ Some rows containing NA values may be removed. This is fine if not
#>   unexpected.
score(validated) %>%
  summarise_scores(by = c("model", "target_type"))
#>                    model target_type         wis overprediction underprediction
#>                   <char>      <char>       <num>          <num>           <num>
#> 1: EuroCOVIDhub-ensemble       Cases 17943.82383   10043.121943     4237.177310
#> 2: EuroCOVIDhub-baseline       Cases 28483.57465   14096.100883    10284.972826
#> 3:  epiforecasts-EpiNow2       Cases 20831.55662   11906.823030     3260.355639
#> 4: EuroCOVIDhub-ensemble      Deaths    41.42249       7.138247        4.103261
#> 5: EuroCOVIDhub-baseline      Deaths   159.40387      65.899117        2.098505
#> 6:       UMass-MechBayes      Deaths    52.65195       8.978601       16.800951
#> 7:  epiforecasts-EpiNow2      Deaths    66.64282      18.892583       15.893314
#>    dispersion        bias interval_coverage_50 interval_coverage_90   ae_median
#>         <num>       <num>                <num>                <num>       <num>
#> 1: 3663.52458 -0.05640625            0.3906250            0.8046875 24101.07031
#> 2: 4102.50094  0.09796875            0.3281250            0.8203125 38473.60156
#> 3: 5664.37795 -0.07890625            0.4687500            0.7890625 27923.81250
#> 4:   30.18099  0.07265625            0.8750000            1.0000000    53.13281
#> 5:   91.40625  0.33906250            0.6640625            1.0000000   233.25781
#> 6:   26.87239 -0.02234375            0.4609375            0.8750000    78.47656
#> 7:   31.85692 -0.00512605            0.4201681            0.9075630   104.74790

# set forecast unit manually (to avoid issues with scoringutils trying to
# determine the forecast unit automatically)
example_quantile %>%
  as_forecast_quantile(
    forecast_unit = c(
      "location", "target_end_date", "target_type", "horizon", "model"
    )
  ) %>%
  score()
#> ℹ Some rows containing NA values may be removed. This is fine if not
#>   unexpected.
#>      location target_end_date target_type horizon                 model
#>        <char>          <Date>      <char>   <num>                <char>
#>   1:       DE      2021-05-08       Cases       1 EuroCOVIDhub-ensemble
#>   2:       DE      2021-05-08       Cases       1 EuroCOVIDhub-baseline
#>   3:       DE      2021-05-08       Cases       1  epiforecasts-EpiNow2
#>   4:       DE      2021-05-08      Deaths       1 EuroCOVIDhub-ensemble
#>   5:       DE      2021-05-08      Deaths       1 EuroCOVIDhub-baseline
#>  ---                                                                   
#> 883:       IT      2021-07-24      Deaths       2 EuroCOVIDhub-baseline
#> 884:       IT      2021-07-24      Deaths       3       UMass-MechBayes
#> 885:       IT      2021-07-24      Deaths       2       UMass-MechBayes
#> 886:       IT      2021-07-24      Deaths       3  epiforecasts-EpiNow2
#> 887:       IT      2021-07-24      Deaths       2  epiforecasts-EpiNow2
#>               wis overprediction underprediction  dispersion  bias
#>             <num>          <num>           <num>       <num> <num>
#>   1:  7990.854783   2.549870e+03       0.0000000 5440.985217  0.50
#>   2: 16925.046957   1.527583e+04       0.0000000 1649.220870  0.95
#>   3: 25395.960870   1.722226e+04       0.0000000 8173.700000  0.90
#>   4:    53.880000   0.000000e+00       0.6086957   53.271304 -0.10
#>   5:    46.793043   2.130435e+00       0.0000000   44.662609  0.30
#>  ---                                                              
#> 883:    80.336957   3.608696e+00       0.0000000   76.728261  0.20
#> 884:     4.881739   4.347826e-02       0.0000000    4.838261  0.10
#> 885:    25.581739   1.782609e+01       0.0000000    7.755652  0.80
#> 886:    19.762609   5.478261e+00       0.0000000   14.284348  0.50
#> 887:    66.161739   4.060870e+01       0.0000000   25.553043  0.90
#>      interval_coverage_50 interval_coverage_90 ae_median
#>                    <lgcl>               <lgcl>     <num>
#>   1:                 TRUE                 TRUE     12271
#>   2:                FALSE                FALSE     25620
#>   3:                FALSE                 TRUE     44192
#>   4:                 TRUE                 TRUE        14
#>   5:                 TRUE                 TRUE        15
#>  ---                                                    
#> 883:                 TRUE                 TRUE        53
#> 884:                 TRUE                 TRUE         1
#> 885:                FALSE                 TRUE        46
#> 886:                 TRUE                 TRUE        26
#> 887:                FALSE                 TRUE       108

# forecast formats with different metrics
if (FALSE) { # \dontrun{
score(as_forecast_binary(example_binary))
score(as_forecast_quantile(example_quantile))
score(as_forecast_point(example_point))
score(as_forecast_sample(example_sample_discrete))
score(as_forecast_sample(example_sample_continuous))
} # }

# multivariate forecasts
if (FALSE) { # \dontrun{
score(example_multivariate_sample)
} # }