score() applies a selection of scoring metrics to a forecast object. score() is a generic that dispatches to different methods depending on the class of the input data.

See as_forecast_binary(), as_forecast_quantile() etc. for information on how to create a forecast object.

See get_forecast_unit() for more information on the concept of a forecast unit.

For additional help and examples, check out the paper Evaluating Forecasts with scoringutils in R.

Usage

# S3 method for class 'forecast_binary'
score(forecast, metrics = get_metrics(forecast), ...)

# S3 method for class 'forecast_nominal'
score(forecast, metrics = get_metrics(forecast), ...)

# S3 method for class 'forecast_ordinal'
score(forecast, metrics = get_metrics(forecast), ...)

# S3 method for class 'forecast_point'
score(forecast, metrics = get_metrics(forecast), ...)

# S3 method for class 'forecast_quantile'
score(forecast, metrics = get_metrics(forecast), ...)

# S3 method for class 'forecast_sample'
score(forecast, metrics = get_metrics(forecast), ...)

score(forecast, metrics, ...)

Arguments

forecast

A forecast object (a validated data.table with predicted and observed values).

metrics

A named list of scoring functions. Names will be used as column names in the output. See get_metrics() for more information on the default metrics used. See the Customising metrics section below for information on how to pass custom arguments to scoring functions.

...

Currently unused. You cannot pass additional arguments to scoring functions via .... See the Customising metrics section below for details on how to use purrr::partial() to pass arguments to individual metrics.

Value

An object of class scores. This object is a data.table with unsummarised scores (one score per forecast) and an additional attribute metrics holding the names of the metrics used for scoring. See summarise_scores() for information on how to summarise scores.
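A minimal sketch of inspecting the returned scores object, assuming the example_quantile dataset shipped with scoringutils and that get_metrics() has a method for scores objects (as described above, the metric names are stored as an attribute):

```r
library(scoringutils)

# score() returns a data.table of class 'scores' with one row per forecast
scores <- score(as_forecast_quantile(example_quantile))

# Retrieve the names of the metrics that were used for scoring;
# they travel with the object as the 'metrics' attribute
get_metrics(scores)
```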

Details

Customising metrics

If you want to pass arguments to a scoring function, you need to change the scoring function itself, e.g. via purrr::partial(), and pass an updated list of functions with your custom metric to the metrics argument of score(). For example, to use interval_coverage() with interval_range = 90, you would define a new function, e.g. interval_coverage_90 <- purrr::partial(interval_coverage, interval_range = 90), and pass this new function to metrics in score().
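The steps above can be sketched as follows, assuming the example_quantile dataset shipped with scoringutils. The metric name interval_coverage_80 is chosen here for illustration (an 80% range, to avoid duplicating the interval_coverage_90 metric already included in the quantile defaults):

```r
library(scoringutils)

# Fix the interval_range argument at 80 by creating a new function
interval_coverage_80 <- purrr::partial(interval_coverage, interval_range = 80)

forecast <- as_forecast_quantile(example_quantile)

# Append the custom metric to the defaults; list names become column names
my_metrics <- c(
  get_metrics(forecast),
  list(interval_coverage_80 = interval_coverage_80)
)

scores <- score(forecast, metrics = my_metrics)
```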

Note that if you want to pass a variable as an argument, you can unquote it with !! to make sure the value is evaluated only once when the function is created. Consider the following example:

custom_arg <- "foo"
print1 <- purrr::partial(print, x = custom_arg)
print2 <- purrr::partial(print, x = !!custom_arg)

custom_arg <- "bar"
print1() # prints 'bar'
print2() # prints 'foo'

References

Bosse NI, Gruson H, Cori A, van Leeuwen E, Funk S, Abbott S (2022) Evaluating Forecasts with scoringutils in R. doi:10.48550/arXiv.2205.07090

Author

Nikos Bosse nikosbosse@gmail.com

Examples

library(magrittr) # pipe operator

validated <- as_forecast_quantile(example_quantile)
#>  Some rows containing NA values may be removed. This is fine if not
#>   unexpected.
score(validated) %>%
  summarise_scores(by = c("model", "target_type"))
#>                    model target_type         wis overprediction underprediction
#>                   <char>      <char>       <num>          <num>           <num>
#> 1: EuroCOVIDhub-ensemble       Cases 17943.82383   10043.121943     4237.177310
#> 2: EuroCOVIDhub-baseline       Cases 28483.57465   14096.100883    10284.972826
#> 3:  epiforecasts-EpiNow2       Cases 20831.55662   11906.823030     3260.355639
#> 4: EuroCOVIDhub-ensemble      Deaths    41.42249       7.138247        4.103261
#> 5: EuroCOVIDhub-baseline      Deaths   159.40387      65.899117        2.098505
#> 6:       UMass-MechBayes      Deaths    52.65195       8.978601       16.800951
#> 7:  epiforecasts-EpiNow2      Deaths    66.64282      18.892583       15.893314
#>    dispersion        bias interval_coverage_50 interval_coverage_90   ae_median
#>         <num>       <num>                <num>                <num>       <num>
#> 1: 3663.52458 -0.05640625            0.3906250            0.8046875 24101.07031
#> 2: 4102.50094  0.09796875            0.3281250            0.8203125 38473.60156
#> 3: 5664.37795 -0.07890625            0.4687500            0.7890625 27923.81250
#> 4:   30.18099  0.07265625            0.8750000            1.0000000    53.13281
#> 5:   91.40625  0.33906250            0.6640625            1.0000000   233.25781
#> 6:   26.87239 -0.02234375            0.4609375            0.8750000    78.47656
#> 7:   31.85692 -0.00512605            0.4201681            0.9075630   104.74790

# set forecast unit manually (to avoid issues with scoringutils trying to
# determine the forecast unit automatically)
example_quantile %>%
  as_forecast_quantile(
    forecast_unit = c(
      "location", "target_end_date", "target_type", "horizon", "model"
    )
  ) %>%
  score()
#>  Some rows containing NA values may be removed. This is fine if not
#>   unexpected.
#>      location target_end_date target_type horizon                 model
#>        <char>          <Date>      <char>   <num>                <char>
#>   1:       DE      2021-05-08       Cases       1 EuroCOVIDhub-ensemble
#>   2:       DE      2021-05-08       Cases       1 EuroCOVIDhub-baseline
#>   3:       DE      2021-05-08       Cases       1  epiforecasts-EpiNow2
#>   4:       DE      2021-05-08      Deaths       1 EuroCOVIDhub-ensemble
#>   5:       DE      2021-05-08      Deaths       1 EuroCOVIDhub-baseline
#>  ---                                                                   
#> 883:       IT      2021-07-24      Deaths       2 EuroCOVIDhub-baseline
#> 884:       IT      2021-07-24      Deaths       3       UMass-MechBayes
#> 885:       IT      2021-07-24      Deaths       2       UMass-MechBayes
#> 886:       IT      2021-07-24      Deaths       3  epiforecasts-EpiNow2
#> 887:       IT      2021-07-24      Deaths       2  epiforecasts-EpiNow2
#>               wis overprediction underprediction  dispersion  bias
#>             <num>          <num>           <num>       <num> <num>
#>   1:  7990.854783   2.549870e+03       0.0000000 5440.985217  0.50
#>   2: 16925.046957   1.527583e+04       0.0000000 1649.220870  0.95
#>   3: 25395.960870   1.722226e+04       0.0000000 8173.700000  0.90
#>   4:    53.880000   0.000000e+00       0.6086957   53.271304 -0.10
#>   5:    46.793043   2.130435e+00       0.0000000   44.662609  0.30
#>  ---                                                              
#> 883:    80.336957   3.608696e+00       0.0000000   76.728261  0.20
#> 884:     4.881739   4.347826e-02       0.0000000    4.838261  0.10
#> 885:    25.581739   1.782609e+01       0.0000000    7.755652  0.80
#> 886:    19.762609   5.478261e+00       0.0000000   14.284348  0.50
#> 887:    66.161739   4.060870e+01       0.0000000   25.553043  0.90
#>      interval_coverage_50 interval_coverage_90 ae_median
#>                    <lgcl>               <lgcl>     <num>
#>   1:                 TRUE                 TRUE     12271
#>   2:                FALSE                FALSE     25620
#>   3:                FALSE                 TRUE     44192
#>   4:                 TRUE                 TRUE        14
#>   5:                 TRUE                 TRUE        15
#>  ---                                                    
#> 883:                 TRUE                 TRUE        53
#> 884:                 TRUE                 TRUE         1
#> 885:                FALSE                 TRUE        46
#> 886:                 TRUE                 TRUE        26
#> 887:                FALSE                 TRUE       108

# forecast formats with different metrics
if (FALSE) { # \dontrun{
score(as_forecast_binary(example_binary))
score(as_forecast_quantile(example_quantile))
score(as_forecast_point(example_point))
score(as_forecast_sample(example_sample_discrete))
score(as_forecast_sample(example_sample_continuous))
} # }