Filters the data and turns the selected values into NA before the data are passed to plot_predictions(). This makes it possible to 'filter' prediction and truth data separately. Any value that is NA will then be removed in the subsequent call to plot_predictions().

Usage

make_NA(data = NULL, what = c("truth", "forecast", "both"), ...)

make_na(data = NULL, what = c("truth", "forecast", "both"), ...)

Arguments

data

A data.frame or data.table with the predictions and observations. For scoring using score(), the following columns need to be present:

  • true_value - the true observed values

  • prediction - predictions or predictive samples for one true value. (The prediction column can be omitted only if you want to score quantile forecasts in a wide range format.)

For scoring integer and continuous forecasts a sample column is needed:

  • sample - an index to identify the predictive samples in the prediction column generated by one model for one true value. Only necessary for continuous and integer forecasts, not for binary predictions.

For scoring predictions in a quantile-format forecast you should provide a column called quantile:

  • quantile - the quantile to which the prediction corresponds

In addition, a model column is suggested. If it is not present, this will be flagged and a model column will be added to the input data, with all forecasts assigned to an "unspecified model".

You can check the format of your data using check_forecasts() and there are examples for each format (example_quantile, example_continuous, example_integer, and example_binary).
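As a rough illustration of the sample-based format described above, the following sketch builds a small data frame by hand. The values, the model name "toy-model" and the object names `toy` and `required` are made up for illustration and are not taken from the package's example data.

```r
# Hypothetical toy data in the sample-based format:
# two predictive samples per observed value, from one made-up model.
toy <- data.frame(
  target_end_date = rep(c("2021-04-24", "2021-07-24"), each = 2),
  model           = "toy-model",
  sample          = rep(1:2, times = 2),
  prediction      = c(10.5, 12.0, 20.1, 18.7),
  true_value      = rep(c(11, 19), each = 2)
)

# Check that the columns named in the text above are present.
required <- c("true_value", "prediction", "sample")
all(required %in% names(toy))
```

This only checks column names by hand; check_forecasts() remains the way to validate real input.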

what

Character vector that determines which values should be turned into NA. If what = "truth", values in the column 'true_value' will be turned into NA. If what = "forecast", values in the column 'prediction' will be turned into NA. If what = "both", values in both columns will be turned into NA.
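The mapping from `what` to columns can be sketched in base R as follows. This is only an illustration of the behaviour described above, not the package's implementation; the helper `columns_for()` and the toy data are hypothetical.

```r
# Hypothetical helper, not part of the package: maps `what` to column names.
columns_for <- function(what = c("truth", "forecast", "both")) {
  what <- match.arg(what)
  switch(what,
    truth    = "true_value",
    forecast = "prediction",
    both     = c("true_value", "prediction")
  )
}

# Made-up data with one forecast per row.
toy <- data.frame(
  horizon    = c(1, 2, 3),
  prediction = c(10, 20, 30),
  true_value = c(11, 19, 33)
)

# Turn the selected columns into NA for rows where horizon > 1.
masked <- toy
masked[masked$horizon > 1, columns_for("both")] <- NA
masked
```

With "both", the rows for horizons 2 and 3 lose both their prediction and their true value, while the horizon 1 row is left unchanged.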

...

Logical statements that select the rows whose values are turned into NA.
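When several statements are supplied, the output in the Examples section suggests that a row is affected if it matches any one of them. A minimal base R sketch of that behaviour, assuming the statements are applied one after another (the page does not show the actual implementation, and all names and values here are made up):

```r
# Made-up data: one observation per date.
toy <- data.frame(
  target_end_date = c("2021-04-24", "2021-06-05", "2021-07-24"),
  true_value      = c(11, 15, 19)
)

# Two filter conditions, stored as logical vectors.
# ISO-formatted date strings compare correctly as text.
conditions <- list(
  toy$target_end_date >= "2021-07-22",
  toy$target_end_date <  "2021-05-01"
)

# Apply each condition in turn, so a row is set to NA
# if it matches at least one condition.
result <- toy
for (cond in conditions) {
  result$true_value[cond] <- NA
}
result$true_value
```

Here the first and last rows are set to NA and the middle row keeps its value of 15, because no single row has to satisfy both conditions at once.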

Value

A data.table

Examples

# Set true_value to NA for rows dated on or after 2021-07-22
# and for rows dated before 2021-05-01
make_NA(
  example_continuous,
  what = "truth",
  target_end_date >= "2021-07-22",
  target_end_date < "2021-05-01"
)
#>        location location_name target_end_date target_type forecast_date
#>     1:       DE       Germany      2021-01-02       Cases          <NA>
#>     2:       DE       Germany      2021-01-02      Deaths          <NA>
#>     3:       DE       Germany      2021-01-09       Cases          <NA>
#>     4:       DE       Germany      2021-01-09      Deaths          <NA>
#>     5:       DE       Germany      2021-01-16       Cases          <NA>
#>    ---                                                                 
#> 35620:       IT         Italy      2021-07-24      Deaths    2021-07-12
#> 35621:       IT         Italy      2021-07-24      Deaths    2021-07-12
#> 35622:       IT         Italy      2021-07-24      Deaths    2021-07-12
#> 35623:       IT         Italy      2021-07-24      Deaths    2021-07-12
#> 35624:       IT         Italy      2021-07-24      Deaths    2021-07-12
#>                       model horizon prediction sample true_value
#>     1:                 <NA>      NA         NA     NA         NA
#>     2:                 <NA>      NA         NA     NA         NA
#>     3:                 <NA>      NA         NA     NA         NA
#>     4:                 <NA>      NA         NA     NA         NA
#>     5:                 <NA>      NA         NA     NA         NA
#>    ---                                                          
#> 35620: epiforecasts-EpiNow2       2  159.84534     36         NA
#> 35621: epiforecasts-EpiNow2       2  128.21214     37         NA
#> 35622: epiforecasts-EpiNow2       2  190.52560     38         NA
#> 35623: epiforecasts-EpiNow2       2  141.06659     39         NA
#> 35624: epiforecasts-EpiNow2       2   24.43419     40         NA