Getting started
introduction.Rmd
Introduction
This vignette briefly outlines the functionality of
EpiSoon
. To get started load the required packages.
- Load the package (
bsts
for models,ggplot2
for plotting, andcowplot
for theming)
Forecast Rts, score and plot
- We use an example dataframe built into the package but this could be replaced with your own data.
EpiSoon::example_obs_rts
#> rt date
#> 1 2.490547 2020-03-01
#> 2 2.442588 2020-03-02
#> 3 2.402473 2020-03-03
#> 4 2.335572 2020-03-04
#> 5 2.266551 2020-03-05
#> 6 2.192293 2020-03-06
#> 7 2.146429 2020-03-07
#> 8 2.104371 2020-03-08
#> 9 2.059281 2020-03-09
#> 10 2.027134 2020-03-10
#> 11 2.014678 2020-03-11
#> 12 1.998946 2020-03-12
#> 13 1.968350 2020-03-13
#> 14 1.947376 2020-03-14
#> 15 1.906984 2020-03-15
#> 16 1.812842 2020-03-16
#> 17 1.718532 2020-03-17
#> 18 1.665646 2020-03-18
#> 19 1.639927 2020-03-19
#> 20 1.633795 2020-03-20
#> 21 1.682025 2020-03-21
#> 22 1.561653 2020-03-22
- Fit a
bsts
model and produce a Rt forecast. Any appropriately wrapped model can be used (seebsts_model
andfable_model
for an examples).
rt_forecast <- forecast_rt(EpiSoon::example_obs_rts[1:10, ],
model = function(...) {
EpiSoon::bsts_model(model = function(ss, y) {
bsts::AddAutoAr(ss, y = y, lags = 10)
}, ...)
},
horizon = 21, samples = 10
)
rt_forecast
#> # A tibble: 210 × 4
#> sample date rt horizon
#> <int> <date> <dbl> <int>
#> 1 1 2020-03-11 1.91 1
#> 2 2 2020-03-11 1.87 1
#> 3 3 2020-03-11 2.00 1
#> 4 4 2020-03-11 2.05 1
#> 5 5 2020-03-11 1.95 1
#> 6 6 2020-03-11 1.88 1
#> 7 7 2020-03-11 2.04 1
#> 8 8 2020-03-11 2.01 1
#> 9 9 2020-03-11 2.11 1
#> 10 10 2020-03-11 1.97 1
#> # … with 200 more rows
- Score the forecast
rt_scores <- score_forecast(rt_forecast, EpiSoon::example_obs_rts)
rt_scores
#> date horizon mad bias dss crps log_score
#> 1: 2020-03-11 1 0.09175791 -0.4 -4.960342 0.02509914 -1.35338492
#> 2: 2020-03-12 2 0.07202600 -0.8 -3.656287 0.08264761 -0.02621921
#> 3: 2020-03-13 3 0.09241439 -0.8 -3.243241 0.07971169 -0.55461560
#> 4: 2020-03-14 4 0.14355771 -0.6 -2.809504 0.09396054 -0.25999440
#> 5: 2020-03-15 5 0.16137700 -0.8 -2.628656 0.10368522 -0.29800454
#> 6: 2020-03-16 6 0.15476805 -0.6 -2.986082 0.07695705 -0.48214772
#> 7: 2020-03-17 7 0.24889952 -0.2 -2.764084 0.07300620 -0.48391848
#> 8: 2020-03-18 8 0.19470960 -0.2 -2.786012 0.06215868 -0.60809943
#> 9: 2020-03-19 9 0.17690844 -0.4 -2.589514 0.06937450 -0.44580413
#> 10: 2020-03-20 10 0.20351518 -0.4 -2.613183 0.08122202 -0.40118836
#> 11: 2020-03-21 11 0.14920744 -0.6 -1.778662 0.14347108 0.14138580
#> 12: 2020-03-22 12 0.13109268 -0.6 -2.337791 0.09556781 -0.24995000
#> ae_median se_mean
#> 1: 0.02883256 0.001261780
#> 2: 0.12449090 0.009224522
#> 3: 0.09940933 0.013190625
#> 4: 0.14989639 0.021292461
#> 5: 0.13531045 0.026493365
#> 6: 0.13115021 0.016458738
#> 7: 0.12199514 0.013831701
#> 8: 0.08895699 0.009858108
#> 9: 0.09774695 0.011738112
#> 10: 0.11822794 0.018760277
#> 11: 0.19880696 0.061918948
#> 12: 0.14373182 0.025612203
- Summarise the forecast scores
summarise_scores(rt_scores)
#> # A tibble: 7 × 8
#> score bottom lower median mean upper top sd
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 ae_median 0.0454 0.0990 0.123 0.120 0.137 0.185 0.0407
#> 2 bias -0.8 -0.65 -0.6 -0.533 -0.4 -0.2 0.215
#> 3 crps 0.0353 0.0721 0.0805 0.0822 0.0944 0.133 0.0278
#> 4 dss -4.60 -3.05 -2.78 -2.93 -2.61 -1.93 0.786
#> 5 log_score -1.15 -0.502 -0.423 -0.418 -0.257 0.0953 0.366
#> 6 mad 0.0775 0.121 0.152 0.152 0.181 0.236 0.0511
#> 7 se_mean 0.00345 0.0113 0.0151 0.0191 0.0224 0.0522 0.0153
- Summarise the forecast
summarised_rt_forecast <- summarise_forecast(rt_forecast)
summarised_rt_forecast
#> # A tibble: 21 × 9
#> date horizon median mean sd bottom lower upper top
#> <date> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2020-03-11 1 1.99 1.98 0.0788 1.87 1.95 2.05 2.11
#> 2 2020-03-12 2 1.87 1.90 0.115 1.79 1.81 1.89 2.20
#> 3 2020-03-13 3 1.87 1.85 0.151 1.62 1.80 1.93 2.18
#> 4 2020-03-14 4 1.80 1.80 0.178 1.51 1.62 1.80 2.15
#> 5 2020-03-15 5 1.77 1.74 0.177 1.46 1.66 1.88 2.08
#> 6 2020-03-16 6 1.68 1.68 0.177 1.41 1.60 1.81 1.98
#> 7 2020-03-17 7 1.60 1.60 0.228 1.22 1.54 1.79 1.87
#> 8 2020-03-18 8 1.58 1.57 0.238 1.14 1.54 1.76 1.92
#> 9 2020-03-19 9 1.54 1.53 0.263 1.09 1.47 1.68 1.90
#> 10 2020-03-20 10 1.52 1.50 0.237 1.11 1.39 1.66 1.84
#> # … with 11 more rows
- Plot the forecast against observed data
plot_forecast(summarised_rt_forecast, EpiSoon::example_obs_rts)
Forecast cases, score and plot
- Forecasting cases requires the observed cases on which the observed Rt estimates were based
EpiSoon::example_obs_cases
#> # A tibble: 63 × 2
#> cases date
#> <dbl> <date>
#> 1 1 2020-01-20
#> 2 0 2020-01-21
#> 3 1 2020-01-22
#> 4 0 2020-01-23
#> 5 0 2020-01-24
#> 6 0 2020-01-25
#> 7 1 2020-01-26
#> 8 0 2020-01-27
#> 9 0 2020-01-28
#> 10 0 2020-01-29
#> # … with 53 more rows
- It also requires an assumption to be made about the serial interval (defined using probability distribution).
EpiSoon::example_serial_interval
#> 1 2 3 4 5 6 7 8 9 10 11 12 14
#> 0.00 0.03 0.25 0.17 0.09 0.15 0.13 0.05 0.05 0.03 0.02 0.01 0.01 0.01
- Forecast cases (using the case data on which the observed Rt estimates were based)
case_forecast <- forecast_cases(EpiSoon::example_obs_cases, rt_forecast,
serial_interval = EpiSoon::example_serial_interval
)
case_forecast
#> sample date cases horizon
#> 1: 1 2020-03-11 165 1
#> 2: 1 2020-03-12 184 2
#> 3: 1 2020-03-13 209 3
#> 4: 1 2020-03-14 196 4
#> 5: 1 2020-03-15 243 5
#> ---
#> 206: 10 2020-03-27 639 17
#> 207: 10 2020-03-28 736 18
#> 208: 10 2020-03-29 781 19
#> 209: 10 2020-03-30 841 20
#> 210: 10 2020-03-31 860 21
- Score the cases forecast
case_scores <- score_case_forecast(case_forecast, EpiSoon::example_obs_cases)
case_scores
#> date horizon mad bias dss crps ae_median se_mean
#> 1: 2020-03-11 1 11.1195 0.0 4.710994 3.17 2.5 7.29
#> 2: 2020-03-12 2 7.4130 -0.6 5.550119 5.65 9.5 72.25
#> 3: 2020-03-13 3 18.5325 0.6 6.630581 9.35 15.0 222.01
#> 4: 2020-03-14 4 41.5128 -0.2 7.135925 9.41 8.0 22.09
#> 5: 2020-03-15 5 25.2042 -0.2 7.719823 8.10 4.5 184.96
#> 6: 2020-03-16 6 47.4432 0.8 8.874911 24.04 35.0 2560.36
#> 7: 2020-03-17 7 58.5627 0.6 8.981421 30.22 33.5 2304.00
#> 8: 2020-03-18 8 93.4038 0.2 9.244150 24.61 23.0 1466.89
#> 9: 2020-03-19 9 126.7623 0.2 9.506010 32.11 22.5 345.96
#> 10: 2020-03-20 10 134.1753 0.0 9.857893 28.36 7.5 148.84
#> 11: 2020-03-21 11 180.1359 -0.4 10.742918 79.01 115.0 10941.16
#> 12: 2020-03-22 12 138.6231 0.6 11.336577 99.69 131.5 27489.64
- Summarise the cases scores
summarise_scores(case_scores)
#> # A tibble: 6 × 8
#> score bottom lower median mean upper top sd
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 ae_median 3.05 7.88 18.8 34.0 33.9 127. 43.2
#> 2 bias -0.545 -0.2 0.1 0.133 0.6 0.745 0.446
#> 3 crps 3.85 9.04 24.3 29.5 30.7 94.0 30.1
#> 4 dss 4.94 7.01 8.93 8.36 9.59 11.2 2.04
#> 5 mad 8.43 23.5 53.0 73.6 129. 169. 58.8
#> 6 se_mean 11.4 130. 284. 3814. 2368. 22939. 8063.
- Summarise the cases forecast
summarised_case_forecast <- summarise_case_forecast(case_forecast)
summarised_case_forecast
#> # A tibble: 21 × 9
#> date horizon median mean sd bottom lower upper top
#> <date> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2020-03-11 1 170. 170. 10.7 154 165 178 188
#> 2 2020-03-12 2 184. 186. 13.6 160 182 190 212
#> 3 2020-03-13 3 223 223. 23.0 179 205 227 263
#> 4 2020-03-14 4 243 246. 37.0 196 213 254 321
#> 5 2020-03-15 5 268. 287. 47.8 243 243 270 400
#> 6 2020-03-16 6 301 317. 60.2 243 269 314 436
#> 7 2020-03-17 7 330. 344 74.8 236 322 394 476
#> 8 2020-03-18 8 366 381. 98.6 238 337 451 561
#> 9 2020-03-19 9 422 418. 121. 258 258 423 639
#> 10 2020-03-20 10 446. 466. 145. 258 427 601 742
#> # … with 11 more rows
- Plot the forecast against observed case data
plot_forecast(summarised_case_forecast, EpiSoon::example_obs_cases)
Use iterative fitting to explore a forecast
- To explore the quality of a models forecast it can help to
iteratively forecast from each available data point. This is supported
in
EpiSoon
using the following:
it_rt_forecast <- iterative_rt_forecast(EpiSoon::example_obs_rts,
model = function(...) {
EpiSoon::bsts_model(model = function(ss, y) {
bsts::AddAutoAr(ss, y = y, lags = 10)
}, ...)
},
horizon = 7, samples = 10, min_points = 4
)
it_rt_forecast
#> # A tibble: 1,260 × 5
#> forecast_date sample date rt horizon
#> <chr> <int> <date> <dbl> <int>
#> 1 2020-03-05 1 2020-03-06 2.17 1
#> 2 2020-03-05 2 2020-03-06 2.35 1
#> 3 2020-03-05 3 2020-03-06 2.23 1
#> 4 2020-03-05 4 2020-03-06 2.31 1
#> 5 2020-03-05 5 2020-03-06 2.20 1
#> 6 2020-03-05 6 2020-03-06 2.17 1
#> 7 2020-03-05 7 2020-03-06 2.28 1
#> 8 2020-03-05 8 2020-03-06 2.24 1
#> 9 2020-03-05 9 2020-03-06 2.10 1
#> 10 2020-03-05 10 2020-03-06 2.17 1
#> # … with 1,250 more rows
- We can then iteratively forecast cases using the following:
it_cases_forecast <- iterative_case_forecast(
it_fit_samples = it_rt_forecast,
cases = EpiSoon::example_obs_cases,
serial_interval = EpiSoon::example_serial_interval
)
it_cases_forecast
#> forecast_date sample date cases horizon
#> 1: 2020-03-05 1 2020-03-06 72 1
#> 2: 2020-03-05 1 2020-03-07 110 2
#> 3: 2020-03-05 1 2020-03-08 122 3
#> 4: 2020-03-05 1 2020-03-09 156 4
#> 5: 2020-03-05 1 2020-03-10 155 5
#> ---
#> 1256: 2020-03-22 10 2020-03-25 594 3
#> 1257: 2020-03-22 10 2020-03-26 634 4
#> 1258: 2020-03-22 10 2020-03-27 747 5
#> 1259: 2020-03-22 10 2020-03-28 761 6
#> 1260: 2020-03-22 10 2020-03-29 813 7
- All functionality shown above is also supported for iterative forecasting.
Evaluate a model
In real world use we are likely to want to evaluate a model by
iteratively forecasting Rts and cases, summarising these forecasts,
scoring them and then returning them in a sensible format. These steps
are all contained in the evaluate_model
function.
model_eval <- evaluate_model(EpiSoon::example_obs_rts,
EpiSoon::example_obs_cases,
model = function(...) {
EpiSoon::bsts_model(model = function(ss, y) {
bsts::AddAutoAr(ss, y = y, lags = 10)
}, ...)
},
horizon = 21, samples = 10,
serial_interval = EpiSoon::example_serial_interval
)
model_eval
#> $forecast_rts
#> # A tibble: 399 × 10
#> forecast_date date horizon median mean sd bottom lower upper top
#> <chr> <date> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2020-03-04 2020-03-05 1 2.31 2.31 0.0582 2.24 2.29 2.33 2.45
#> 2 2020-03-04 2020-03-06 2 2.22 2.25 0.0994 2.16 2.20 2.26 2.51
#> 3 2020-03-04 2020-03-07 3 2.19 2.21 0.154 2.01 2.18 2.24 2.61
#> 4 2020-03-04 2020-03-08 4 2.14 2.17 0.184 2.04 2.12 2.17 2.68
#> 5 2020-03-04 2020-03-09 5 2.08 2.14 0.266 1.91 2.04 2.13 2.87
#> 6 2020-03-04 2020-03-10 6 2.00 2.10 0.314 1.88 1.88 2.01 2.96
#> 7 2020-03-04 2020-03-11 7 1.91 2.05 0.367 1.79 1.79 1.92 3.04
#> 8 2020-03-04 2020-03-12 8 1.84 2.02 0.438 1.72 1.72 1.85 3.22
#> 9 2020-03-04 2020-03-13 9 1.80 1.96 0.502 1.60 1.60 1.84 3.32
#> 10 2020-03-04 2020-03-14 10 1.75 1.95 0.547 1.61 1.61 1.77 3.45
#> # … with 389 more rows
#>
#> $rt_scores
#> forecast_date date horizon mad bias dss crps
#> 1: 2020-03-04 2020-03-05 1 0.02621350 0.4 -5.183370 0.02373664
#> 2: 2020-03-04 2020-03-06 2 0.04556057 0.6 -4.372989 0.02210787
#> 3: 2020-03-04 2020-03-07 3 0.05361625 0.6 -3.628796 0.03283926
#> 4: 2020-03-04 2020-03-08 4 0.04003312 0.4 -3.328042 0.02545941
#> 5: 2020-03-04 2020-03-09 5 0.07540549 0.2 -2.661907 0.02600294
#> ---
#> 167: 2020-03-19 2020-03-21 2 0.07523619 -1.0 -3.362381 0.05967052
#> 168: 2020-03-19 2020-03-22 3 0.06423035 -0.2 -5.308336 0.01954890
#> 169: 2020-03-20 2020-03-21 1 0.04090657 -1.0 -1.908536 0.05862092
#> 170: 2020-03-20 2020-03-22 2 0.02584708 0.4 -6.131101 0.01158014
#> 171: 2020-03-21 2020-03-22 1 0.04048833 1.0 -4.682066 0.03134641
#> log_score ae_median se_mean
#> 1: -1.7307525 0.04410196 1.860609e-03
#> 2: -1.8427888 0.03244357 3.103754e-03
#> 3: -1.4145811 0.04051442 4.634116e-03
#> 4: -1.6322361 0.03346696 4.958257e-03
#> 5: -1.5099326 0.01657254 5.892921e-03
#> ---
#> 167: -1.1293926 0.09808693 1.003378e-02
#> 168: -1.5256306 0.01518305 5.559022e-07
#> 169: -0.9157406 0.08765942 6.213299e-03
#> 170: -2.1144237 0.01294442 4.564807e-07
#> 171: -1.6940521 0.04758662 3.093697e-03
#>
#> $forecast_cases
#> # A tibble: 171 × 10
#> forecast_date date horizon median mean sd bottom lower upper top
#> <chr> <date> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2020-03-04 2020-03-05 1 73 71 10.1 56 72 83 83
#> 2 2020-03-04 2020-03-06 2 88 87.6 7.85 69 84 91 97
#> 3 2020-03-04 2020-03-07 3 103 103. 14.5 81 96 110 134
#> 4 2020-03-04 2020-03-08 4 130. 130 16.5 101 129 145 157
#> 5 2020-03-04 2020-03-09 5 148 152. 26.9 114 142 153 221
#> 6 2020-03-04 2020-03-10 6 180. 180. 44.0 121 155 182 290
#> 7 2020-03-04 2020-03-11 7 207 222. 78.0 128 189 220 427
#> 8 2020-03-04 2020-03-12 8 238. 262. 118. 142 217 251 577
#> 9 2020-03-04 2020-03-13 9 286. 327. 189. 162 256 325 844
#> 10 2020-03-04 2020-03-14 10 316. 385. 275. 206 276 369 1154
#> # … with 161 more rows
#>
#> $case_scores
#> sample forecast_date date horizon mad bias dss crps
#> 1: 1 2020-03-04 2020-03-05 1 11.1195 0.4 5.215464 5.86
#> 2: 1 2020-03-04 2020-03-06 2 5.1891 0.8 7.860179 11.54
#> 3: 1 2020-03-04 2020-03-07 3 10.3782 0.8 6.448490 9.07
#> 4: 1 2020-03-04 2020-03-08 4 17.7912 0.8 8.704857 19.44
#> 5: 1 2020-03-04 2020-03-09 5 8.8956 0.8 8.512762 25.26
#> ---
#> 167: 1 2020-03-19 2020-03-21 2 23.7216 -1.0 29.415136 105.94
#> 168: 1 2020-03-19 2020-03-22 3 39.2889 1.0 34.710048 162.49
#> 169: 1 2020-03-20 2020-03-21 1 34.8411 -1.0 28.104917 100.01
#> 170: 1 2020-03-20 2020-03-22 2 24.4629 1.0 57.013769 172.20
#> 171: 1 2020-03-21 2020-03-22 1 27.4281 1.0 60.975803 202.55
#> ae_median se_mean
#> 1: 10.0 64.00
#> 2: 15.0 213.16
#> 3: 15.0 228.01
#> 4: 28.5 784.00
#> 5: 32.0 1324.96
#> ---
#> 167: 121.5 14232.49
#> 168: 185.0 33124.00
#> 169: 119.0 12950.44
#> 170: 198.5 34856.89
#> 171: 223.0 47961.00
- All functionality outlined above can be applied to this output but a
special plotting function (
plot_forecast_evaluation
) is also provided. First evaluate the Rt forecast against observed values.
plot_forecast_evaluation(model_eval$forecast_rts,
EpiSoon::example_obs_rts,
horizon_to_plot = 7
)
- Then evaluate forecast cases against observed values.
plot_forecast_evaluation(model_eval$forecast_cases,
EpiSoon::example_obs_cases,
horizon_to_plot = 7
)
Wrapper functions
EpiSoon
provides several wrapper functions
(compare_models
and compare_timeseries
). These
both wrap evaluate_model
and can be used to rapidly explore
several forecasting models (compare_models
) against
multiple time series (compare_timeseries
). All lower level
summary and plotting functions can be then used with the output of these
wrappers to explore the results. See the function documentation for
further details.
Supporting generic modelling packages
EpiSoon
supports the use of generic forecasting models
if they are used in a wrapper that accepts a standardised set of inputs
and outputs its forecast in the form the package expects. Examples of
model wrappers are those for the bsts
and
fable
packages (bsts_model
and
fable_model
). See the examples and documentation for
fable_model
for further details.