`R/estimate_infections.R`

`estimate_infections.Rd`

Uses a non-parametric approach to reconstruct cases by date of infection from reported
cases. It uses either a generative Rt model or non-parametric back calculation to estimate underlying
latent infections and then maps these infections to observed cases via uncertain reporting delays and a flexible
observation model. See the examples and function arguments for the details of all options. The default settings
may not be sufficient for your use case, so the number of warmup samples (`stan_args = list(warmup)`) may need to
be increased, as may the overall number of samples. Follow the links provided by any warning messages to diagnose
issues with the MCMC fit. It is recommended to explore several of the supported Rt estimation approaches, as not all
of them may suit a given use case. See here for an example of using `estimate_infections` within the `epinow`
wrapper to estimate Rt for Covid-19 in a country from the ECDC data source.
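
The mapping from latent infections to observed cases can be pictured as a convolution of the infection series with a delay distribution. The following base R sketch is an illustration under simplifying assumptions, not the model the package fits: the delay is fixed rather than uncertain, there is no observation model, and `infections`, `delay_pmf` and `convolve_delay` are hypothetical names and values chosen for the example.

```r
# Illustrative sketch only: not the model EpiNow2 fits internally.
# `infections` and `delay_pmf` are made-up values for demonstration.
infections <- c(10, 20, 40, 80, 60, 30, 10)

# Probability that a case is reported 0, 1, 2 or 3 days after infection.
delay_pmf <- c(0.1, 0.4, 0.3, 0.2)

# Expected reports on day t: sum over d of infections[t - d] * delay_pmf[d + 1].
convolve_delay <- function(infections, delay_pmf) {
  n <- length(infections)
  expected <- numeric(n)
  for (t in seq_len(n)) {
    for (d in seq_along(delay_pmf) - 1) {
      if (t - d >= 1) {
        expected[t] <- expected[t] + infections[t - d] * delay_pmf[d + 1]
      }
    }
  }
  expected
}

expected_reports <- convolve_delay(infections, delay_pmf)
print(expected_reports)
```

The reported curve is a smoothed and shifted version of the infection curve, which is why recovering infections from reports requires inverting the delay rather than simply shifting dates.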

```
estimate_infections(
reported_cases,
generation_time,
delays = delay_opts(),
truncation = trunc_opts(),
rt = rt_opts(),
backcalc = backcalc_opts(),
gp = gp_opts(),
obs = obs_opts(),
stan = stan_opts(),
horizon = 7,
CrIs = c(0.2, 0.5, 0.9),
zero_threshold = 50,
id = "estimate_infections",
verbose = interactive()
)
```
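
The default `CrIs = c(0.2, 0.5, 0.9)` requests 20%, 50% and 90% credible intervals. As a rough sketch of what such intervals mean, the base R code below computes central (equal-tailed) intervals from a vector of draws. It assumes central intervals and uses simulated draws in place of posterior samples; `samples` and `central_interval` are hypothetical names, and the package's own summary code may differ in detail.

```r
# Illustrative sketch only: `samples` stands in for posterior draws of one
# quantity; the package's internal summary code may differ in detail.
set.seed(1)
samples <- rnorm(4000, mean = 1, sd = 0.1)

CrIs <- c(0.2, 0.5, 0.9)

# A central interval of width w spans the (1 - w) / 2 and (1 + w) / 2 quantiles.
central_interval <- function(x, width) {
  probs <- c((1 - width) / 2, (1 + width) / 2)
  bounds <- unname(quantile(x, probs = probs))
  c(lower = bounds[1], upper = bounds[2])
}

# One row per requested interval, with lower and upper bounds as columns.
intervals <- t(sapply(CrIs, function(w) central_interval(samples, w)))
rownames(intervals) <- paste0("CrI_", CrIs * 100)
print(intervals)
```

Wider intervals contain narrower ones, so the 90% bounds always bracket the 50% and 20% bounds.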

Argument | Description |
---|---|
reported_cases | A data frame of confirmed cases (confirm) by date (date). confirm must be integer and date must be in date format. |
generation_time | A list containing the mean, standard deviation of the mean (mean_sd), standard deviation (sd), standard deviation of the standard deviation (sd_sd) and the maximum allowed value (max) for the generation time (assuming a gamma distribution). |
delays | A call to |
truncation | A list of options as generated by |
rt | A list of options as generated by |
backcalc | A list of options as generated by |
gp | A list of options as generated by |
obs | A list of options as generated by |
stan | A list of stan options as generated by |
horizon | Numeric, defaults to 7. Number of days into the future to forecast. |
CrIs | Numeric vector of credible intervals to calculate. |
zero_threshold | Numeric, defaults to 50. Indicates if detected zero cases are meaningful by using a threshold of 50 cases on average over the last 7 days. If the average is above this threshold then the zero is replaced with the backwards looking rolling average. If set to infinity then no changes are made. |
id | A character string used to assign logging information on error. Used by |
verbose | Logical, defaults to |
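
The `reported_cases` format and the `zero_threshold` rule can be illustrated with a small base R sketch. This is one reading of the documented rule, not the package's code: `replace_suspect_zeros` is a hypothetical helper, the case counts are made up, and the sketch assumes the rolling average covers the 7 days before the zero and excludes the zero itself.

```r
# Illustrative sketch of the documented zero_threshold rule. This is not
# the package's code; `replace_suspect_zeros` is a hypothetical helper.
reported_cases <- data.frame(
  date = as.Date("2020-03-01") + 0:9,
  confirm = as.integer(c(60, 70, 65, 80, 75, 90, 85, 0, 95, 100))
)

replace_suspect_zeros <- function(cases, zero_threshold = 50, window = 7) {
  confirm <- cases$confirm
  for (i in seq_along(confirm)) {
    if (confirm[i] == 0 && i > 1) {
      # Backwards-looking mean over up to `window` preceding days.
      past <- confirm[max(1, i - window):(i - 1)]
      rolling_mean <- mean(past)
      # Only treat the zero as a reporting artefact when recent counts are high.
      if (rolling_mean > zero_threshold) {
        confirm[i] <- as.integer(round(rolling_mean))
      }
    }
  }
  cases$confirm <- confirm
  cases
}

cleaned <- replace_suspect_zeros(reported_cases)
untouched <- replace_suspect_zeros(reported_cases, zero_threshold = Inf)
print(cleaned)
```

With `zero_threshold = Inf` no rolling average can exceed the threshold, so the data pass through unchanged, matching the behaviour described for an infinite threshold.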

See also: `epinow`, `regional_epinow`, `forecast_infections`, `simulate_infections`

```
# \donttest{
# set number of cores to use
options(mc.cores = ifelse(interactive(), 4, 1))
# get example case counts
reported_cases <- example_confirmed[1:60]
# set up example generation time
generation_time <- get_generation_time(disease = "SARS-CoV-2", source = "ganyani")
# set delays between infection and case report
incubation_period <- get_incubation_period(disease = "SARS-CoV-2", source = "lauer")
reporting_delay <- list(
  mean = convert_to_logmean(2, 1), mean_sd = 0.1,
  sd = convert_to_logsd(2, 1), sd_sd = 0.1, max = 10
)
# default setting
# here we assume that the observed data is truncated by the same delay as
def <- estimate_infections(reported_cases,
  generation_time = generation_time,
  delays = delay_opts(incubation_period, reporting_delay),
  rt = rt_opts(prior = list(mean = 2, sd = 0.1)),
  stan = stan_opts(control = list(adapt_delta = 0.95))
)
# real time estimates
summary(def)
#>                                  measure               estimate
#> 1: New confirmed cases by infection date    2230 (1139 -- 3993)
#> 2:        Expected change in daily cases      Likely decreasing
#> 3:            Effective reproduction no.     0.87 (0.61 -- 1.1)
#> 4:                        Rate of growth -0.037 (-0.11 -- 0.04)
#> 5:          Doubling/halving time (days)       -19 (17 -- -6.1)
# summary plot
plot(def)
# decreasing the accuracy of the approximate Gaussian process to speed up computation.
# These settings are an area of active research. See ?gp_opts for details.
agp <- estimate_infections(reported_cases,
  generation_time = generation_time,
  delays = delay_opts(incubation_period, reporting_delay),
  rt = rt_opts(prior = list(mean = 2, sd = 0.1)),
  gp = gp_opts(ls_min = 10, basis_prop = 0.1),
  stan = stan_opts(control = list(adapt_delta = 0.95))
)
summary(agp)
#>                                  measure                estimate
#> 1: New confirmed cases by infection date     2306 (1151 -- 4185)
#> 2:        Expected change in daily cases       Likely decreasing
#> 3:            Effective reproduction no.      0.88 (0.61 -- 1.2)
#> 4:                        Rate of growth -0.032 (-0.12 -- 0.041)
#> 5:          Doubling/halving time (days)          -21 (17 -- -6)
plot(agp)
# Adjusting for future susceptible depletion
dep <- estimate_infections(reported_cases,
  generation_time = generation_time,
  delays = delay_opts(incubation_period, reporting_delay),
  rt = rt_opts(
    prior = list(mean = 2, sd = 0.1),
    pop = 1000000, future = "latest"
  ),
  gp = gp_opts(ls_min = 10, basis_prop = 0.1), horizon = 21,
  stan = stan_opts(control = list(adapt_delta = 0.95))
)
plot(dep)
# Adjusting for truncation of the most recent data
# See estimate_truncation for an approach to estimating this from data
trunc_dist <- list(
  mean = convert_to_logmean(0.5, 0.5), mean_sd = 0.1,
  sd = convert_to_logsd(0.5, 0.5), sd_sd = 0.1,
  max = 3
)
trunc <- estimate_infections(reported_cases,
  generation_time = generation_time,
  delays = delay_opts(incubation_period, reporting_delay),
  truncation = trunc_opts(trunc_dist),
  rt = rt_opts(prior = list(mean = 2, sd = 0.1)),
  gp = gp_opts(ls_min = 10, basis_prop = 0.1),
  stan = stan_opts(control = list(adapt_delta = 0.95))
)
plot(trunc)
# using back calculation (combined here with under-reporting)
# this model is on the order of 10 to 100 times faster than the Gaussian process method
# it is likely robust for retrospective Rt but less reliable for real-time estimates
# the width of the prior window controls the reliance on observed data and can be
# optionally switched off using backcalc_opts(prior = "none"), see ?backcalc_opts for
# other options
backcalc <- estimate_infections(reported_cases,
  generation_time = generation_time,
  delays = delay_opts(incubation_period, reporting_delay),
  rt = NULL, backcalc = backcalc_opts(),
  obs = obs_opts(scale = list(mean = 0.4, sd = 0.05)),
  horizon = 0
)
plot(backcalc)
# Rt projected into the future using the Gaussian process
project_rt <- estimate_infections(reported_cases,
  generation_time = generation_time,
  delays = delay_opts(incubation_period, reporting_delay),
  rt = rt_opts(
    prior = list(mean = 2, sd = 0.1),
    future = "project"
  )
)
plot(project_rt)
# default settings on a later snapshot of data
snapshot_cases <- example_confirmed[80:130]
snapshot <- estimate_infections(snapshot_cases,
  generation_time = generation_time,
  delays = delay_opts(incubation_period, reporting_delay),
  rt = rt_opts(prior = list(mean = 1, sd = 0.1))
)
plot(snapshot)
# stationary Rt assumption (likely to provide biased real-time estimates)
stat <- estimate_infections(reported_cases,
  generation_time = generation_time,
  delays = delay_opts(incubation_period, reporting_delay),
  rt = rt_opts(prior = list(mean = 2, sd = 0.1), gp_on = "R0")
)
plot(stat)
# no Gaussian process (i.e. fixed Rt assuming no breakpoints)
fixed <- estimate_infections(reported_cases,
  generation_time = generation_time,
  delays = delay_opts(incubation_period, reporting_delay),
  gp = NULL
)
plot(fixed)
# no delays
no_delay <- estimate_infections(reported_cases, generation_time = generation_time)
plot(no_delay)
# break point but otherwise static Rt
bp_cases <- data.table::copy(reported_cases)
bp_cases <- bp_cases[, breakpoint := ifelse(date == as.Date("2020-03-16"), 1, 0)]
bkp <- estimate_infections(bp_cases,
  generation_time = generation_time,
  delays = delay_opts(incubation_period, reporting_delay),
  rt = rt_opts(prior = list(mean = 2, sd = 0.1)),
  gp = NULL
)
# break point effect
summary(bkp, type = "parameters", params = "breakpoints")
#>    date    variable strat type     median       mean         sd lower_90
#> 1: <NA> breakpoints     1 <NA> -0.6606923 -0.6608357 0.02712925 -0.70757
#>      lower_50   lower_20   upper_20   upper_50   upper_90
#> 1: -0.6786426 -0.6672251 -0.6538984 -0.6418164 -0.6170302
plot(bkp)
# weekly random walk
rw <- estimate_infections(reported_cases,
  generation_time = generation_time,
  delays = delay_opts(incubation_period, reporting_delay),
  rt = rt_opts(prior = list(mean = 2, sd = 0.1), rw = 7),
  gp = NULL
)
# random walk effects
summary(rw, type = "parameters", params = "breakpoints")
#>    date    variable strat type      median        mean         sd   lower_90
#> 1: <NA> breakpoints     1 <NA> -0.13409608 -0.13314217 0.06779037 -0.2400401
#> 2: <NA> breakpoints     2 <NA> -0.14472360 -0.14478126 0.07749732 -0.2692223
#> 3: <NA> breakpoints     3 <NA> -0.21479129 -0.21758278 0.08211905 -0.3580389
#> 4: <NA> breakpoints     4 <NA> -0.28827624 -0.29315359 0.09051555 -0.4500667
#> 5: <NA> breakpoints     5 <NA> -0.05795192 -0.05901787 0.09633081 -0.2118880
#> 6: <NA> breakpoints     6 <NA>  0.04884797  0.04769352 0.09340554 -0.1022220
#> 7: <NA> breakpoints     7 <NA> -0.06700372 -0.06862683 0.10548072 -0.2429219
#> 8: <NA> breakpoints     8 <NA> -0.01835025 -0.02120007 0.17050696 -0.2971596
#>       lower_50    lower_20    upper_20     upper_50    upper_90
#> 1: -0.18080498 -0.15215844 -0.11665456 -0.088244270 -0.02161150
#> 2: -0.19556506 -0.16275740 -0.12661684 -0.095132252 -0.01761154
#> 3: -0.26967484 -0.23552996 -0.19710036 -0.163959202 -0.08184366
#> 4: -0.35104344 -0.31178402 -0.26863997 -0.230931653 -0.15097147
#> 5: -0.12692648 -0.08581198 -0.03441650  0.004459652  0.09611647
#> 6: -0.01329567  0.02435766  0.07069061  0.106418507  0.20341203
#> 7: -0.13490572 -0.09270474 -0.04267370  0.001941989  0.09928655
#> 8: -0.13574712 -0.05991371  0.02440390  0.089000369  0.25225013
plot(rw)
# }
```
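
The "Rate of growth" and "Doubling/halving time (days)" rows in the summaries above are linked by the standard relation between an exponential growth rate r and doubling time, log(2) / r. The base R sketch below applies that relation to the rounded values printed in the first summary. `growth_to_doubling` is a hypothetical helper, and because the inputs are rounded the results only approximately match the printed table.

```r
# Growth rate r (per day) to doubling time (r > 0) or halving time (r < 0).
# `growth_to_doubling` is a hypothetical helper for illustration.
growth_to_doubling <- function(r) log(2) / r

# Rounded values copied from the first summary above: median, lower and upper.
r <- c(median = -0.037, lower = -0.11, upper = 0.04)
print(signif(growth_to_doubling(r), 2))
```

When the growth rate interval spans zero, as it does here, the transformed bounds have opposite signs: one end is a halving time and the other a doubling time. This is why the interval in the summary does not read as a simple low-to-high range.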