Estimate Infections, the Time-Varying Reproduction Number and the Rate of Growth
Source: R/estimate_infections.R
estimate_infections.Rd
Uses a non-parametric approach to reconstruct cases by date of infection
from reported cases. It uses either a generative Rt model or non-parametric
back calculation to estimate underlying latent infections and then maps
these infections to observed cases via uncertain reporting delays and a
flexible observation model. See the examples and function arguments for the
details of all options. The default settings may not be sufficient for your
use case, so the number of warmup samples (stan_args = list(warmup)) may
need to be increased, as may the overall number of samples. Follow the links
provided by any warning messages to diagnose issues with the MCMC fit. It
is recommended to explore several of the Rt estimation approaches supported,
as not all of them may be suited to users' own use cases. See here for an
example of using estimate_infections within the epinow wrapper to
estimate Rt for Covid-19 in a country from the ECDC data source.
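The two-stage structure described above (latent infections generated from Rt, then mapped to reported cases through a delay) can be illustrated with a small deterministic sketch in base R. This is only a simplified illustration under stated assumptions, not the package's Stan model: the helper name simulate_reports, the seed value, and both probability mass functions are hypothetical, and no observation noise or uncertainty is included.

```r
# Hypothetical helper (not part of the package): a discrete renewal equation
# followed by a convolution with a reporting delay distribution.
simulate_reports <- function(rt, gt_pmf, delay_pmf, seed_infections = 10) {
  n <- length(rt)
  infections <- numeric(n)
  infections[1] <- seed_infections
  for (t in seq_len(n)[-1]) {
    # past infections weighted by the generation time PMF (lags 1, 2, ...)
    lags <- seq_len(min(t - 1, length(gt_pmf)))
    infections[t] <- rt[t] * sum(infections[t - lags] * gt_pmf[lags])
  }
  reports <- numeric(n)
  for (t in seq_len(n)) {
    # delay_pmf[1] is the probability of a zero-day delay
    d <- seq_len(min(t, length(delay_pmf)))
    reports[t] <- sum(infections[t - d + 1] * delay_pmf[d])
  }
  data.frame(day = seq_len(n), infections = infections, reports = reports)
}

gt_pmf <- c(0.2, 0.5, 0.3)    # illustrative generation time PMF, lags 1 to 3
delay_pmf <- c(0.1, 0.6, 0.3) # illustrative delay PMF, delays of 0 to 2 days
sim <- simulate_reports(rt = rep(1.2, 20), gt_pmf, delay_pmf)
head(sim)
```

estimate_infections works in the opposite direction: it starts from the reported cases and infers the infections and Rt that are consistent with them, given the generation time and delay distributions.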
Usage
estimate_infections(
reported_cases,
generation_time = generation_time_opts(),
delays = delay_opts(),
truncation = trunc_opts(),
rt = rt_opts(),
backcalc = backcalc_opts(),
gp = gp_opts(),
obs = obs_opts(),
stan = stan_opts(),
horizon = 7,
CrIs = c(0.2, 0.5, 0.9),
filter_leading_zeros = TRUE,
zero_threshold = Inf,
weigh_delay_priors = TRUE,
id = "estimate_infections",
verbose = interactive()
)
Arguments
- reported_cases
A data frame of confirmed cases (confirm) by date (date). confirm must be integer and date must be in date format.
- generation_time
A call to generation_time_opts() defining the generation time distribution used. For backwards compatibility a list of summary parameters can also be passed.
- delays
A call to delay_opts() defining delay distributions and options. See the documentation of delay_opts() and the examples below for details.
- truncation
A call to trunc_opts() defining the truncation of observed data. Defaults to trunc_opts(). See estimate_truncation() for an approach to estimating truncation from data.
- rt
A list of options as generated by rt_opts() defining Rt estimation. Defaults to rt_opts(). Set to NULL to switch to using back calculation rather than generating infections using Rt.
- backcalc
A list of options as generated by backcalc_opts() to define the back calculation. Defaults to backcalc_opts().
- gp
A list of options as generated by gp_opts() to define the Gaussian process. Defaults to gp_opts(). Set to NULL to disable the Gaussian process.
- obs
A list of options as generated by obs_opts() defining the observation model. Defaults to obs_opts().
- stan
A list of stan options as generated by stan_opts(). Defaults to stan_opts(). Can be used to override data, init, and verbose settings if desired.
- horizon
Numeric, defaults to 7. Number of days into the future to forecast.
- CrIs
Numeric vector of credible intervals to calculate.
- filter_leading_zeros
Logical, defaults to TRUE. Should zeros at the start of the time series be filtered out.
- zero_threshold
Numeric, defaults to Inf. Indicates whether detected zero cases are meaningful, using a threshold on the 7-day average number of cases. If the average is above this threshold, the zero is replaced with the backwards-looking rolling average. If set to infinity, no changes are made.
- weigh_delay_priors
Logical. If TRUE (default), all delay distribution priors will be weighted by the number of observation data points, in doing so approximately placing an independent prior at each time step and usually preventing the posteriors from shifting. If FALSE, no weight will be applied, i.e. delay distributions will be treated as a single parameter.
- id
A character string used to assign logging information on error. Used by regional_epinow to assign errors to regions. Alter the default to run with error catching.
- verbose
Logical, defaults to TRUE when used interactively and otherwise FALSE. Should verbose debug progress messages be printed. Corresponds to the "DEBUG" level from futile.logger. See setup_logging for more detailed logging options.
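As a rough illustration of the input format expected by reported_cases and of the rule described for zero_threshold, the base R sketch below builds a small data frame with a Date column and an integer confirm column, then replaces a zero with a backwards-looking 7-day average when that average exceeds the threshold. The helper replace_zeros is hypothetical, and the choices to average over the seven preceding days and to round the result are assumptions made for illustration; the package's internal handling may differ in detail.

```r
# Input in the shape described for reported_cases: a Date column and an
# integer count column. The values are made up for illustration.
cases <- data.frame(
  date = as.Date("2020-03-01") + 0:9,
  confirm = as.integer(c(40, 52, 61, 58, 70, 66, 75, 0, 81, 90))
)

# Hypothetical helper: replace implausible zeros with a trailing average.
# Assumption: the average covers up to `window` days before the zero.
replace_zeros <- function(confirm, zero_threshold = Inf, window = 7) {
  out <- confirm
  for (i in which(confirm == 0)) {
    if (i == 1) next # no earlier days to average over
    prior <- confirm[max(1, i - window):(i - 1)]
    avg <- mean(prior)
    if (avg > zero_threshold) out[i] <- as.integer(round(avg))
  }
  out
}

# With the default (Inf) nothing changes; with a threshold of 50 the zero on
# day 8 is treated as a reporting artefact and replaced.
unchanged <- replace_zeros(cases$confirm)
adjusted <- replace_zeros(cases$confirm, zero_threshold = 50)
```

Leaving zero_threshold at Inf is the conservative choice when zeros may be genuine, for example early in an outbreak or in small populations.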
Value
A list of output including: posterior samples, summarised posterior samples, data used to fit the model, and the fit object itself.
Examples
# \donttest{
# set number of cores to use
old_opts <- options()
options(mc.cores = ifelse(interactive(), 4, 1))
# get example case counts
reported_cases <- example_confirmed[1:60]
# set up example generation time
generation_time <- get_generation_time(
disease = "SARS-CoV-2", source = "ganyani", fixed = TRUE
)
# set delays between infection and case report
incubation_period <- get_incubation_period(
disease = "SARS-CoV-2", source = "lauer", fixed = TRUE
)
# delays between infection and case report, with uncertainty
incubation_period_uncertain <- get_incubation_period(
disease = "SARS-CoV-2", source = "lauer"
)
reporting_delay <- dist_spec(
mean = convert_to_logmean(2, 1), mean_sd = 0,
sd = convert_to_logsd(2, 1), sd_sd = 0, max = 10
)
# default settings but assuming that delays are fixed rather than uncertain
def <- estimate_infections(reported_cases,
generation_time = generation_time_opts(generation_time),
delays = delay_opts(incubation_period + reporting_delay),
rt = rt_opts(prior = list(mean = 2, sd = 0.1)),
stan = stan_opts(control = list(adapt_delta = 0.95))
)
#> Warning: There were 3 divergent transitions after warmup. See
#> https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
#> to find out why this is a problem and how to eliminate them.
#> Warning: Examine the pairs() plot to diagnose sampling problems
# real time estimates
summary(def)
#> measure estimate
#> 1: New confirmed cases by infection date 2285 (1101 -- 4432)
#> 2: Expected change in daily cases Likely decreasing
#> 3: Effective reproduction no. 0.88 (0.61 -- 1.2)
#> 4: Rate of growth -0.026 (-0.098 -- 0.038)
#> 5: Doubling/halving time (days) -26 (18 -- -7.1)
# summary plot
plot(def)
# decreasing the accuracy of the approximate Gaussian process to speed up
# computation.
# These settings are an area of active research. See ?gp_opts for details.
agp <- estimate_infections(reported_cases,
generation_time = generation_time_opts(generation_time),
delays = delay_opts(incubation_period + reporting_delay),
rt = rt_opts(prior = list(mean = 2, sd = 0.1)),
gp = gp_opts(ls_min = 10, basis_prop = 0.1),
stan = stan_opts(control = list(adapt_delta = 0.95))
)
#> Warning: There were 6 divergent transitions after warmup. See
#> https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
#> to find out why this is a problem and how to eliminate them.
#> Warning: Examine the pairs() plot to diagnose sampling problems
summary(agp)
#> measure estimate
#> 1: New confirmed cases by infection date 2377 (1227 -- 4458)
#> 2: Expected change in daily cases Likely decreasing
#> 3: Effective reproduction no. 0.89 (0.64 -- 1.2)
#> 4: Rate of growth -0.024 (-0.089 -- 0.039)
#> 5: Doubling/halving time (days) -29 (18 -- -7.8)
plot(agp)
# Adjusting for future susceptible depletion
dep <- estimate_infections(reported_cases,
generation_time = generation_time_opts(generation_time),
delays = delay_opts(incubation_period + reporting_delay),
rt = rt_opts(
prior = list(mean = 2, sd = 0.1),
pop = 1000000, future = "latest"
),
gp = gp_opts(ls_min = 10, basis_prop = 0.1), horizon = 21,
stan = stan_opts(control = list(adapt_delta = 0.95))
)
#> Warning: There were 4 divergent transitions after warmup. See
#> https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
#> to find out why this is a problem and how to eliminate them.
#> Warning: Examine the pairs() plot to diagnose sampling problems
plot(dep)
# Adjusting for truncation of the most recent data
# See estimate_truncation for an approach to estimating this from data
trunc_dist <- dist_spec(
mean = convert_to_logmean(0.5, 0.5), mean_sd = 0.1,
sd = convert_to_logsd(0.5, 0.5), sd_sd = 0.1,
max = 3
)
trunc <- estimate_infections(reported_cases,
generation_time = generation_time_opts(generation_time),
delays = delay_opts(incubation_period + reporting_delay),
truncation = trunc_opts(trunc_dist),
rt = rt_opts(prior = list(mean = 2, sd = 0.1)),
gp = gp_opts(ls_min = 10, basis_prop = 0.1),
stan = stan_opts(control = list(adapt_delta = 0.95))
)
#> Warning: There were 9 divergent transitions after warmup. See
#> https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
#> to find out why this is a problem and how to eliminate them.
#> Warning: Examine the pairs() plot to diagnose sampling problems
plot(trunc)
# using back calculation (combined here with under reporting)
# this model is on the order of 10 to 100 times faster than the Gaussian
# process method
# it is likely robust for retrospective Rt but less reliable for real time
# estimates
# the width of the prior window controls the reliance on observed data and
# can be optionally switched off using backcalc_opts(prior = "none"),
# see ?backcalc_opts for other options
backcalc <- estimate_infections(reported_cases,
generation_time = generation_time_opts(generation_time),
delays = delay_opts(incubation_period + reporting_delay),
rt = NULL, backcalc = backcalc_opts(),
obs = obs_opts(scale = list(mean = 0.4, sd = 0.05)),
horizon = 0
)
#> Warning: The following variables have undefined values: gt_rev_pmf[1],The following variables have undefined values: gt_rev_pmf[2],The following variables have undefined values: gt_rev_pmf[3],The following variables have undefined values: gt_rev_pmf[4],The following variables have undefined values: gt_rev_pmf[5],The following variables have undefined values: gt_rev_pmf[6],The following variables have undefined values: gt_rev_pmf[7],The following variables have undefined values: gt_rev_pmf[8],The following variables have undefined values: gt_rev_pmf[9],The following variables have undefined values: gt_rev_pmf[10],The following variables have undefined values: gt_rev_pmf[11],The following variables have undefined values: gt_rev_pmf[12],The following variables have undefined values: gt_rev_pmf[13],The following variables have undefined values: gt_rev_pmf[14],The following variables have undefined values: gt_rev_pmf[15]. Many subsequent functions will not work correctly.
plot(backcalc)
# Rt projected into the future using the Gaussian process
project_rt <- estimate_infections(reported_cases,
generation_time = generation_time_opts(generation_time),
delays = delay_opts(incubation_period + reporting_delay),
rt = rt_opts(
prior = list(mean = 2, sd = 0.1),
future = "project"
)
)
#> Warning: There were 8 divergent transitions after warmup. See
#> https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
#> to find out why this is a problem and how to eliminate them.
#> Warning: Examine the pairs() plot to diagnose sampling problems
plot(project_rt)
# default settings on a later snapshot of data
snapshot_cases <- example_confirmed[80:130]
snapshot <- estimate_infections(snapshot_cases,
generation_time = generation_time_opts(generation_time),
delays = delay_opts(incubation_period + reporting_delay),
rt = rt_opts(prior = list(mean = 1, sd = 0.1))
)
#> Warning: There were 17 divergent transitions after warmup. See
#> https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
#> to find out why this is a problem and how to eliminate them.
#> Warning: Examine the pairs() plot to diagnose sampling problems
plot(snapshot)
# stationary Rt assumption (likely to provide biased real-time estimates)
# with uncertain reporting delays
stat <- estimate_infections(reported_cases,
generation_time = generation_time_opts(generation_time),
delays = delay_opts(incubation_period_uncertain + reporting_delay),
rt = rt_opts(prior = list(mean = 2, sd = 0.1), gp_on = "R0")
)
plot(stat)
# no Gaussian process (i.e. fixed Rt assuming no breakpoints)
# with uncertain reporting delays
fixed <- estimate_infections(reported_cases,
generation_time = generation_time_opts(generation_time),
delays = delay_opts(incubation_period_uncertain + reporting_delay),
gp = NULL
)
plot(fixed)
# no delays
no_delay <- estimate_infections(
reported_cases,
generation_time = generation_time_opts(generation_time)
)
#> Warning: There were 33 divergent transitions after warmup. See
#> https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
#> to find out why this is a problem and how to eliminate them.
#> Warning: Examine the pairs() plot to diagnose sampling problems
#> Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#bulk-ess
#> Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#tail-ess
plot(no_delay)
# break point but otherwise static Rt
# with uncertain reporting delays
bp_cases <- data.table::copy(reported_cases)
bp_cases <- bp_cases[,
breakpoint := ifelse(date == as.Date("2020-03-16"), 1, 0)
]
bkp <- estimate_infections(bp_cases,
generation_time = generation_time_opts(generation_time),
delays = delay_opts(incubation_period_uncertain + reporting_delay),
rt = rt_opts(prior = list(mean = 2, sd = 0.1)),
gp = NULL
)
# break point effect
summary(bkp, type = "parameters", params = "breakpoints")
#> date variable strat type median mean sd lower_90
#> 1: <NA> breakpoints 1 <NA> -0.6727372 -0.6727021 0.02994277 -0.7205137
#> lower_50 lower_20 upper_20 upper_50 upper_90
#> 1: -0.6935414 -0.6800491 -0.6646493 -0.6517078 -0.6251641
plot(bkp)
# weekly random walk
# with uncertain reporting delays
rw <- estimate_infections(reported_cases,
generation_time = generation_time_opts(generation_time),
delays = delay_opts(incubation_period_uncertain + reporting_delay),
rt = rt_opts(prior = list(mean = 2, sd = 0.1), rw = 7),
gp = NULL
)
# random walk effects
summary(rw, type = "parameters", params = "breakpoints")
#> date variable strat type median mean sd lower_90
#> 1: <NA> breakpoints 1 <NA> -0.13434895 -0.13445464 0.07143227 -0.2510856
#> 2: <NA> breakpoints 2 <NA> -0.16754973 -0.16901524 0.08428734 -0.3099213
#> 3: <NA> breakpoints 3 <NA> -0.21291696 -0.21385606 0.08962088 -0.3577062
#> 4: <NA> breakpoints 4 <NA> -0.28130710 -0.28116723 0.09848509 -0.4444970
#> 5: <NA> breakpoints 5 <NA> -0.06520991 -0.06574470 0.09953010 -0.2347778
#> 6: <NA> breakpoints 6 <NA> 0.03790157 0.04066258 0.10240392 -0.1187247
#> 7: <NA> breakpoints 7 <NA> -0.05340166 -0.05351565 0.11713635 -0.2535568
#> 8: <NA> breakpoints 8 <NA> -0.02005072 -0.01872663 0.17220665 -0.2871967
#> lower_50 lower_20 upper_20 upper_50 upper_90
#> 1: -0.18405093 -0.15404417 -0.11712145 -0.084924048 -0.01575826
#> 2: -0.22323940 -0.18867072 -0.14860125 -0.114002708 -0.03101021
#> 3: -0.27313376 -0.23832056 -0.19327687 -0.156736751 -0.06269791
#> 4: -0.34324107 -0.30384546 -0.25414151 -0.213123551 -0.12764622
#> 5: -0.13321403 -0.08987899 -0.03826645 0.001080482 0.09518668
#> 6: -0.02739738 0.01071721 0.06522339 0.106367566 0.21324583
#> 7: -0.12597604 -0.08105306 -0.02189179 0.023872316 0.13462128
#> 8: -0.12876121 -0.06215569 0.02284186 0.091843829 0.26416713
plot(rw)
options(old_opts)
# }
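The reporting_delay and trunc_dist objects in the examples above are specified on the log scale, using convert_to_logmean() and convert_to_logsd() to translate a natural-scale mean and standard deviation. The sketch below shows the standard moment-matching formulas for a log-normal distribution in base R. It is assumed, not confirmed by this page, that the package helpers compute the same quantities; the names to_logmean and to_logsd are hypothetical stand-ins.

```r
# Hypothetical stand-ins using the standard log-normal moment-matching
# formulas for a natural-scale mean and standard deviation.
to_logmean <- function(mean, sd) log(mean^2 / sqrt(sd^2 + mean^2))
to_logsd <- function(mean, sd) sqrt(log(1 + sd^2 / mean^2))

# A delay with a natural-scale mean of 2 days and sd of 1 day
mu <- to_logmean(2, 1)
sigma <- to_logsd(2, 1)

# Round trip back to the natural scale as a check on the formulas
natural_mean <- exp(mu + sigma^2 / 2)
natural_sd <- sqrt((exp(sigma^2) - 1) * exp(2 * mu + sigma^2))
c(mean = natural_mean, sd = natural_sd)
```

Working on the log scale keeps the delay strictly positive while still letting the prior be stated in terms of days.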