[Maturing] Uses a non-parametric approach to reconstruct cases by date of infection from reported cases. It uses either a generative Rt model or non-parametric back calculation to estimate the underlying latent infections, and then maps these infections to observed cases via uncertain reporting delays and a flexible observation model. See the examples and function arguments for details of all options. The default settings may not be sufficient for your use case: the number of warmup samples (set through the stan argument; see stan_opts()) may need to be increased, as may the overall number of samples. Follow the links provided by any warning messages to diagnose issues with the MCMC fit. It is recommended to explore several of the supported Rt estimation approaches, as not all of them may suit a user's own use case. See here for an example of using estimate_infections within the epinow wrapper to estimate Rt for Covid-19 in a country from the ECDC data source.
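
The mapping from latent infections to expected reports described above can be pictured as a discrete convolution with a delay distribution. The following is a minimal base R sketch of that idea only. The helper name convolve_delay, the infection counts, and the delay probabilities are all hypothetical, and the package's own model additionally treats the delays as uncertain and adds an observation model.

```r
# Hypothetical helper (not part of the package): expected reports per day
# given latent infections and a discrete delay distribution, where
# delay_pmf[d + 1] is the probability of a delay of d days.
convolve_delay <- function(infections, delay_pmf) {
  n <- length(infections)
  expected <- numeric(n)
  for (t in seq_len(n)) {
    for (d in 0:(length(delay_pmf) - 1)) {
      # infections on day t - d are reported on day t with prob delay_pmf[d + 1]
      if (t - d >= 1) {
        expected[t] <- expected[t] + infections[t - d] * delay_pmf[d + 1]
      }
    }
  }
  expected
}

infections <- c(10, 20, 40, 80, 160)  # illustrative latent infections
delay_pmf <- c(0.2, 0.5, 0.3)         # illustrative delays of 0, 1 and 2 days
expected_cases <- convolve_delay(infections, delay_pmf)
print(expected_cases)
```

Early days are under-counted in this sketch because infections before the first day are not represented.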

estimate_infections(
  reported_cases,
  generation_time,
  delays = delay_opts(),
  truncation = trunc_opts(),
  rt = rt_opts(),
  backcalc = backcalc_opts(),
  gp = gp_opts(),
  obs = obs_opts(),
  stan = stan_opts(),
  horizon = 7,
  CrIs = c(0.2, 0.5, 0.9),
  zero_threshold = 50,
  id = "estimate_infections",
  verbose = interactive()
)

Arguments

reported_cases

A data frame of confirmed cases (confirm) by date (date). confirm must be integer and date must be in date format.
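
As a rough illustration of the expected input shape, the sketch below builds a small data frame with a date column of class Date and an integer confirm column. The dates and counts are made up for illustration and should be replaced with real surveillance data.

```r
# Illustrative values only: five consecutive days of integer case counts.
reported_cases <- data.frame(
  date = seq(as.Date("2020-03-01"), by = "day", length.out = 5),
  confirm = c(12L, 15L, 9L, 20L, 31L)
)

# Basic checks mirroring the requirements stated above
stopifnot(
  inherits(reported_cases$date, "Date"),
  is.integer(reported_cases$confirm)
)
str(reported_cases)
```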

generation_time

A list containing the mean (mean), standard deviation of the mean (mean_sd), standard deviation (sd), standard deviation of the standard deviation (sd_sd), and the maximum allowed value (max) for the generation time (assuming a gamma distribution).
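
To make the gamma assumption concrete, the sketch below converts a mean and standard deviation into gamma shape and rate parameters and discretises the result onto whole days up to a maximum. The helper discretise_gamma and the numeric values are hypothetical, and the discretisation used inside the package may differ from this one.

```r
# Hypothetical helper (not part of the package): moment-match a gamma
# distribution to a mean and sd, then discretise it onto days 0, 1, ..., max.
discretise_gamma <- function(mean, sd, max) {
  shape <- mean^2 / sd^2
  rate <- mean / sd^2
  days <- 0:max
  # probability mass falling in each one-day interval [d, d + 1)
  pmf <- pgamma(days + 1, shape = shape, rate = rate) -
    pgamma(days, shape = shape, rate = rate)
  pmf / sum(pmf)  # renormalise after truncating the upper tail
}

# Illustrative values only
gt_pmf <- discretise_gamma(mean = 4, sd = 2, max = 30)
print(round(gt_pmf[1:10], 3))
```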

delays

A call to delay_opts() defining delay distributions and options. See the documentation of delay_opts() and the examples below for details.

truncation

[Experimental] A list of options as generated by trunc_opts() defining the truncation of observed data. Defaults to trunc_opts(). See estimate_truncation() for an approach to estimating truncation from data.

rt

A list of options as generated by rt_opts() defining Rt estimation. Defaults to rt_opts(). Set to NULL to switch to using back calculation rather than generating infections using Rt.

backcalc

A list of options as generated by backcalc_opts() to define the back calculation. Defaults to backcalc_opts().

gp

A list of options as generated by gp_opts() to define the Gaussian process. Defaults to gp_opts(). Set to NULL to disable the Gaussian process.

obs

A list of options as generated by obs_opts() defining the observation model. Defaults to obs_opts().

stan

A list of stan options as generated by stan_opts(). Defaults to stan_opts(). Can be used to override data, init, and verbose settings if desired.

horizon

Numeric, defaults to 7. Number of days into the future to forecast.

CrIs

Numeric vector of credible intervals to calculate.
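
The sketch below shows one common way to turn posterior samples into intervals of this kind, assuming central (equal-tailed) intervals. The helper summarise_cris is hypothetical and the random draws stand in for real posterior samples. The lower_ and upper_ naming follows the columns shown in the example output further down, but whether the package computes its intervals exactly this way is an assumption.

```r
# Hypothetical helper (not part of the package): central credible intervals
# from a vector of posterior samples.
summarise_cris <- function(samples, CrIs = c(0.2, 0.5, 0.9)) {
  out <- c(median = unname(quantile(samples, 0.5)))
  for (cri in CrIs) {
    pct <- round(100 * cri)
    tail_prob <- (1 - cri) / 2
    bounds <- unname(quantile(samples, c(tail_prob, 1 - tail_prob)))
    out[paste0("lower_", pct)] <- bounds[1]
    out[paste0("upper_", pct)] <- bounds[2]
  }
  out
}

set.seed(1)
samples <- rnorm(1000, mean = 1, sd = 0.1)  # stand-in for posterior draws
cri_summary <- summarise_cris(samples)
print(round(cri_summary, 3))
```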

zero_threshold

[Experimental] Numeric, defaults to 50. Case threshold used to decide whether a reported zero is meaningful. If the average number of cases over the last 7 days is above this threshold, the zero is replaced with the backwards-looking rolling average. If set to infinity then no changes are made.
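
The rule described above can be sketched in base R as follows. This is an illustration of the description rather than the package's implementation: the helper replace_zeros is hypothetical, and it assumes the rolling average is taken over the (up to) 7 days preceding the zero and rounded to an integer.

```r
# Hypothetical helper (not the package's implementation): replace zeros that
# occur while the preceding rolling average is above the threshold.
replace_zeros <- function(confirm, zero_threshold = 50, window = 7) {
  out <- confirm
  for (i in seq_along(confirm)) {
    if (confirm[i] == 0 && i > 1) {
      # backwards-looking mean over up to `window` preceding days
      prev <- confirm[max(1, i - window):(i - 1)]
      avg <- mean(prev)
      if (avg > zero_threshold) out[i] <- as.integer(round(avg))
    }
  }
  out
}

high <- c(rep(100L, 7), 0L, 100L)  # zero during high incidence
low <- c(rep(10L, 7), 0L, 10L)     # zero during low incidence

print(replace_zeros(high))                        # zero replaced by the average
print(replace_zeros(low))                         # zero kept, average below threshold
print(replace_zeros(high, zero_threshold = Inf))  # no changes made
```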

id

A character string used to assign logging information on error. Used by regional_epinow to assign errors to regions. Alter the default to run with error catching.

verbose

Logical, defaults to TRUE when used interactively and otherwise FALSE. Should verbose debug progress messages be printed? Corresponds to the "DEBUG" level from futile.logger. See setup_logging for more detailed logging options.

See also

epinow, regional_epinow, forecast_infections, simulate_infections

Examples

# \donttest{
# set number of cores to use
options(mc.cores = ifelse(interactive(), 4, 1))
# get example case counts
reported_cases <- example_confirmed[1:60]

# set up example generation time
generation_time <- get_generation_time(disease = "SARS-CoV-2", source = "ganyani")
# set delays between infection and case report
incubation_period <- get_incubation_period(disease = "SARS-CoV-2", source = "lauer")
reporting_delay <- list(
  mean = convert_to_logmean(2, 1), mean_sd = 0.1,
  sd = convert_to_logsd(2, 1), sd_sd = 0.1, max = 10
)

# default setting
# here we assume that the observed data is truncated by the same delay as
def <- estimate_infections(reported_cases,
  generation_time = generation_time,
  delays = delay_opts(incubation_period, reporting_delay),
  rt = rt_opts(prior = list(mean = 2, sd = 0.1)),
  stan = stan_opts(control = list(adapt_delta = 0.95))
)
# real time estimates
summary(def)
#>                                  measure               estimate
#> 1: New confirmed cases by infection date    2230 (1139 -- 3993)
#> 2:        Expected change in daily cases      Likely decreasing
#> 3:            Effective reproduction no.     0.87 (0.61 -- 1.1)
#> 4:                        Rate of growth -0.037 (-0.11 -- 0.04)
#> 5:          Doubling/halving time (days)       -19 (17 -- -6.1)
# summary plot
plot(def)


# decreasing the accuracy of the approximate Gaussian process to speed up computation.
# These settings are an area of active research. See ?gp_opts for details.
agp <- estimate_infections(reported_cases,
  generation_time = generation_time,
  delays = delay_opts(incubation_period, reporting_delay),
  rt = rt_opts(prior = list(mean = 2, sd = 0.1)),
  gp = gp_opts(ls_min = 10, basis_prop = 0.1),
  stan = stan_opts(control = list(adapt_delta = 0.95))
)
summary(agp)
#>                                  measure                estimate
#> 1: New confirmed cases by infection date     2306 (1151 -- 4185)
#> 2:        Expected change in daily cases       Likely decreasing
#> 3:            Effective reproduction no.      0.88 (0.61 -- 1.2)
#> 4:                        Rate of growth -0.032 (-0.12 -- 0.041)
#> 5:          Doubling/halving time (days)          -21 (17 -- -6)
plot(agp)


# Adjusting for future susceptible depletion
dep <- estimate_infections(reported_cases,
  generation_time = generation_time,
  delays = delay_opts(incubation_period, reporting_delay),
  rt = rt_opts(
    prior = list(mean = 2, sd = 0.1),
    pop = 1000000, future = "latest"
  ),
  gp = gp_opts(ls_min = 10, basis_prop = 0.1), horizon = 21,
  stan = stan_opts(control = list(adapt_delta = 0.95))
)
plot(dep)


# Adjusting for truncation of the most recent data
# See estimate_truncation for an approach to estimating this from data
trunc_dist <- list(
  mean = convert_to_logmean(0.5, 0.5), mean_sd = 0.1,
  sd = convert_to_logsd(0.5, 0.5), sd_sd = 0.1,
  max = 3
)
trunc <- estimate_infections(reported_cases,
  generation_time = generation_time,
  delays = delay_opts(incubation_period, reporting_delay),
  truncation = trunc_opts(trunc_dist),
  rt = rt_opts(prior = list(mean = 2, sd = 0.1)),
  gp = gp_opts(ls_min = 10, basis_prop = 0.1),
  stan = stan_opts(control = list(adapt_delta = 0.95))
)
plot(trunc)


# using back calculation (combined here with under reporting)
# this model is on the order of 10 to 100 times faster than the Gaussian process method
# it is likely robust for retrospective Rt but less reliable for real time estimates
# the width of the prior window controls the reliance on observed data and can be
# optionally switched off using backcalc_opts(prior = "none"), see ?backcalc_opts for
# other options
backcalc <- estimate_infections(reported_cases,
  generation_time = generation_time,
  delays = delay_opts(incubation_period, reporting_delay),
  rt = NULL, backcalc = backcalc_opts(),
  obs = obs_opts(scale = list(mean = 0.4, sd = 0.05)),
  horizon = 0
)
plot(backcalc)


# Rt projected into the future using the Gaussian process
project_rt <- estimate_infections(reported_cases,
  generation_time = generation_time,
  delays = delay_opts(incubation_period, reporting_delay),
  rt = rt_opts(
    prior = list(mean = 2, sd = 0.1),
    future = "project"
  )
)
plot(project_rt)


# default settings on a later snapshot of data
snapshot_cases <- example_confirmed[80:130]
snapshot <- estimate_infections(snapshot_cases,
  generation_time = generation_time,
  delays = delay_opts(incubation_period, reporting_delay),
  rt = rt_opts(prior = list(mean = 1, sd = 0.1))
)
plot(snapshot)


# stationary Rt assumption (likely to provide biased real-time estimates)
stat <- estimate_infections(reported_cases,
  generation_time = generation_time,
  delays = delay_opts(incubation_period, reporting_delay),
  rt = rt_opts(prior = list(mean = 2, sd = 0.1), gp_on = "R0")
)
plot(stat)


# no Gaussian process (i.e. fixed Rt assuming no breakpoints)
fixed <- estimate_infections(reported_cases,
  generation_time = generation_time,
  delays = delay_opts(incubation_period, reporting_delay),
  gp = NULL
)
plot(fixed)


# no delays
no_delay <- estimate_infections(reported_cases, generation_time = generation_time)
plot(no_delay)


# break point but otherwise static Rt
bp_cases <- data.table::copy(reported_cases)
bp_cases <- bp_cases[, breakpoint := ifelse(date == as.Date("2020-03-16"), 1, 0)]
bkp <- estimate_infections(bp_cases,
  generation_time = generation_time,
  delays = delay_opts(incubation_period, reporting_delay),
  rt = rt_opts(prior = list(mean = 2, sd = 0.1)),
  gp = NULL
)
# break point effect
summary(bkp, type = "parameters", params = "breakpoints")
#>    date    variable strat type     median       mean         sd lower_90
#> 1: <NA> breakpoints     1 <NA> -0.6606923 -0.6608357 0.02712925 -0.70757
#>      lower_50   lower_20   upper_20   upper_50   upper_90
#> 1: -0.6786426 -0.6672251 -0.6538984 -0.6418164 -0.6170302
plot(bkp)


# weekly random walk
rw <- estimate_infections(reported_cases,
  generation_time = generation_time,
  delays = delay_opts(incubation_period, reporting_delay),
  rt = rt_opts(prior = list(mean = 2, sd = 0.1), rw = 7),
  gp = NULL
)

# random walk effects
summary(rw, type = "parameters", params = "breakpoints")
#>    date    variable strat type      median        mean         sd   lower_90
#> 1: <NA> breakpoints     1 <NA> -0.13409608 -0.13314217 0.06779037 -0.2400401
#> 2: <NA> breakpoints     2 <NA> -0.14472360 -0.14478126 0.07749732 -0.2692223
#> 3: <NA> breakpoints     3 <NA> -0.21479129 -0.21758278 0.08211905 -0.3580389
#> 4: <NA> breakpoints     4 <NA> -0.28827624 -0.29315359 0.09051555 -0.4500667
#> 5: <NA> breakpoints     5 <NA> -0.05795192 -0.05901787 0.09633081 -0.2118880
#> 6: <NA> breakpoints     6 <NA>  0.04884797  0.04769352 0.09340554 -0.1022220
#> 7: <NA> breakpoints     7 <NA> -0.06700372 -0.06862683 0.10548072 -0.2429219
#> 8: <NA> breakpoints     8 <NA> -0.01835025 -0.02120007 0.17050696 -0.2971596
#>       lower_50    lower_20    upper_20     upper_50    upper_90
#> 1: -0.18080498 -0.15215844 -0.11665456 -0.088244270 -0.02161150
#> 2: -0.19556506 -0.16275740 -0.12661684 -0.095132252 -0.01761154
#> 3: -0.26967484 -0.23552996 -0.19710036 -0.163959202 -0.08184366
#> 4: -0.35104344 -0.31178402 -0.26863997 -0.230931653 -0.15097147
#> 5: -0.12692648 -0.08581198 -0.03441650  0.004459652  0.09611647
#> 6: -0.01329567  0.02435766  0.07069061  0.106418507  0.20341203
#> 7: -0.13490572 -0.09270474 -0.04267370  0.001941989  0.09928655
#> 8: -0.13574712 -0.05991371  0.02440390  0.089000369  0.25225013
plot(rw)

# }
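
The reporting_delay and trunc_dist lists in the examples pass a natural-scale mean and standard deviation through convert_to_logmean() and convert_to_logsd() to obtain log-normal parameters. The sketch below shows the standard moment-matching formulas for that conversion, checked by mapping back to the natural scale. The helpers to_logmean and to_logsd are hypothetical stand-ins, and it is an assumption (not confirmed here) that the package functions compute exactly the same quantities.

```r
# Hypothetical stand-ins, assumed to mirror convert_to_logmean() and
# convert_to_logsd(): standard log-normal moment matching.
to_logmean <- function(mean, sd) log(mean^2 / sqrt(sd^2 + mean^2))
to_logsd <- function(mean, sd) sqrt(log(1 + sd^2 / mean^2))

mu <- to_logmean(2, 1)
sigma <- to_logsd(2, 1)

# Recover the natural-scale moments from the log-scale parameters
natural_mean <- exp(mu + sigma^2 / 2)
natural_sd <- sqrt((exp(sigma^2) - 1) * exp(2 * mu + sigma^2))
print(c(mean = natural_mean, sd = natural_sd))
```

Round-tripping in this way is a quick check that a delay specified on the log scale corresponds to the intended mean and spread in days.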