estimate_infections.RdThis function uses a non-parametric approach to reconstruct cases by date of infection from reported cases. It can optionally then estimate the time-varying reproduction number and the rate of growth.
estimate_infections( reported_cases, model, samples = 1000, stan_args, method = "exact", family = "negbin", generation_time, delays, horizon = 7, gp = list(basis_prop = 0.3, boundary_scale = 2, lengthscale_mean = 0, lengthscale_sd = 2), rt_prior = list(mean = 1, sd = 1), estimate_rt = TRUE, estimate_week_eff = TRUE, estimate_breakpoints = FALSE, stationary = FALSE, fixed = FALSE, fixed_future_rt = FALSE, burn_in = 0, prior_smoothing_window = 7, future = FALSE, max_execution_time = Inf, return_fit = FALSE, verbose = FALSE )
| reported_cases | A data frame of confirmed cases (confirm) by date (date). confirm must be integer and date must be in date format. |
|---|---|
| model | A compiled stan model. By default uses the internal package model. |
| samples | Numeric, defaults to 1000. Number of samples post warmup. |
| stan_args | A list of stan arguments to be passed to |
| method | A character string defaults to "exact". Also accepts "approximate". Indicates the fitting method to be used this can either be "exact" (NUTs sampling) or "approximate" (variational inference). The exact approach returns samples from the posterior whilst the approximate method returns approximate samples. The approximate method is likely to return results several order of magnitudes faster than the exact method. |
| family | A character string indicating the reporting model to use. Defaults to negative binomial ("negbin") with poisson ("poisson") also supported. |
| generation_time | A list containing the mean, standard deviation of the mean (mean_sd), standard deviation (sd), standard deviation of the standard deviation and the maximum allowed value for the generation time (assuming a gamma distribution). |
| delays | A list of delays (i.e incubation period/reporting delay) between infection and report. Each list entry must also be a list containing the mean, standard deviation of the mean (mean_sd), standard deviation (sd), standard deviation of the standard deviation and the maximum allowed value for the that delay (assuming a lognormal distribution with all parameters excepting the max allowed value on the log scale). |
| horizon | Numeric, defaults to 7. Number of days into the future to forecast. |
| gp | List controlling the Gaussian process approximation. Must contain
the |
| rt_prior | A list contain the mean and standard deviation (sd) of the gamma distributed prior for Rt. By default this is assumed to be mean 1 with a standard deviation of 1. |
| estimate_rt | Logical, defaults TRUE. Should Rt be estimated when imputing infections. |
| estimate_week_eff | Logical, defaults TRUE. Should weekly reporting effects be estimated. |
| estimate_breakpoints | Logical, defaults to FALSE. Should breakpoints in Rt be estimated. If true then |
| stationary | Logical, defaults to FALSE. Should Rt be estimated with a global mean. When estimating Rt this should substantially improve run times but will revert to the global average for real time and forecasted estimates. This setting is most appropriate when estimating historic Rt or when combined with breakpoints. |
| fixed | Logical, defaults to FALSE. If TRUE then a Gaussian process is not used and Rt is assumed to be constant over time
(apart from any manually included breakpoints). If |
| fixed_future_rt | Logical, defaults to FALSE. IF TRUE then the estimated Rt from the last time point with data is used for all future time points without data. |
| burn_in | Numeric, defaults to 0. The number of initial estimates to discard. This argument may be used to reduce
spurious findings when running |
| prior_smoothing_window | Numeric defaults to 7. The number of days over which to take a rolling average for the prior based on reported cases. |
| future | Logical, defaults to |
| max_execution_time | Numeric, defaults to Inf. If set will kill off processing of each chain if not finished within the specified timeout. When more than 2 chains finish successfully estimates will still be returned. If less than 2 chains return within the allowed time then estimation will fail with an informative error. |
| return_fit | Logical, defaults to FALSE. Should the fitted stan model be returned. |
| verbose | Logical, defaults to |
# \donttest{ # Get example case counts reported_cases <- EpiNow2::example_confirmed[1:50] # Add a dummy breakpoint (used only when optionally estimating breakpoints) reported_cases <- reported_cases[, breakpoint := data.table::fifelse(date == as.Date("2020-03-16"), 1, 0)] # Set up example generation time generation_time <- get_generation_time(disease = "SARS-CoV-2", source = "ganyani") # Set delays between infection and case report incubation_period <- get_incubation_period(disease = "SARS-CoV-2", source = "lauer") reporting_delay <- list(mean = log(5), mean_sd = log(2), sd = log(2), sd_sd = log(1.5), max = 30) # Run model with default settings def <- estimate_infections(reported_cases, generation_time = generation_time, delays = list(incubation_period, reporting_delay), stan_args = list(warmup = 200, cores = ifelse(interactive(), 4, 1))) # Plot output plots <- report_plots(summarised_estimates = def$summarised, reported = reported_cases) plots$summary# Run the model using the approximate method (variational inference) approx <- estimate_infections(reported_cases, generation_time = generation_time, delays = list(incubation_period, reporting_delay), method = "approximate") # Plot output plots <- report_plots(summarised_estimates = approx$summarised, reported = reported_cases) plots$summary# Run the model with default settings using the future backend ## (combine with a call to future::plan to make this parallel). def_future <- estimate_infections(reported_cases, generation_time = generation_time, delays = list(incubation_period, reporting_delay), stan_args = list(warmup = 200, cores = ifelse(interactive(), 4, 1))) plots <- report_plots(summarised_estimates = def_future$summarised, reported = reported_cases) plots$summary# Run model with Rt fixed into the future fixed_rt <- estimate_infections(reported_cases, generation_time = generation_time, delays = list(incubation_period, reporting_delay), stan_args = list(warmup = 200, cores = ifelse(interactive(), 4, 1)), fixed_future_rt = TRUE) # Plot output plots <- report_plots(summarised_estimates = fixed_rt$summarised, reported = reported_cases) plots$summary# Run the model with default settings on a later snapshot of # data (use burn_in here to remove the first week of # estimates that may be impacted by this most). snapshot_cases <- EpiNow2::example_confirmed[80:130] snapshot <- estimate_infections(reported_cases, generation_time = generation_time, delays = list(incubation_period, reporting_delay), stan_args = list(warmup = 200, cores = ifelse(interactive(), 4, 1)), burn_in = 7) # Plot output plots <- report_plots(summarised_estimates = snapshot$summarised, reported = snapshot_cases) plots$summary## Run model with stationary Rt assumption (likely to provide biased real-time estimates) stat <- estimate_infections(reported_cases, generation_time = generation_time, delays = list(incubation_period, reporting_delay), stan_args = list(warmup = 200, cores = ifelse(interactive(), 4, 1)), stationary = TRUE) # Plot output plots <- report_plots(summarised_estimates = stat$summarised, reported = reported_cases) plots$summary# Run model with fixed Rt assumption fixed <- estimate_infections(reported_cases, generation_time = generation_time, delays = list(incubation_period, reporting_delay), stan_args = list(warmup = 200, cores = ifelse(interactive(), 4, 1)), fixed = TRUE) # Plot output plots <- report_plots(summarised_estimates = fixed$summarised, reported = reported_cases) plots$summary# Run model with no delays no_delay <- estimate_infections(reported_cases, generation_time = generation_time, stan_args = list(warmup = 200, cores = ifelse(interactive(), 4, 1))) # Plot output plots <- report_plots(summarised_estimates = no_delay$summarised, reported = reported_cases) plots$summary# Run model with breakpoints bkp <- estimate_infections(reported_cases, generation_time = generation_time, delays = list(incubation_period, reporting_delay), stan_args = list(warmup = 200, cores = ifelse(interactive(), 4, 1)), estimate_breakpoints = TRUE) # Plot output plots <- report_plots(summarised_estimates = bkp$summarised, reported = reported_cases) plots$summary# Run model with breakpoints but with constrained non-linear change over time # This formulation may increase the apparent effect of the breakpoint but needs to be tested using # model fit criteria (i.e LFO). cbkp <- estimate_infections(reported_cases, generation_time = generation_time, delays = list(incubation_period, reporting_delay), gp = list(basis_prop = 0.3, boundary_scale = 2, lengthscale_mean = 20, lengthscale_sd = 1), stan_args = list(warmup = 200, cores = ifelse(interactive(), 4, 1)), estimate_breakpoints = TRUE) # Plot output plots <- report_plots(summarised_estimates = cbkp$summarised, reported = reported_cases) plots$summary# Pull out breakpoint summary cbkp$summarised[variable == "breakpoints"]#> date variable strat type bottom top lower upper #> 1: <NA> breakpoints 1 <NA> 0.803358 1.10491 0.879774 0.9998565 #> central_lower central_upper median mean sd #> 1: 0.9333897 0.977326 0.9445308 0.9512294 0.09313243# Run model with breakpoints but otherwise static Rt # This formulation may increase the apparent effect of the breakpoint but needs to be tested using # model fit criteria (i.e LFO). fbkp <- estimate_infections(reported_cases, generation_time = generation_time, delays = list(incubation_period, reporting_delay), stan_args = list(warmup = 200, cores = ifelse(interactive(), 4, 1)), estimate_breakpoints = TRUE, fixed = TRUE) # Plot output plots <- report_plots(summarised_estimates = fbkp$summarised, reported = reported_cases) plots$summary# Pull out breakpoint summary fbkp$summarised[variable == "breakpoints"]#> date variable strat type bottom top lower upper #> 1: <NA> breakpoints 1 <NA> 0.5176252 0.6102732 0.5371678 0.5731114 #> central_lower central_upper median mean sd #> 1: 0.5554257 0.5675351 0.5606624 0.562662 0.02905336# Run model without Rt estimation (just backcalculation) backcalc <- estimate_infections(reported_cases, generation_time = generation_time, delays = list(incubation_period, reporting_delay), stan_args = list(warmup = 200, cores = ifelse(interactive(), 4, 1)), estimate_rt = FALSE) # plot just infections as report_plots does not support the backcalculation only model plot_estimates(estimate = backcalc$summarised[variable == "infections"], reported = reported_cases, ylab = "Cases")# }