
[Maturing] Runs epinow() efficiently across multiple regions and conducts basic data checks and cleaning, such as removing regions with fewer than non_zero_points time points with non-zero cases, as these are unlikely to produce reasonable results whilst consuming significant resources. See the documentation for epinow for further information.
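The kind of data check described above can be sketched in base R. This is only an illustration under assumed toy data (the regions "bigland" and "smallland" and their counts are made up), not the package's internal cleaning code.

```r
# Toy data: "smallland" has only one date with non-zero cases
reported_cases <- data.frame(
  date = rep(as.Date("2020-04-01") + 0:4, times = 2),
  region = rep(c("bigland", "smallland"), each = 5),
  confirm = c(3, 0, 5, 2, 4, 0, 0, 1, 0, 0),
  stringsAsFactors = FALSE
)
non_zero_points <- 2

# Count the dates with non-zero cases in each region
n_non_zero <- tapply(reported_cases$confirm > 0, reported_cases$region, sum)

# Keep only regions that reach the threshold
keep <- names(n_non_zero)[n_non_zero >= non_zero_points]
cleaned <- reported_cases[reported_cases$region %in% keep, ]

unique(cleaned$region)
```

With these toy counts only "bigland" remains, since "smallland" has a single non-zero time point.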

By default all arguments supporting input from _opts() functions are shared across regions (including delays, truncation, Rt settings, stan settings, and gaussian process settings). Region specific settings are supported by passing a named list of _opts() calls (with an entry per region) to the relevant argument. A helper function (opts_list) is available to facilitate building this list.
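As a rough illustration of the named-list form described above, the sketch below builds per-region Rt settings by hand, using only rt_opts() and its rw argument as they appear elsewhere on this page. The region names are taken from the examples; opts_list() is the documented helper for building the same kind of list with less typing.

```r
library(EpiNow2)

# Region names as used in the examples on this page
regions <- c("testland", "realland")

# Start with the same default settings for every region ...
rt <- rep(list(rt_opts()), length(regions))
names(rt) <- regions

# ... then override a single region with a weekly random walk
rt$realland <- rt_opts(rw = 7)

# One entry per region, ready to pass as `rt = rt` to regional_epinow()
names(rt)
```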

Regions can be estimated in parallel using the {future} package (see setup_future). The progress of producing estimates across multiple regions is tracked using the {progressr} package. Modify this behaviour using progressr::handlers and enable progress reporting in batch (non-interactive) sessions by setting R_PROGRESSR_ENABLE=TRUE as an environment variable.
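A minimal sketch of the session setup this paragraph describes is given below. It assumes that setting a plan directly with future::plan() is an acceptable stand-in for setup_future(), whose arguments are not described on this page, and it uses two workers and the "txtprogressbar" handler purely as example choices.

```r
# Ask progressr to report progress in batch sessions too. This is usually
# set in the shell or in .Renviron so it is in place before progressr loads.
Sys.setenv(R_PROGRESSR_ENABLE = "TRUE")

# Choose how progress is displayed (a plain text progress bar here)
if (requireNamespace("progressr", quietly = TRUE)) {
  progressr::handlers("txtprogressbar")
}

# Run regions in parallel R sessions, restoring the previous plan afterwards
if (requireNamespace("future", quietly = TRUE)) {
  old_plan <- future::plan(future::multisession, workers = 2)
  # ... call regional_epinow() here ...
  future::plan(old_plan)
}
```

Each worker fits its own model, so the number of workers multiplied by the cores used per model should stay within what the machine has available.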

Usage

regional_epinow(
  reported_cases,
  generation_time,
  delays = delay_opts(),
  truncation = trunc_opts(),
  rt = rt_opts(),
  backcalc = backcalc_opts(),
  gp = gp_opts(),
  obs = obs_opts(),
  stan = stan_opts(),
  horizon = 7,
  CrIs = c(0.2, 0.5, 0.9),
  target_folder = NULL,
  target_date,
  non_zero_points = 2,
  output = c("regions", "summary", "samples", "plots", "latest"),
  return_output = FALSE,
  summary_args = list(),
  verbose = FALSE,
  logs = tempdir(check = TRUE),
  ...
)

Arguments

reported_cases

A data frame of confirmed cases (confirm) by date (date), and region (region).

generation_time

A call to generation_time_opts() defining the generation time distribution used. For backwards compatibility a list of summary parameters can also be passed.

delays

A call to delay_opts() defining delay distributions and options. See the documentation of delay_opts() and the examples below for details.

truncation

A call to trunc_opts() defining the truncation of observed data. Defaults to trunc_opts(). See estimate_truncation() for an approach to estimating truncation from data.

rt

A list of options as generated by rt_opts() defining Rt estimation. Defaults to rt_opts(). Set to NULL to switch to using back calculation rather than generating infections using Rt.

backcalc

A list of options as generated by backcalc_opts() to define the back calculation. Defaults to backcalc_opts().

gp

A list of options as generated by gp_opts() to define the Gaussian process. Defaults to gp_opts(). Set to NULL to disable the Gaussian process.

obs

A list of options as generated by obs_opts() defining the observation model. Defaults to obs_opts().

stan

A list of stan options as generated by stan_opts(). Defaults to stan_opts(). Can be used to override data, init, and verbose settings if desired.

horizon

Numeric, defaults to 7. Number of days into the future to forecast.

CrIs

Numeric vector of credible intervals to calculate.

target_folder

Character string specifying where to save results (will create if not present).

target_date

Date, defaults to maximum found in the data if not specified.

non_zero_points

Numeric, the minimum number of time points with non-zero cases in a region required for that region to be evaluated. Defaults to 2.

output

A character vector of optional output to return. Supported options are the individual regional estimates ("regions"), samples ("samples"), plots ("plots"), copying the individual region dated folder into a latest folder (if target_folder is not null, set using "latest"), the stan fit of the underlying model ("fit"), and an overall summary across regions ("summary"). The default is to return samples and plots alongside summarised estimates and summary statistics. If target_folder is not NULL then the default is also to copy all results into a latest folder.

return_output

Logical, defaults to FALSE. Should output be returned? This is automatically set to TRUE if no directory for saving is specified.

summary_args

A list of arguments passed to regional_summary. See the regional_summary documentation for details.

verbose

Logical, defaults to FALSE. Outputs verbose progress messages to the console from epinow.

logs

Character path indicating the target folder in which to store log information. Defaults to the temporary directory if not specified. Default logging can be disabled if logs is set to NULL. If specifying a custom logging setup then the code for setup_default_logging and the setup_logging function are a sensible place to start.

...

Pass additional arguments to epinow. See the documentation for epinow for details.

Value

A list of output stratified at the top level into regional output and across-region summary output.

See also

epinow, estimate_infections, forecast_infections

setup_future, regional_summary

Examples

# \donttest{
# set number of cores to use
old_opts <- options()
options(mc.cores = ifelse(interactive(), 4, 1))

# construct example distributions
generation_time <- get_generation_time(
  disease = "SARS-CoV-2", source = "ganyani"
)
incubation_period <- get_incubation_period(
  disease = "SARS-CoV-2", source = "lauer"
)
reporting_delay <- dist_spec(
  mean = convert_to_logmean(2, 1),
  mean_sd = 0.1,
  sd = convert_to_logsd(2, 1),
  sd_sd = 0.1, max = 15
)

# uses example case vector
cases <- example_confirmed[1:60]
cases <- data.table::rbindlist(list(
  data.table::copy(cases)[, region := "testland"],
  cases[, region := "realland"]
))

# run epinow across multiple regions and generate summaries
# samples and warmup have been reduced for this example
def <- regional_epinow(
  reported_cases = cases,
  generation_time = generation_time_opts(generation_time),
  delays = delay_opts(incubation_period + reporting_delay),
  rt = rt_opts(prior = list(mean = 2, sd = 0.2)),
  stan = stan_opts(
    samples = 100, warmup = 200,
    control = list(adapt_delta = 0.95)
  ),
  verbose = interactive()
)
#> INFO [2023-09-26 15:58:00] Producing following optional outputs: regions, summary, samples, plots, latest
#> Logging threshold set at INFO for the EpiNow2 logger
#> Writing EpiNow2 logs to the console and: /tmp/RtmpH53zkW/regional-epinow/2020-04-21.log
#> Logging threshold set at INFO for the EpiNow2.epinow logger
#> Writing EpiNow2.epinow logs to: /tmp/RtmpH53zkW/epinow/2020-04-21.log
#> INFO [2023-09-26 15:58:00] Reporting estimates using data up to: 2020-04-21
#> INFO [2023-09-26 15:58:00] No target directory specified so returning output
#> INFO [2023-09-26 15:58:00] Producing estimates for: testland, realland
#> INFO [2023-09-26 15:58:00] Regions excluded: none
#> INFO [2023-09-26 16:00:48] Completed estimates for: testland
#> INFO [2023-09-26 16:03:16] Completed estimates for: realland
#> INFO [2023-09-26 16:03:16] Completed regional estimates
#> INFO [2023-09-26 16:03:16] Regions with estimates: 2
#> INFO [2023-09-26 16:03:16] Regions with runtime errors: 0
#> INFO [2023-09-26 16:03:16] Producing summary
#> INFO [2023-09-26 16:03:16] No summary directory specified so returning summary output
#> INFO [2023-09-26 16:03:16] No target directory specified so returning timings

# apply a different rt method per region
# (here a gaussian process and a weekly random walk)
gp <- opts_list(gp_opts(), cases)
gp <- update_list(gp, list(realland = NULL))
rt <- opts_list(rt_opts(), cases, realland = rt_opts(rw = 7))
region_rt <- regional_epinow(
  reported_cases = cases,
  generation_time = generation_time_opts(generation_time),
  delays = delay_opts(incubation_period + reporting_delay),
  rt = rt, gp = gp,
  stan = stan_opts(
    samples = 100, warmup = 200,
    control = list(adapt_delta = 0.95)
  ),
  verbose = interactive()
)
#> INFO [2023-09-26 16:03:16] Producing following optional outputs: regions, summary, samples, plots, latest
#> Logging threshold set at INFO for the EpiNow2 logger
#> Writing EpiNow2 logs to the console and: /tmp/RtmpH53zkW/regional-epinow/2020-04-21.log
#> Logging threshold set at INFO for the EpiNow2.epinow logger
#> Writing EpiNow2.epinow logs to: /tmp/RtmpH53zkW/epinow/2020-04-21.log
#> INFO [2023-09-26 16:03:16] Reporting estimates using data up to: 2020-04-21
#> INFO [2023-09-26 16:03:16] No target directory specified so returning output
#> INFO [2023-09-26 16:03:16] Producing estimates for: testland, realland
#> INFO [2023-09-26 16:03:16] Regions excluded: none
#> INFO [2023-09-26 16:05:58] Completed estimates for: testland
#> INFO [2023-09-26 16:06:35] Completed estimates for: realland
#> INFO [2023-09-26 16:06:35] Completed regional estimates
#> INFO [2023-09-26 16:06:35] Regions with estimates: 2
#> INFO [2023-09-26 16:06:35] Regions with runtime errors: 0
#> INFO [2023-09-26 16:06:35] Producing summary
#> INFO [2023-09-26 16:06:35] No summary directory specified so returning summary output
#> INFO [2023-09-26 16:06:35] No target directory specified so returning timings

options(old_opts)
# }