Fit a delay distribution accounting for primary and secondary
event censoring (double interval censoring) and right
truncation.
Estimation is done via MCMC using a Stan model that vendors
likelihood functions from the
primarycensored
package.
For more flexible delay distribution modelling (e.g.
time-varying delays, partial pooling, or regression on
covariates), see the
epidist package.
If you use this function, please cite
primarycensored in
addition to EpiNow2.
Usage
estimate_dist(
  data,
  dist = "lognormal",
  priors = switch(dist,
    lognormal = list(meanlog = Normal(1, 1), sdlog = Normal(0.5, 0.5)),
    gamma = list(shape = Normal(2, 2), rate = Normal(0.5, 0.5)),
    normal = list(mean = Normal(5, 5), sd = Normal(1, 1)),
    exp = list(rate = Normal(0.5, 0.5)),
    weibull = list(shape = Normal(2, 2), scale = Normal(5, 5))
  ),
  primary = "uniform",
  primary_params = numeric(0),
  stan = stan_opts(),
  max_value = NULL,
  obs_time_threshold = 2,
  verbose = FALSE
)
Arguments
- data
A data.frame with date columns:
pdate_lwr (required): lower bound of primary event date
pdate_upr (optional): upper bound of primary event date (default: pdate_lwr + 1)
sdate_lwr (required): lower bound of secondary event date
sdate_upr (optional): upper bound of secondary event date (default: sdate_lwr + 1)
obs_date (optional): observation/censoring date (default: max(sdate_upr))
n (optional): observation count/weight (default: 1)
- dist
Character string, which distribution to fit. One of
"lognormal" (default), "gamma", "normal", "exp", or "weibull".
- priors
A list of
<dist_spec> objects specifying priors for the distribution parameters. Names must match the parameters of the chosen distribution. Defaults depend on dist:
lognormal: list(meanlog = Normal(1, 1), sdlog = Normal(0.5, 0.5))
gamma: list(shape = Normal(2, 2), rate = Normal(0.5, 0.5))
normal: list(mean = Normal(5, 5), sd = Normal(1, 1))
exp: list(rate = Normal(0.5, 0.5))
weibull: list(shape = Normal(2, 2), scale = Normal(5, 5))
- primary
Character string specifying the primary event distribution. One of:
"uniform" (default): uniform distribution over the primary window
"expgrowth": exponential growth distribution. Requires primary_params to supply a fixed growth rate.
- primary_params
Numeric vector of parameters for the primary distribution. Only used when
primary = "expgrowth", in which case it should be a single numeric value for the growth rate. The growth rate is passed as fixed data to Stan, not estimated.
- stan
A list of stan options as generated by
stan_opts(). Defaults to stan_opts(). Can be used to override data, init, and verbose settings if desired.
- max_value
Numeric, maximum delay value for PMF. If not provided, inferred from data.
- obs_time_threshold
Numeric, multiplier for the obs-time-to-Inf heuristic. Observations where
relative_obs_time > max(delay_upr) * obs_time_threshold are treated as untruncated. Default 2, following epidist. Set to Inf to disable.
- verbose
Logical, print progress messages? Defaults to FALSE.
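To make the two choices for primary more concrete, the following base R sketch writes out a density for the primary event time within its window under exponential growth. The helper name dexpgrowth_sketch and the growth rate of 0.2 are hypothetical, and the formula is the standard truncated exponential; that it matches the vendored Stan code term for term is an assumption.

```r
# Density of the primary event time within a window of width `pwindow`,
# assuming exponential growth at rate `r` (hypothetical helper, base R only).
dexpgrowth_sketch <- function(t, r, pwindow = 1) {
  inside <- t >= 0 & t <= pwindow
  if (abs(r) < 1e-10) {
    # As r approaches 0 the density reduces to the uniform case.
    dens <- rep(1 / pwindow, length(t))
  } else {
    dens <- r * exp(r * t) / (exp(r * pwindow) - 1)
  }
  ifelse(inside, dens, 0)
}

r <- 0.2  # illustrative growth rate per day, not a recommendation

# The density integrates to 1 over the primary window.
total <- integrate(function(t) dexpgrowth_sketch(t, r = r), 0, 1)$value

# Events at the end of the window are exp(r) times as likely as at the start.
ratio <- dexpgrowth_sketch(1, r) / dexpgrowth_sketch(0, r)
```

With a positive growth rate, primary events are weighted towards the end of the window, which is why a fixed rate has to be supplied through primary_params when primary = "expgrowth".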
Value
An <estimate_dist> object (inheriting from
<epinowfit>) with components:
- fit
The Stan fit object.
- args
The Stan data list used for fitting.
- data
The input data.
Use get_parameters() to extract the fitted <dist_spec>.
Details
The model fits an interval-censored delay distribution while accounting for:
Primary event censoring (e.g., daily reporting of exposure)
Secondary event censoring (e.g., daily reporting of symptom onset)
Right truncation (observation window effects)
Per-observation truncation times (via
obs_date)
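One common way to write the likelihood contribution that combines the four components above is sketched below. The notation is an assumption introduced here for illustration: F is the CDF of the delay distribution, f_P is the density of the primary event time within a primary window of width w_P, [d_lwr, d_upr) is the observed delay interval, and D is the relative observation time.

```latex
F_{\mathrm{cens}}(d) = \int_{0}^{w_P} F(d - p)\, f_P(p)\, \mathrm{d}p

\Pr\left(d_{\mathrm{lwr}} \le \text{delay} < d_{\mathrm{upr}} \mid \text{delay} \le D\right)
  = \frac{F_{\mathrm{cens}}(d_{\mathrm{upr}}) - F_{\mathrm{cens}}(d_{\mathrm{lwr}})}{F_{\mathrm{cens}}(D)}
```

The integral handles primary event censoring, the difference in the numerator handles secondary event censoring, and the denominator adjusts for right truncation. For an untruncated observation D is infinite and the denominator equals 1.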
When a data frame with date columns is provided, observations
are aggregated by unique combinations of (delay_lwr, delay_upr, pwindow, relative_obs_time) to reduce the number
of likelihood evaluations.
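The aggregation step can be sketched in base R as follows. This is an illustration rather than the package's internal code: the definitions of the derived columns (all measured in days from pdate_lwr) are assumptions, and the toy linelist is made up.

```r
linelist <- data.frame(
  pdate_lwr = as.Date("2023-01-01") + c(0, 0, 1, 1, 1),
  sdate_lwr = as.Date("2023-01-01") + c(4, 4, 6, 6, 8)
)

# Fill the optional columns with their documented defaults.
linelist$pdate_upr <- linelist$pdate_lwr + 1
linelist$sdate_upr <- linelist$sdate_lwr + 1
linelist$obs_date <- max(linelist$sdate_upr)
linelist$n <- 1

# Derived quantities in days (assumed definitions, relative to pdate_lwr).
derived <- data.frame(
  delay_lwr = as.numeric(linelist$sdate_lwr - linelist$pdate_lwr),
  delay_upr = as.numeric(linelist$sdate_upr - linelist$pdate_lwr),
  pwindow = as.numeric(linelist$pdate_upr - linelist$pdate_lwr),
  relative_obs_time = as.numeric(linelist$obs_date - linelist$pdate_lwr),
  n = linelist$n
)

# Sum the weights over identical rows, so each unique combination
# needs only one likelihood evaluation.
aggregated <- aggregate(
  n ~ delay_lwr + delay_upr + pwindow + relative_obs_time,
  data = derived, FUN = sum
)
```

Here five observations collapse to three unique rows, with the n column carrying the number of observations each row represents.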
Observations where the relative observation time is much
larger than the maximum observed delay are treated as
untruncated (observation time set to infinity).
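A minimal base R sketch of this heuristic, using the rule stated for obs_time_threshold and made-up values, might look like this. The variable names are illustrative and not taken from the package internals.

```r
delay_upr <- c(5, 6, 8)
relative_obs_time <- c(9, 12, 40)
obs_time_threshold <- 2

# Observation times far beyond the longest observed delay imply almost no
# truncation, so they are replaced by Inf (no truncation adjustment).
cutoff <- max(delay_upr) * obs_time_threshold
obs_time <- ifelse(relative_obs_time > cutoff, Inf, relative_obs_time)
```

With these values the cutoff is 16, so only the third observation is treated as untruncated. Setting obs_time_threshold to Inf makes the cutoff infinite, so no observation is changed.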
The primarycensored Stan functions are vendored (included
in the package), so the model is pre-compiled and runs without
needing primarycensored at runtime.
References
Park SW, et al. (2024) "Estimating epidemiological delay distributions for infectious diseases." doi:10.1101/2024.01.12.24301247
Charniga K, Park SW, et al. (2024) "Best practices for estimating and reporting epidemiological delay distributions of infectious diseases." PLoS Comput Biol 20(10): e1012520. doi:10.1371/journal.pcbi.1012520
Please cite primarycensored if you use this function;
see citation("primarycensored").
See also
vignette("estimate-dist", package = "EpiNow2") for
a worked example, and
primarycensored::primarycensored-package for the underlying
censoring methodology.
Examples
# \donttest{
# Fit lognormal distribution from date-based linelist
if (requireNamespace("primarycensored", quietly = TRUE)) {
  set.seed(1)
  n <- 100
  D <- 30
  pdate_lwr <- as.Date("2023-01-01") + rpois(n, 5)
  delays_sim <- primarycensored::rprimarycensored(
    n = n, rdist = rlnorm,
    meanlog = log(5), sdlog = 0.5,
    pwindow = 1, D = D
  )
  linelist <- data.frame(
    pdate_lwr = pdate_lwr,
    sdate_lwr = pdate_lwr + delays_sim,
    obs_date = pdate_lwr + D
  )
  result <- estimate_dist(linelist, dist = "lognormal")
}
# }
