chain_sim() is a stochastic simulator that generates transmission chain data from key inputs such as the offspring distribution and the serial interval distribution.
Usage
chain_sim(
  n,
  offspring,
  stat = c("size", "length"),
  infinite = Inf,
  tree = FALSE,
  serial,
  t0 = 0,
  tf = Inf,
  ...
)
Arguments
- n
Number of simulations to run.
- offspring
Offspring distribution: a character string corresponding to the R distribution function (e.g., "pois" for Poisson, where rpois is the R function to generate Poisson random numbers).
- stat
String; statistic to calculate. Can be one of:
"size": the total number of offspring.
"length": the total number of ancestors.
- infinite
A size or length above which the simulation results should be set to Inf. Defaults to Inf, in which case no results are ever set to Inf.
- tree
Logical. Should the transmission tree be returned? Defaults to FALSE.
- serial
The serial interval generator function; a user-defined named or anonymous function with only one argument, n, representing the number of serial intervals to generate.
- t0
Start time (if serial interval is given); either a single value or a vector of length n (number of simulations) with initial times. Defaults to 0.
- tf
End time (if serial interval is given).
- ...
Parameters of the offspring distribution as required by R.
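The naming convention for offspring can be illustrated with base R alone. The sketch below assumes that the string is prefixed with "r" to find the random number generator and that the extra arguments passed through ... are forwarded to it; how chain_sim() resolves the function internally is not shown on this page and may differ. The names offspring_name and rng are hypothetical.

```r
# Illustration only: map a distribution string to its base R generator.
offspring_name <- "pois"                       # hypothetical variable name
rng <- match.fun(paste0("r", offspring_name))  # looks up the function rpois

# Parameters such as lambda are the ones rpois itself requires.
set.seed(1)
draws <- rng(5, lambda = 0.5)
draws
```

The same pattern applies to any distribution with an r-prefixed generator, for example "nbinom" with the size and mu parameters that rnbinom expects.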
Value
Either:
- A vector of sizes/lengths (if tree == FALSE OR serial interval function not specified, since that implies tree == FALSE), or
- a data frame with columns n (simulation ID), time (if the serial interval is given) and, if tree == TRUE, id (a unique ID within each simulation for each individual element of the chain), ancestor (the ID of the ancestor of each element), and generation.
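When the data frame is returned, per-simulation summaries can be recovered from it with base R. The sketch below copies the n, id and generation columns from the example output on this page into a hypothetical data frame named tree_df, then counts rows and takes the largest generation for each simulation. Treating the row count as the chain size is an assumption made for illustration, not documented behaviour.

```r
# Columns copied from the example output on this page (17 rows).
tree_df <- data.frame(
  n = c(1, 2, 3, 4, 5, 2, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5),
  id = c(1, 1, 1, 1, 1, 2, 2, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11),
  generation = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 4, 4, 5, 5, 6, 6, 6)
)

# Number of individuals recorded in each simulation (assumed chain size).
rows_per_sim <- tapply(tree_df$id, tree_df$n, length)

# Largest generation reached in each simulation.
max_generation <- tapply(tree_df$generation, tree_df$n, max)

rows_per_sim
max_generation
```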
Details
chain_sim() returns either a vector or a data.frame. If serial is not provided, tree is automatically set to FALSE and the output is a vector. If serial is provided as a function, tree is automatically set to TRUE and the output is a data.frame. Conversely, setting tree = TRUE requires providing a function for serial.
The serial interval (serial):
Assumptions/disambiguation
In epidemiology, the generation interval is the duration between successive infectious events in a chain of transmission. Similarly, the serial interval is the duration between the observed symptom onset times of successive cases in a transmission chain. The generation interval is often hard to observe because exact times of infection are hard to measure; hence, the serial interval is often used instead. Here, we use the serial interval to represent what would normally be called the generation interval, that is, the time between successive cases.
Specifying serial in chain_sim()
serial must be specified as a named or anonymous/inline/unnamed function with one argument.
If serial is specified, chain_sim() returns times of infection as a column in the output. Moreover, specifying a function for serial implies tree = TRUE, and a tree of infectors (ancestor) and infectees (id) will be generated in the output.
For example, to specify the serial interval generator as a random log-normally distributed variable with meanlog = 0.58 and sdlog = 1.58, we could define a named function, say serial_interval, with a single argument representing the number of serial intervals to sample:
serial_interval <- function(n) {rlnorm(n, 0.58, 1.58)}
and pass that function to serial in chain_sim() like so:
chain_sim(..., serial = serial_interval)
where ... stands for the other arguments to chain_sim(). Alternatively, we could pass an anonymous function to serial directly in the chain_sim() call:
chain_sim(..., serial = function(n) {rlnorm(n, 0.58, 1.58)})
where ... again stands for the other arguments to chain_sim().
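Before passing a function to serial, it can help to confirm that it meets the contract described above: exactly one argument, and one value returned per requested interval. The sketch below is a sanity check of our own in base R, not something chain_sim() is documented to perform; it uses the meanlog = 0.58 and sdlog = 1.58 values from the text.

```r
# Named serial interval generator with a single argument n.
serial_interval <- function(n) {
  rlnorm(n, meanlog = 0.58, sdlog = 1.58)
}

# The function should take exactly one argument.
n_args <- length(formals(serial_interval))

# It should return n values; log-normal draws are strictly positive.
set.seed(123)
draws <- serial_interval(10)

stopifnot(n_args == 1, length(draws) == 10, all(draws > 0))
```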
Examples
# Specifying no `serial` and `tree == FALSE` (default) returns a vector
set.seed(123)
chain_sim(n = 5, offspring = "pois", stat = "size", lambda = 0.5,
  tree = FALSE)
#> [1] 1 2 1 2 4
# Specifying `serial` without specifying `tree` will set `tree = TRUE`
# internally.
# We'll first define the serial function
set.seed(123)
serial_interval <- function(n) {
  rlnorm(n, meanlog = 0.58, sdlog = 1.58)
}
chain_sim(
  n = 5, offspring = "pois", lambda = 0.5, stat = "length",
  infinite = 100,
  serial = serial_interval
)
#>    n id ancestor generation       time
#> 1  1  1       NA          1  0.0000000
#> 2  2  1       NA          1  0.0000000
#> 3  3  1       NA          1  0.0000000
#> 4  4  1       NA          1  0.0000000
#> 5  5  1       NA          1  0.0000000
#> 6  2  2        1          2  0.1237492
#> 7  4  2        1          2 12.6594440
#> 8  5  2        1          2  1.5035572
#> 9  5  3        1          2  1.4840246
#> 10 5  4        2          3  1.6201477
#> 11 5  5        4          4 13.9750044
#> 12 5  6        4          4  4.7736253
#> 13 5  7        5          5 16.1023578
#> 14 5  8        6          5  5.5157567
#> 15 5  9        7          6 20.0243658
#> 16 5 10        7          6 16.1822358
#> 17 5 11        8          6 10.9251791
# Specifying `serial` and `tree = FALSE` will throw a warning saying that
# `tree` was set to `TRUE` internally.
set.seed(123)
if (FALSE) { # \dontrun{
  try(chain_sim(
    n = 10, serial = function(x) 3, offspring = "pois", lambda = 2,
    infinite = 10, tree = FALSE
  ))
} # }