
chain_sim() is a stochastic simulator for generating transmission chain data with key inputs such as the offspring distribution and serial interval distribution.

Usage

chain_sim(
  n,
  offspring,
  stat = c("size", "length"),
  infinite = Inf,
  tree = FALSE,
  serial,
  t0 = 0,
  tf = Inf,
  ...
)

Arguments

n

Number of simulations to run.

offspring

Offspring distribution: a character string corresponding to the R distribution function (e.g., "pois" for Poisson, where rpois is the R function to generate Poisson random numbers).

stat

String; Statistic to calculate. Can be one of:

  • "size": the total number of offspring.

  • "length": the total number of ancestors.

infinite

A size or length above which the simulation results should be set to Inf. Defaults to Inf, so that no results are ever set to Inf.

tree

Logical. Should the transmission tree be returned? Defaults to FALSE.

serial

The serial interval generator: a user-defined named or anonymous function with a single argument n, representing the number of serial intervals to generate.

t0

Start time (if serial interval is given); either a single value or a vector of length n (number of simulations) with initial times. Defaults to 0.

tf

End time (if serial interval is given). Defaults to Inf.

...

Parameters of the offspring distribution as required by R.
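As a rough illustration of how the offspring string and the ... arguments fit together, the base R sketch below builds the generator name by prefixing "r" to the string and forwards the extra parameters to it. It only mirrors the naming convention described above and is not necessarily how chain_sim() resolves the function internally; draw_offspring is a hypothetical helper, not part of the package.

```r
# Hypothetical helper (not part of the package): look up the random
# generator named by `offspring` and pass distribution parameters via `...`.
draw_offspring <- function(n, offspring, ...) {
  generator <- match.fun(paste0("r", offspring))  # "pois" -> rpois
  generator(n, ...)
}

set.seed(1)
draws <- draw_offspring(5, "pois", lambda = 0.5)
draws  # five non-negative integer counts
```

Under this convention, any parameter that the underlying R generator requires (here lambda for rpois) has to be supplied by name through the ... arguments.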

Value

Either:

  • A vector of sizes/lengths (if tree == FALSE OR serial interval function not specified, since that implies tree == FALSE), or

  • a data frame with column n (simulation ID); if tree == TRUE, also id (a unique ID within each simulation for each individual element of the chain), ancestor (the ID of the ancestor of each element), and generation; and, if the serial interval is given, time.
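To see how the tree columns relate to the summary statistics, the sketch below re-enters the n, id and generation columns of the tree printed in the Examples section and summarises them per simulation. It assumes that a chain's size counts every individual in the chain, including the initial case, and that its length is the highest generation reached; the object names are illustrative.

```r
# Columns copied from the example tree output (17 rows, 5 simulations)
tree_df <- data.frame(
  n = c(1, 2, 3, 4, 5, 2, 4, rep(5, 10)),
  id = c(1, 1, 1, 1, 1, 2, 2, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11),
  generation = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 4, 4, 5, 5, 6, 6, 6)
)

# Assumed definitions: size = individuals per chain,
# length = deepest generation per chain
chain_size <- as.vector(tapply(tree_df$id, tree_df$n, length))
chain_length <- as.vector(tapply(tree_df$generation, tree_df$n, max))

chain_size    # 1 2 1 2 11
chain_length  # 1 2 1 2 6
```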

Details

chain_sim() returns either a vector or a data.frame. If serial is not provided, tree is automatically set to FALSE and the output is a vector. If serial is provided as a function, tree is automatically set to TRUE and the output is a data.frame. Conversely, setting tree = TRUE requires providing a function for serial.

The serial interval (serial):

Assumptions/disambiguation

In epidemiology, the generation interval is the duration between successive infectious events in a chain of transmission. Similarly, the serial interval is the duration between the observed symptom onset times of successive cases in a transmission chain. The generation interval is often hard to observe because exact times of infection are hard to measure; hence, the serial interval is often used instead. Here, we use the serial interval to represent what would normally be called the generation interval, that is, the time between successive cases.

Specifying serial in chain_sim()

serial must be specified as a named or anonymous/inline/unnamed function with one argument.

If serial is specified, chain_sim() returns times of infection as a column in the output. Moreover, specifying a function for serial implies tree = TRUE and a tree of infectors (ancestor) and infectees (id) will be generated in the output.
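One way to picture the time column is that each infectee's time is its infector's time plus one draw from the serial interval function. The sketch below applies that idea to a small made-up tree with a fixed interval of 3 time units so the result is easy to check by hand. This is an assumption about the mechanism, consistent with the example output but not taken from the package source; toy_tree and serial_fixed are hypothetical names.

```r
# Hypothetical toy tree: row i holds individual i; NA marks the initial case
toy_tree <- data.frame(id = 1:4, ancestor = c(NA, 1, 1, 2))
serial_fixed <- function(n) rep(3, n)  # constant interval, for illustration

t0 <- 0
toy_tree$time <- NA_real_
toy_tree$time[is.na(toy_tree$ancestor)] <- t0

# Rows are ordered so that every ancestor appears before its infectees
for (i in which(!is.na(toy_tree$ancestor))) {
  toy_tree$time[i] <- toy_tree$time[toy_tree$ancestor[i]] + serial_fixed(1)
}

toy_tree$time  # 0 3 3 6
```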

For example, suppose we want the serial interval generator to draw log-normally distributed values with meanlog = 0.58 and sdlog = 1.58. We could define a named function, say serial_interval, with a single argument representing the number of serial intervals to sample: serial_interval <- function(n){rlnorm(n, 0.58, 1.58)}, and pass it to serial like so: chain_sim(..., serial = serial_interval), where ... stands for the other arguments to chain_sim(). Alternatively, we could pass an anonymous function directly in the call: chain_sim(..., serial = function(n){rlnorm(n, 0.58, 1.58)}).
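Before passing a generator to serial, it can help to check that it behaves as described: it takes a single argument n and returns n values. The sketch below does this in base R, using the stated parameters meanlog = 0.58 and sdlog = 1.58; the checks are a suggestion, not something chain_sim() is documented to require.

```r
# Named serial interval generator with a single argument `n`
serial_interval <- function(n) {
  rlnorm(n, meanlog = 0.58, sdlog = 1.58)
}

set.seed(123)
intervals <- serial_interval(4)
length(intervals) == 4  # one draw per requested interval
all(intervals > 0)      # log-normal draws are strictly positive

# The anonymous form is called in the same way
anonymous_draws <- (function(n) rlnorm(n, meanlog = 0.58, sdlog = 1.58))(2)
length(anonymous_draws) == 2
```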

Author

Sebastian Funk, James M. Azam

Examples

# Specifying no `serial` and `tree == FALSE` (default) returns a vector
set.seed(123)
chain_sim(
  n = 5, offspring = "pois", stat = "size", lambda = 0.5,
  tree = FALSE
)
#> [1] 1 2 1 2 4

# Specifying `serial` without specifying `tree` will set `tree = TRUE`
# internally.

# We'll first define the serial function
set.seed(123)
serial_interval <- function(n) {
  rlnorm(n, meanlog = 0.58, sdlog = 1.58)
}
chain_sim(
  n = 5, offspring = "pois", lambda = 0.5, stat = "length",
  infinite = 100,
  serial = serial_interval
)
#>    n id ancestor generation       time
#> 1  1  1       NA          1  0.0000000
#> 2  2  1       NA          1  0.0000000
#> 3  3  1       NA          1  0.0000000
#> 4  4  1       NA          1  0.0000000
#> 5  5  1       NA          1  0.0000000
#> 6  2  2        1          2  0.1237492
#> 7  4  2        1          2 12.6594440
#> 8  5  2        1          2  1.5035572
#> 9  5  3        1          2  1.4840246
#> 10 5  4        2          3  1.6201477
#> 11 5  5        4          4 13.9750044
#> 12 5  6        4          4  4.7736253
#> 13 5  7        5          5 16.1023578
#> 14 5  8        6          5  5.5157567
#> 15 5  9        7          6 20.0243658
#> 16 5 10        7          6 16.1822358
#> 17 5 11        8          6 10.9251791

# Specifying `serial` and `tree = FALSE` will throw a warning saying that
# `tree` was set to `TRUE` internally.
set.seed(123)
if (FALSE) { # \dontrun{
try(chain_sim(
  n = 10, serial = function(x) 3, offspring = "pois", lambda = 2,
  infinite = 10, tree = FALSE
))
} # }