### Summary

The package currently supports two models both with various options. A baseline single strain model and a two strain model with flexible variable dynamics. Both share the same underlying structure with expected notifications being assumed to be a product of past expected notifications, and an exponentiated time-varying term. This term is then modelled in both instances as a differenced autoregressive process with a signal lag term. In the two strain model this process can either be modelled jointly for variants, as a correlated process, or independently. By default it is assumed to vary inline with the timestep of the data but this can be altered to speed up computation or to improve out of sample performance. Observed cases are estimated from expected cases by assuming either a negative binomial or Poisson observation model. In the two strain model variant of concern positive samples are estimated using a Beta-binomial or binomial observation model. Detailed model definitions are given below. Prior values when given represent the package defaults and can, and generally should, altered by the user based on their domain specific knowledge.

### Single strain

We model the mean (\(\lambda_t\)) of reported cases (\(C_t\)) as an order 1 autoregressive (AR(P)) process by time unit (\(t\)). The model is initialised by assuming that the initial reported cases are representative with a small amount of error (2.5%) for each \(t \lt P\).

\[\begin{align} \lambda_t &\sim \text{LogNormal}\left(\log C_t , 0.025 \times \log C_t \right),\ t \leq P \\ \lambda_t &= \text{exp}\left(r_t\right) \sum_{p = 1}^{P} \alpha_p \lambda_{t-p},\ t \gt P \end{align}\]

Where \(r_t\) can be interpreted as
the growth rate and the exponential of \(r_t\) as the effective reproduction number
(\(R_t\)) assuming a mean generation
time equal to the scaling rate used (see the documentation for
`scale_r`

in `?forecast()`

). \(r_t\) is then itself modelled as a
piecewise constant differenced AR(1) process,

\[\begin{align} r_1 &\sim \text{Normal}\left(0, 0.25 \right) \\ r_t &= r_{t-1} + \epsilon_t \\ \epsilon_1 &= \eta_1 \\ \epsilon_t &= \beta \epsilon_{t-1} + \eta_t \end{align}\]

Where,

\[\begin{align} \eta_t &\sim \text{Normal}\left(0, \sigma \right) \\ \sigma &\sim \text{Normal}\left(0, 0.2 \right) \\ \beta &\sim \text{Normal}\left(0, 0.5 \right) \end{align}\]

We then assume a negative binomial observation model with overdispersion \(\phi_c\) for reported cases (\(C_t\)),

\[\begin{align} C_{t} &\sim \text{NegBinomial}\left(\lambda_t, \phi_c\right) \\ \frac{1}{\sqrt{\phi_c}} &\sim \text{Normal}(0, 0.5) \end{align}\]

Where \(\sigma\), and \(\frac{1}{\sqrt{phi_c}}\) are truncated to
be greater than 0 and \(\beta\) is
truncated to be between -1 and 1. Optionally a Poisson observation model
may instead be used (see the documentation for
`overdispersion`

in `?forecast()`

).

The stan code for this model available here
or can be loaded into `R`

using the following code,

```
library(forecast.vocs)
readLines(fv_model(strains = 1, compile = FALSE))
```

### Two strain

We model strain dynamics using the single strain model as a starting point but with the addition of strain specific AR(P) case model and a beta binomial (or optionally binomial) observation process for sequence data. The variant of concerns growth rate is modelled either as a fixed scaling of the non-variant of concern growth rate, as an independent AR(1) process, or in a vector autoregression framework as a correlated AR(1) process. This last formulation is described in detail below along with the modifications required to define the other two formulations. Parameters related to the variant of concern (VoC) are given the \(\delta\) superscript and parameters related to non-VoC cases are given the \(o\) superscript.

Mean reported cases are again defined using a AR(1) process on the log scale for each strain and then combined for overall mean reported cases.

\[\begin{align} \lambda_t &\sim \text{LogNormal}\left(\log C_t , 0.025 \times \log C_t \right) ,\ t \leq P\\ \lambda^{\delta}_t &\sim \text{LogNormal}\left(\log C^{\delta}_t , 0.025 \times \log C^{\delta}_t \right),\ t \leq P \\ \lambda^{o}_t &= \lambda_t - \lambda^{\delta}_t,\ t \leq P \\ \lambda^{\delta}_t &= \text{exp}\left(r^{\delta}_t\right) \sum_{p = 1}^{P} \alpha^{\delta}_p \lambda^{\delta}_{t-p},\ t \gt P \\ \lambda^{o}_t &= \text{exp}\left(r^{o}_t\right) \sum_{p = 1}^{P} \alpha^{o}_p \lambda^{o}_{t-p}, t \gt P \\ \lambda_t &= \lambda^{\delta}_t + \lambda^{o}_t \end{align}\]

Where \(C^{\delta}_0\) is derived by calculating the mean proportion of cases that had the VoC for the first time point using the overall number of reported cases, the number of sequenced cases, and the number of sequences that were positive for the VoC. The growth rate for both VoC and non-VoC cases (\(r^{o, \delta}_t\)) is again modelled as differenced AR(1) processes but now combined with a variant specific modifier (\(s^{o, \delta}_0\)) to growth when variant sequences are first reported (\(t_{seq}\)), and a correlation structure for the time and strain varying error terms (\(\eta^{o, \delta}\)).

\[\begin{align} r^{o}_1 &\sim \text{Normal}\left(0, 0.25 \right),\\ r^{\delta}_{t_{seq}} &= r^{o}_{t_{seq}} + s^{\delta} \\ r^{o, \delta}_t &= r^{o, \delta}_{t-1} + \epsilon^{o, \delta}_t \\ \epsilon^{o, \delta}_0 &= \eta^{o, \delta}_0 \\ \epsilon^{o, \delta}_t &= \beta \epsilon^{o, \delta}_{t-1} + \eta^{o, \delta}_t \end{align}\]

Where,

\[\begin{align} s^{\beta} &\sim \text{Normal}(0, 0.5) \\ \eta^{o, \delta}_t &\sim \text{MVN}\left(0, \boldsymbol \Sigma \right) \\ \beta &\sim \text{Normal}\left(0, 0.5 \right) \end{align}\]

Where \(\boldsymbol \Sigma\) is a
\(2 \times 2\) covariance matrix which
we decompose for computational stability into a diagonal matrix
containing variant specific scale parameters (\(\boldsymbol \Delta\)) and a symmetric
correlation matrix (\(\boldsymbol
\Omega\)) as follows^{[1]},

\[\begin{align} \boldsymbol \Sigma &= \boldsymbol \Delta \boldsymbol \Omega \boldsymbol \Delta \\ \boldsymbol \Delta &= \left[ {\begin{array}{cc} \sigma^o & 0 \\ 0& \sigma^{\delta} \\ \end{array} } \right] \\ \boldsymbol \Omega &\sim \left[ {\begin{array}{cc} 1 & \omega \\ \omega & 1 \\ \end{array} } \right] \\ \sigma^{o, \delta} &\sim \text{Normal}\left(0, 0.2 \right) \end{align}\]

Where \(\boldsymbol \Omega\) has a Lewandowski-Kurowicka-Joe (LKJ) prior where \(\omega\) controls the prior expectation for correlation between variants (with \(\omega = 1\) resulting in a uniform prior over all correlations, \(\omega \lt 1\) placing more weight on larger correlations and \(\omega \gt 1\) placing more weight on small amounts of correlations). By default we set \(\omega = 0.5\) giving more weight to correlations between variants dynamics.

On top of this correlated strain dynamic model
`forecast.vocs`

also offers two other options that represent
extremes in the correlated model. The first of these assumes that both
strains evolve with independent differenced AR(1) growth rates with only
a scaling factor (s^{}) linking them. The second assumes that strains
growth rates are completely correlated in time (i.e they are governed by
a single AR(1) differenced process) and only differ based on a scaling
factor (\(s^{\delta}\)). See the
documentation for `variant_relationship`

in
`?forecast()`

for details.

We then assume a negative binomial observation model with overdispersion \(\phi_c\) for reported cases (\(C_t\)) as in the single strain model,

\[\begin{align} C_{t} &\sim \text{NegBinomial}\left(\lambda_t, \phi_c\right) \\ \frac{1}{\sqrt{\phi_c}} &\sim \text{Normal}(0, 0.5) \end{align}\]

Where \(\sigma\), and \(\frac{1}{\sqrt{phi_c}}\) are truncated to
be greater than 0 and \(\beta\) is
truncated to be between -1 and 1. Again a Poisson observation model may
instead be used (see the documentation for `overdispersion`

in `?forecast()`

).

Finally, the mean proportion of samples that have the VoC (\(p_t\)) is then estimated using the mean reported cases with the VoC and the overall mean reported cases.

\[\begin{equation} p_t = \frac{\lambda^{\delta}_t}{\lambda_t} \end{equation}\]

We assume a beta binomial observation model for the number of sequences (\(N_t\)) that are positive (\(P_t\)) for the VoC with overdispersion \(\phi_s\).

\[\begin{align} P_{t} &\sim \mathrm{BetaBinomial}\left(N_t, p_t \phi_s, (1 - p_t) \phi_s\right) \\ \frac{1}{\sqrt{\phi_s}} &\sim \text{Normal}(0, 0.5) \end{align}\]

Where \(\sigma^{o, \delta}\), and
\(\frac{1}{\sqrt{\phi_s}}\) are
truncated to be greater than 0. A binomial observation model is also
available (see the documentation for `overdispersion`

in
`?forecast()`

).

The stan code for this model available here
or can be loaded into `R`

using the following code,

```
library(forecast.vocs)
readLines(fv_model(strains = 2, compile = FALSE))
```

## Summary statistics

As well as posterior predictions and forecasts for both notifications by variant and variant of concern proportion the models also return several summary statistics which may be useful for drawing inferences about underlying transmission dynamics. These include the log scale growth rate (\(g^{o, \delta}_t\)), the instantaneous effective reproduction number (\(R^{o, \delta}_t\)), and the transmission advantage of the variant of concern (\(A_t\)). These are calculated as follows:

\[\begin{align} g^{o, \delta}_t &= T_s r^{o, \delta}_t \\ R^{o, \delta}_t &= \text{exp}\left(T_s r^{o, \delta}_t\right) \\ A_t &= \text{exp}\left(T_s \left(r^{\delta}_t - r^{o}_t \right)\right)\\ \end{align}\]

\(T_s\) is a user set scaling
parameter that defines the timespan over which the summary metrics apply
dependent on the time step of the data. It can be set using the
`scale_r`

and defaults to 1 which returns summary statistics
scaled to the timestep of the data. Depending on the setup of the model
used these summary measures will be more or less related to their
epidemiological definitions. In particular, adding a weighting to past
expected cases that is more complex than a simple lag may cause
interpretation issues.

## Implementation

The models are implemented in `stan`

using
`cmdstanr`

with no defaults altered^{[1,2]}. Due to the complexity of the
posterior it is likely that increasing the `adapt_delta`

may
be required to mitigate potential bias in posterior estimates^{[3]}. `forecast.vocs`

incorporates additional functionality written in R^{[4]} to enable plotting forecasts and
posterior predictions, summarising forecasts, and scoring them using
`scoringutils`

^{[5]}.
All functionality is modular allowing users to extend and alter the
underlying model whilst continuing to use the package framework.

## References

*Stan modeling language users guide and reference manual, 2.28.1*.

*Cmdstanr: R interface to ’CmdStan’*.

*Stan Case Studies*,

*4*. https://mc-stan.org/users/documentation/case-studies/divergences_and_bias.html

*R: A language and environment for statistical computing*. R Foundation for Statistical Computing. https://www.R-project.org/

*Scoringutils: A collection of proper scoring rules and metrics to assess predictions*. https://github.com/epiforecasts/scoringutils