# Gaussian Process implementation details

Source:`vignettes/gaussian_process_implementation_details.Rmd`

## Overview

We make use of Gaussian Processes in several places in `EpiNow2`. For example, the default model for `estimate_infections()` uses a Gaussian Process to model the first-order difference of the reproduction number on the log scale. This vignette describes the implementation details of the approximate Gaussian Process used in `EpiNow2`.

## Definition

The one-dimensional Gaussian Processes (\(\mathrm{GP}_t\)) we use can be written as

\[\begin{equation} \mathcal{GP}(\mu(t), k(t, t')) \end{equation}\]

where \(\mu(t)\) and \(k(t,t')\) are the mean and covariance functions, respectively. In our case as set out above, we have

\[\begin{equation} \mu(t) \equiv 0 \\ k(t,t') = k(|t - t'|) = k(\Delta t) \end{equation}\]

where by default \(k\) is a Matern 3/2 covariance kernel,

\[\begin{equation} k(\Delta t) = \alpha^2 \left( 1 + \frac{\sqrt{3} \Delta t}{\rho} \right) \exp \left( - \frac{\sqrt{3} \Delta t}{\rho}\right) \end{equation}\]

with \(\rho>0\) and \(\alpha > 0\) the length scale and magnitude, respectively, of the kernel. Alternatively, a squared exponential kernel can be chosen to constrain the GP to be smoother,

\[\begin{equation} k(\Delta t) = \alpha^2 \exp \left( - \frac{1}{2} \frac{(\Delta t)^2}{\rho^2} \right) \end{equation}\]
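As a rough numerical illustration, the two kernels can be written as plain Python functions. This is only a sketch and not the code used in `EpiNow2`: the function names are hypothetical, the length scale is written `rho`, and the kernel is scaled by `alpha**2`, which matches the spectral densities and priors given later in this vignette.

```python
import math


def matern32_kernel(delta_t, alpha=1.0, rho=1.0):
    """Matern 3/2 covariance for a time difference delta_t."""
    r = math.sqrt(3.0) * abs(delta_t) / rho
    return alpha**2 * (1.0 + r) * math.exp(-r)


def se_kernel(delta_t, alpha=1.0, rho=1.0):
    """Squared exponential covariance for a time difference delta_t."""
    return alpha**2 * math.exp(-0.5 * (delta_t / rho) ** 2)


# Compare how the two kernels decay with increasing lag (illustrative values only).
for lag in (0, 7, 21, 42):
    print(lag, matern32_kernel(lag, rho=21.0), se_kernel(lag, rho=21.0))
```

Both kernels equal `alpha**2` at lag zero and depend only on the absolute time difference.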

## Hilbert space approximation

In order to make our models computationally tractable, we use a Hilbert space approximation to the Gaussian Process [1], centered around mean zero.

\[\begin{equation} \mathcal{GP}(0, k(\Delta t)) \approx \sum_{j=1}^m \left(S_k(\sqrt{\lambda_j}) \right)^\frac{1}{2} \phi_j(t) \beta_j \end{equation}\]

with \(m\) the number of basis functions to use in the approximation, which we calculate as a proportion \(b\) of the number of time points \(t_\mathrm{GP}\) to which the Gaussian Process is being applied (rounded up to give an integer value), as is recommended [1],

\[\begin{equation} m = \lceil b t_\mathrm{GP} \rceil \end{equation}\]

and values of \(\lambda_j\) given by

\[\begin{equation} \lambda_j = \left( \frac{j \pi}{2 L} \right)^2 \end{equation}\]

where \(L\) is a positive number termed the boundary condition, and \(\beta_{j}\) are regression weights with a standard normal prior,

\[\begin{equation} \beta_j \sim \mathcal{Normal}(0, 1) \end{equation}\]
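The calculation of \(m\) and \(\lambda_j\) can be sketched in a few lines of Python. The function names and the number of time points are hypothetical; the values of `b` and `L` are the defaults listed at the end of this vignette.

```python
import math


def n_basis_functions(t_gp, b=0.2):
    """Number of basis functions m: a proportion b of the time points, rounded up."""
    return math.ceil(b * t_gp)


def eigenvalues(m, L=1.5):
    """Eigenvalues lambda_j = (j * pi / (2 L))^2 for j = 1, ..., m."""
    return [(j * math.pi / (2.0 * L)) ** 2 for j in range(1, m + 1)]


t_gp = 28  # hypothetical number of time points
m = n_basis_functions(t_gp)
lam = eigenvalues(m)
# Print m and the square roots of the eigenvalues, which enter the spectral density.
print(m, [round(math.sqrt(v), 3) for v in lam])
```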

The function \(S_k(x)\) is the spectral density relating to a particular covariance function \(k\). In the case of the Matern 3/2 kernel (the default in `EpiNow2`) this is given by

\[\begin{equation} S_k(x) = 4 \alpha^2 \left( \frac{\sqrt{3}}{\rho}\right)^3 \left(\left( \frac{\sqrt{3}}{\rho} \right)^2 + x^2 \right)^{-2} \end{equation}\]

and in the case of a squared exponential kernel by

\[\begin{equation} S_k(x) = \alpha^2 \sqrt{2\pi} \rho \exp \left( -\frac{1}{2} \rho^2 x^2 \right) \end{equation}\]
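The two spectral densities translate directly into code. The sketch below uses hypothetical function names and an illustrative length scale; it evaluates the weights \(\left(S_k(\sqrt{\lambda_j})\right)^{1/2}\) for the first few basis functions to show how they shrink as \(j\) increases.

```python
import math


def matern32_spectral_density(x, alpha=1.0, rho=1.0):
    """Spectral density of the Matern 3/2 kernel evaluated at frequency x."""
    k = math.sqrt(3.0) / rho
    return 4.0 * alpha**2 * k**3 / (k**2 + x**2) ** 2


def se_spectral_density(x, alpha=1.0, rho=1.0):
    """Spectral density of the squared exponential kernel evaluated at frequency x."""
    return alpha**2 * math.sqrt(2.0 * math.pi) * rho * math.exp(-0.5 * rho**2 * x**2)


L = 1.5    # boundary condition (default value from this vignette)
rho = 0.5  # illustrative length scale, not a package default
for j in range(1, 6):
    sqrt_lambda = j * math.pi / (2.0 * L)
    print(j,
          math.sqrt(matern32_spectral_density(sqrt_lambda, rho=rho)),
          math.sqrt(se_spectral_density(sqrt_lambda, rho=rho)))
```

The squared exponential weights decay much faster with \(j\) than the Matern 3/2 weights, which is consistent with the squared exponential kernel giving smoother functions.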

The functions \(\phi_{j}(t)\) are the eigenfunctions of the Laplace operator,

\[\begin{equation} \phi_j(t) = \frac{1}{\sqrt{L}} \sin\left(\sqrt{\lambda_j} (t^* + L)\right) \end{equation}\]

with time rescaled linearly to be between -1 and 1,

\[\begin{equation} t^* = \frac{t - \frac{1}{2}t_\mathrm{GP}}{\frac{1}{2}t_\mathrm{GP}} \end{equation}\]
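To see how the pieces fit together, the sketch below compares the covariance implied by the basis expansion with the exact squared exponential kernel. Because the weights \(\beta_j\) are independent standard normal, the implied covariance between two time points is \(\sum_{j=1}^m S_k(\sqrt{\lambda_j}) \phi_j(t) \phi_j(t')\). All function names are hypothetical, and as an assumption of this example the length scale `rho` is expressed on the rescaled time axis.

```python
import math

L = 1.5  # boundary condition (default value from this vignette)


def rescale_time(t, t_gp):
    """Map t in [0, t_gp] linearly onto [-1, 1]."""
    half = 0.5 * t_gp
    return (t - half) / half


def phi(j, t_star, L=L):
    """j-th eigenfunction of the Laplace operator on [-L, L]."""
    sqrt_lambda = j * math.pi / (2.0 * L)
    return math.sin(sqrt_lambda * (t_star + L)) / math.sqrt(L)


def se_spectral_density(x, alpha, rho):
    """Spectral density of the squared exponential kernel."""
    return alpha**2 * math.sqrt(2.0 * math.pi) * rho * math.exp(-0.5 * rho**2 * x**2)


def approx_cov(s1, s2, m, alpha, rho, L=L):
    """Covariance implied by the m-term expansion with standard normal weights."""
    total = 0.0
    for j in range(1, m + 1):
        sqrt_lambda = j * math.pi / (2.0 * L)
        total += se_spectral_density(sqrt_lambda, alpha, rho) * phi(j, s1, L) * phi(j, s2, L)
    return total


# Illustrative values; rho is on the rescaled time axis (an assumption of this sketch).
alpha, rho, m, t_gp = 1.0, 0.3, 40, 30
s1, s2 = rescale_time(15, t_gp), rescale_time(18, t_gp)
exact = alpha**2 * math.exp(-0.5 * ((s1 - s2) / rho) ** 2)
approx = approx_cov(s1, s2, m, alpha, rho)
print(exact, approx)
```

With these values the two numbers should agree closely. The agreement degrades if \(m\) is too small relative to the length scale or if the time points lie too close to the boundary \(\pm L\).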

Relevant priors are

\[\begin{align} \alpha &\sim \mathcal{Normal}(0, \sigma_{\alpha}) \\ \rho &\sim \mathcal{LogNormal} (\mu_\rho, \sigma_\rho)\\ \end{align}\]

with \(\rho\) additionally constrained to lie between \(\rho_\mathrm{min}\) and \(\rho_\mathrm{max}\). The parameters \(\mu_{\rho}\) and \(\sigma_\rho\) are calculated from a given mean \(m_{\rho}\) and standard deviation \(s_\rho\). The default values (all of which can be changed by the user) are:

\[\begin{align} b &= 0.2 \\ L &= 1.5 \\ m_\rho &= 21 \\ s_\rho &= 7 \\ \rho_\mathrm{min} &= 0\\ \rho_\mathrm{max} &= 60\\ \sigma_\alpha &= 0.05\\ \end{align}\]
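The text does not spell out how \(\mu_\rho\) and \(\sigma_\rho\) are obtained from \(m_\rho\) and \(s_\rho\). One common choice, assumed here, is moment matching: choose the log-normal parameters so that the untruncated distribution has the given mean and standard deviation. The function name below is hypothetical.

```python
import math


def lognormal_params(mean, sd):
    """Moment-match a log-normal: return (mu, sigma) of the underlying normal."""
    sigma2 = math.log(1.0 + (sd / mean) ** 2)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, math.sqrt(sigma2)


mu_rho, sigma_rho = lognormal_params(21.0, 7.0)  # default m_rho and s_rho

# Round trip: recover the mean and standard deviation on the natural scale.
mean_back = math.exp(mu_rho + 0.5 * sigma_rho**2)
sd_back = mean_back * math.sqrt(math.exp(sigma_rho**2) - 1.0)
print(mu_rho, sigma_rho, mean_back, sd_back)
```

Note that constraining \(\rho\) to lie between \(\rho_\mathrm{min}\) and \(\rho_\mathrm{max}\) truncates the distribution, so the mean and standard deviation of the prior actually used will differ somewhat from \(m_\rho\) and \(s_\rho\).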

## References

[1] *Practical Hilbert space approximate Bayesian Gaussian processes for probabilistic programming*. https://arxiv.org/abs/2004.11408