Estimating the current size of the 2026 DRC Bundibugyo virus outbreak: a joint Bayesian re-analysis of the McCabe et al. report
Authors. Sam Abbott, Kath Sherratt, Samuel Brand and Sebastian Funk.
Last updated. 2026-05-22. This is a live report, re-run as new data arrive, so the estimates change between updates.
Data as of. 2026-05-18, the release date of the WHO AFRO External Situation Report 01 the counts are taken from. Estimates are reported as of this date; it can lag the update date above.
Abstract. An outbreak of Ebola disease caused by Bundibugyo virus (BVD) is ongoing in the Democratic Republic of the Congo (DRC), with cases also detected across the border in Uganda. Estimating the likely current size of the outbreak is useful for the response, but most cases are not yet reported and have to be inferred from the data streams that are available. The Imperial College London report (McCabe et al., 18 May 2026, revised in a 20 May 2026 update) estimates the size with two analyses, geographic spread from the cases exported to Uganda and back-calculation from suspected deaths in DRC. Building on that work, we re-analyse the same problem as a single joint Bayesian model over the latent cumulative case count, fitting all streams together with priors on the nuisance parameters that the report varies in scenario sweeps. Beyond the exported cases and DRC deaths the report uses, we condition on two further streams, the reported cases in DRC (with an ascertainment component) and the deaths among exported cases in Uganda. We also add a no-onward-transmission projected-deaths counterfactual, a one-week-ahead forecast and an onset-to-death delay sensitivity analysis, and replace two closed-form approximations (the deaths convolution and the small-growth-rate exports term) with their exact forms. We report the joint posterior over the cumulative case count from current data; to separate the effect of newer data from the change in method we also fit the model to the data as of each report version in sequence (18 May, then the 20 May update), comparing against both a joint reimplementation of the report's approach and its original published estimates at each version.
Scope. This work is motivated by adding an external view of the current situation, based on our understanding of real-time infectious disease dynamics and the infection process that gives rise to observed epidemic surveillance counts. We are actively developing it and encourage feedback, so please get in touch. We fully support reuse and adaptation. Find out more in the contributing guide.
Use of AI. The model code and this analysis were drafted by a language model and reviewed and revised under human oversight; the named authors are responsible for that oversight (see the LLM-driven reimplementation limitation below). The full analysis code lives in the epiforecasts/BVDOutbreakSize repository, where issues and suggestions are welcome. This page is generated from docs/examples/analysis.jl; the model code it calls is in src/.
How the numbers differ from McCabe et al. Our estimates differ from the McCabe et al. (McCabe and others, May 2026) report for two reasons. First, the method: we fit all streams jointly in a single Bayesian model rather than combining separate scenario analyses (see What we do differently below). Second, the data: results are reported as of the cut-off date in data/observations.toml (currently 2026-05-18), using the reported counts in the data table below. These match the 20 May report's deaths and cases (
Offline copy. A self-contained single-file HTML version of this report, built from the same run, is attached to each results release: download the latest.
See: current outbreak size · comparison with McCabe et al. · how the data streams compare · limitations.
What we do differently from McCabe et al.
Joint posterior, not 15 scenario estimates. The doubling time
, case fatality ratio (CFR), onset-to-death shape and scale, detection window , daily traveller volume and surveillance dispersion all have priors and are sampled jointly. McCabe et al. (McCabe and others, May 2026) fix each and sweep. Exact cumulative integral for exports —
with travel rate — rather than the small- simplification used by McCabe et al. (and by (Imai et al., Jan 2020) before them). The two forms agree as . Numerical (not closed-form) deaths convolution. For a gamma delay the convolution integral has an exact closed form that carries a
factor from the finite upper limit . McCabe et al. use the large- simplification , which drops that factor and is therefore an approximation. We evaluate the integral numerically instead, which recovers the exact value and lets the onset-to-death distribution be swapped for any other family. Onset-to-death prior anchored on the Bayesian reanalysis of the same Isiro 2012 line list McCabe et al. cite for their point estimates (Funk and Abbott, 2026), so the priors carry the published 95% credible intervals on
and rather than collapsing onto Rosello's point estimate. NegBinomial likelihood on deaths and reported cases with a single shared surveillance dispersion
. McCabe et al. use Poisson for deaths and do not have a cases-ascertainment model at all. Exports stay Poisson because two observations would not identify a separate dispersion. The McCabe et al. "exact NegBinomial CIs" on Method 1 are the conventional binomial-inversion procedure, not an estimated dispersion. Ascertainment extension (not in McCabe et al.). A logit-scale hyperprior on the reporting fraction, applied to the latent
, gives a joint posterior over the reported suspected-case count alongside deaths and exports. No-onward-transmission counterfactual (not in McCabe et al.). Projects the future expected deaths from cases already infected by
, integrating per draw — a lower bound on the eventual death toll if every onward transmission stopped today.
Limitations
Fitted only to aggregate reported counts. The data are a handful of summary figures — total suspected cases in the DRC, total suspected deaths in the DRC, and cases (and one death) detected in Uganda — from press and situation reports. There is no line list and no temporal information: no onset dates, no epidemic curve, no per-case data. The model also has no knowledge of the situation on the ground (case definitions, testing capacity, affected areas, interventions, reporting completeness). Every estimate is a model-based extrapolation from sparse summary statistics under strong assumptions rather than a measurement.
LLM-driven reimplementation. The model code, priors, convolution implementation and analysis were drafted by a language model from the published McCabe et al. (McCabe and others, May 2026) report and the companion delay reanalysis, then reviewed and revised. Not independently replicated against the authors' code.
Prior-driven inference where data is scarce. A dozen suspected exports, ~
deaths, and a single reported-case total give little information about , , the surveillance dispersion, or the reporting fraction individually. Posteriors track their priors closely. Inherits McCabe et al.'s epidemiological assumptions and core model structures. Exponential growth from a single zoonotic seed; the underlying case trajectory is treated as a deterministic function of the latent state (only the observation counts carry sampling noise via Poisson / NegBinomial) rather than a stochastic incidence process; the cumulative-case / deaths convolution structure for Method 2; the geographic- spread / detection-window structure for Method 1; no spatial structure beyond the Ituri / Nord Kivu split; no time series of cases or deaths.
Onset-to-death delay anchored on Isiro 2012. A single- outbreak fit; the delay distribution reporting here follows (Charniga et al., 2024) but cross-outbreak heterogeneity is unmodelled. The baseline fit uses the full Rosello onset-to-death distribution, as in McCabe et al. The delay sensitivity section refits the joint model with the community-only delay (the
cases who died without admission, weak evidence of a shorter delay) to show how much the outbreak-size estimate leans on the delay assumption. Genetic seeding bound depends on a fixed clock rate. The time to the most recent common ancestor (TMRCA) is dated under an external Ebola clock rate; the sampled tree is also almost entirely from Bunia. The clock-rate sensitivity section refits the joint model under the faster early-epidemic rate to show how much the timing, growth-rate and outbreak-size estimates are impacted.
Detection window is weakly motivated.
lumps incubation and onset-to-detection together — both poorly characterised for BVD — so the quantity itself is loosely defined. Its prior is even less grounded: it simply spans the 10–20 day windows McCabe et al. sweep, with no independent estimate behind it, so the exports stream leans on an assumption rather than data. Not all Uganda cases are confirmed exports. The exports likelihood treats every Uganda case as imported from DRC, but the 12 suspected cases reported in Kampala are not all confirmed to be importations — some may reflect onward transmission within Uganda or unrelated suspected cases later discarded. Counting non-exports as exports inflates the export signal and biases the implied outbreak size and ascertainment.
Selection bias in deaths-among-exports. The deaths-among- exports likelihood assumes Uganda's surveillance retains detected exports through to any subsequent death. If the system loses cases that progress to death, the observed count is biased downward and the constraint it places on
is overstated. Deaths-among-exports is an approximate construction. The expected count weights the exported at-risk person-time by the onset-to-death CDF
rather than convolving the death delay against the exported-case incidence; this treats the cohort present at time as if infected at . A more direct construction would convolve the onset-to-death delay against the exported-case incidence trajectory. Ascertainment partially pooled, not separately identified. Uganda's exported-case ascertainment
and DRC's reported-case ascertainment share a logit-scale hyperprior. With a handful of suspected exports and one export death the Uganda fraction is weakly identified and leans on the pooled mean and the DRC side. Observation streams share one case pool. The four streams are modelled as conditionally independent given the latent cumulative incidence, but they observe overlapping individuals — exported cases are a subset of all cases (and may also be DRC-reported), and expected DRC deaths are computed over all incidence including those who travelled. Conditional independence ignores this individual-level overlap, so it can double-count evidence and understate uncertainty. The effect is small here because the Uganda counts are small.
Data conflict not explored in detail. We combine four data streams jointly but have not systematically checked whether they conflict — whether, say, the exports and the deaths streams imply different outbreak sizes. Characterising data-source properties and conflict is part of the modelling workflow we otherwise follow (Abbott et al., 2025); a fuller treatment is left for future work.
Load packages and seed the RNG
using Turing
using Turing: to_submodel, @varname
using Distributions
using StatsFuns: logit, logistic
using DataFrames: DataFrame
import CSV
using Random
using Markdown
using Dates: Date, Day, value
using BVDOutbreakSize
using BVDOutbreakSize: integrate_cumulative, integrate_exports_deaths,
expected_deaths
import CairoMakie
# Render figures at higher resolution so they stay crisp in the docs.
CairoMakie.activate!(type = "png", px_per_unit = 3)
Random.seed!(20260518)Random.TaskLocalRNG()Methods
Data
The analysis uses a handful of aggregate counts collated from situation reports and news coverage: the suspected cases and suspected deaths reported in the DRC, the cases (and any deaths) detected among travellers to Uganda, and the daily cross-border traveller volume and source-area population from the McCabe et al. report (McCabe and others, May 2026). All are point-in-time totals as of the data cut-off, not time series, and the suspected counts are unconfirmed. The table below lists each figure with its source. The source population is treated as fixed (census data); the daily outbound traveller volume is given a normal prior centred at the McCabe et al. figure with an SD covering point-of-entry variation.
Loading observations and building the data table
obs = load_observations()
observations_table = DataFrame(
field = [
"exported_cases",
"exports_deaths",
"total_deaths",
"reported_cases",
"daily_outbound_travellers",
"daily_outbound_travellers_sd",
"source_population",
],
value = [
obs.exported_cases,
obs.exports_deaths,
obs.total_deaths,
obs.reported_cases,
obs.daily_outbound_travellers,
obs.daily_outbound_travellers_sd,
obs.source_population,
],
source = [
obs.sources.exported_cases,
obs.sources.exports_deaths,
obs.sources.total_deaths,
obs.sources.reported_cases,
obs.sources.daily_outbound_travellers,
obs.sources.daily_outbound_travellers_sd,
obs.sources.source_population,
],
);
const ITURI_POPULATION = obs.source_population
const ITURI_DAILY_TRAVEL = obs.daily_outbound_travellers
const EXPORTED_CASES = obs.exported_cases
const EXPORTS_DEATHS = obs.exports_deaths1| Row | field | value | source |
|---|---|---|---|
| String | Float64 | String | |
| 1 | exported_cases | 2.0 | WHO AFRO, Ebola Bundibugyo virus disease outbreak (DRC | Uganda), Weekly External Situation Report 01, data as of 18 May 2026 — two confirmed imported cases in Kampala, Uganda |
| 2 | exports_deaths | 1.0 | WHO AFRO, Ebola Bundibugyo virus disease outbreak (DRC | Uganda), Weekly External Situation Report 01, data as of 18 May 2026 — one death among the two Uganda imported cases |
| 3 | total_deaths | 131.0 | WHO AFRO, Ebola Bundibugyo virus disease outbreak (DRC | Uganda), Weekly External Situation Report 01, data as of 18 May 2026 — DRC suspected deaths |
| 4 | reported_cases | 516.0 | WHO AFRO, Ebola Bundibugyo virus disease outbreak (DRC | Uganda), Weekly External Situation Report 01, data as of 18 May 2026 — DRC suspected cases |
| 5 | daily_outbound_travellers | 1871.0 | McCabe et al. Table 3 — mean daily PoE flow across seven Uganda-DRC border points, Ituri side |
| 6 | daily_outbound_travellers_sd | 200.0 | PoE-to-PoE variation and sitrep sampling uncertainty; ~10% CV |
| 7 | source_population | 4.3922e6 | McCabe et al. Table 1 — Ituri Province population |
Model
Model overview
We model a single outbreak seeded by one zoonotic introduction that then grows exponentially, so the cumulative number of people ever infected by outbreak age
Fitting all four streams together gives the posterior for the latent cumulative case count
In implementation terms, the model is assembled from small reusable Turing (Ge et al., 2018) submodels. Each building-block submodel owns the maths and priors for one epidemic parameter family. The observation submodels assemble those blocks, introduce the forward integrals and the likelihoods, and tie one data stream to the latent state. The composers combine the observation submodels into the per-stream fits and the joint fit.
The table below shows which building-block parameters feed each observation submodel:
| Parameter | Exports | Deaths | Cases | Export deaths (time-resolved) | First export-detection timing | Genetic seeding |
|---|---|---|---|---|---|---|
| Growth | ● | ● | ● | ● | ● | ● |
| Onset-to-death delay | ● | ● | ||||
| Case-fatality ratio | ● | ● | ||||
| Detection window | ● | ● | ● | |||
| Surveillance dispersion | ● | ● | ||||
| Ascertainment | ● | ● | ● | ● | ||
| Traveller volume | ● | ● | ● |
The model components, in the order they appear below:
Building-block submodels — one per parameter family (growth, onset-to-death delay, CFR, detection window, daily traveller volume, surveillance dispersion, ascertainment). Each samples its own priors and returns a small NamedTuple of values. These sections introduce only the maths for their own parameters.
Observation submodels — exports, deaths, cases, deaths-among- exports (time-resolved), and the first export-detection timing. Each takes the growth state as input, introduces the forward integral it needs and the likelihood, and ties one data stream to the latent
. Composers — one per analysis: the four single-stream fits, a two-stream reimplementation of the report's methods (exports and deaths), and the full joint fit. Each samples the building blocks and the relevant observation submodels. A composer conditionally includes only the likelihoods for the streams it uses, so a single-stream fit never instantiates the other observation submodels.
Beyond the model itself:
Inference — prior predictive, the four No-U-Turn Sampler (NUTS) fits, posterior summaries, posterior-predictive plots.
Counterfactual, forecast and sense check — a no-onward-transmission lower bound on cumulative deaths, a one-week-ahead forecast, and a
Turing.fix-pinned reproduction of Method 2 main scenario via the exports-and-deaths composer.
Building-block submodels
Each building-block submodel introduces only the mathematical objects and priors for one parameter family; the likelihoods and forward integrals enter later, in the observation submodels that use them.
The implementation approach taken here is based on the hantavirus modelling project (Funk and Abbott, 2026): Mooncake (Hancock et al., 2024) automatic differentiation, the Integrals (SciML contributors, 2024) quadrature helpers, a NaN-safe NegBinomial, FlexiChains, and PairPlots (Thompson, 2024) with AlgebraOfGraphics (Danisch and Krumbiegel, 2021) for the figures.
Growth — exponential
The outbreak is seeded
so that the cumulative case count at the cut-off is
Rather than sampling
This gives 95% prior support of roughly
Submodel: exponential_growth_model
@model function exponential_growth_model(;
tau_prior = LogNormal(log(14), 0.4),
m_prior = truncated(Normal(7.0, 2.5);
lower = 0, upper = 13.0))
τ ~ tau_prior
m ~ m_prior
r := log(2) / τ
T := m * τ
C_T := 2.0 ^ m
cumulative = s -> exp(r * s)
return (; τ, r, m, T, C_T, cumulative)
endexponential_growth_model (generic function with 2 methods)Genetic seeding bound
A BEAST time tree of the first ten sequenced genomes (Amuri-Aziza et al., 2026) places the TMRCA, the age of the oldest internal node of the tree, at a mean of 25 March 2026, with a 95% HPD interval of about
Submodel: genetic_seeding_model
@model function genetic_seeding_model(T, tmrca_days::Real;
tmrca_days_sd::Real)
# The molecular-clock TMRCA is a right-censored, noisy reading of the
# true seeding time T: deeper or wider sampling only pushes it older,
# so we learn the reading is at least `tmrca_days`, i.e. P(read ≥ g).
tmrca_days ~ censored(Normal(T, tmrca_days_sd); upper = tmrca_days)
return (; tmrca_days, tmrca_days_sd)
endgenetic_seeding_model (generic function with 2 methods)Observing tmrca_days) at the upper censoring point of
Onset-to-death delay
Following McCabe et al., we assume the symptom-onset-to-death delay is gamma distributed with shape
The McCabe et al. report uses the point estimate of (Rosello and others, 2015). We instead use the companion Bayesian reanalysis of the same Isiro line list (Funk and Abbott, 2026), which re-estimates the delay with uncertainty. We carry that uncertainty into the fit through truncated Normal priors centred on its estimates:
The delay estimation in that reanalysis follows the recommendations of (Charniga et al., 2024).
Submodel: delay_model
@model function delay_model(;
alpha_prior = truncated(Normal(4.3, 1.22); lower = 0),
theta_prior = truncated(Normal(2.6, 0.82); lower = 0))
α ~ alpha_prior
θ ~ theta_prior
return (; α, θ, dist = Gamma(α, θ))
enddelay_model (generic function with 2 methods)Case-fatality ratio
The US Centers for Disease Control and Prevention (CDC) summary for the two previous BVD outbreaks is
with mean
Submodel: cfr_model
@model function cfr_model(; cfr_prior = Beta(6.6, 13.4))
CFR ~ cfr_prior
return (; CFR)
endcfr_model (generic function with 2 methods)The prior density, with the CDC
Detection window
is the mean time during which a case is still infectious and detectable abroad (incubation + onset-to-detection). The prior is based on the detection windows McCabe et al. sweep in their Method 1 scenarios (10, 15 and 20 days): it is centred on their central 15-day value with an SD wide enough to cover the 10–20 day range.
Submodel: detection_window_model
@model function detection_window_model(;
window_prior = truncated(Normal(15.0, 5.0); lower = 0))
w ~ window_prior
return (; w)
enddetection_window_model (generic function with 2 methods)Daily traveller volume
The number of people crossing from the source area to Uganda each day sets the travel rate in the exports likelihood. We treat it as an estimated quantity rather than a fixed input. McCabe et al. Table 3 records mean weekly passenger counts across seven points of entry; the Ituri-side daily total of
Submodel: traveller volume
@model function traveller_volume_model(;
mean::Real = ITURI_DAILY_TRAVEL,
sd::Real = ITURI_DAILY_TRAVEL_SD)
daily_travellers ~ truncated(Normal(mean, sd); lower = 0)
return (; daily_travellers)
endtraveller_volume_model (generic function with 2 methods)Surveillance dispersion
We assume the passive-surveillance counts are reported with negative binomial observation error around their expected value, using the same error model for both streams it applies to — suspected deaths and reported cases in the DRC — with a single shared dispersion
The dispersion captures passive-surveillance noise (under-reporting that varies by district, weekend reporting effects, and batched updates), not transmission heterogeneity. We judge this noise to be substantial, so a priori we expect meaningful overdispersion rather than near-Poisson counts. Following the Stan prior-choice recommendations (Stan Development Team, 2024), the dispersion is sampled on the
giving
Submodel: surveillance_dispersion_model
@model function surveillance_dispersion_model(;
inv_sqrt_k_prior = truncated(Normal(0.6, 0.2); lower = 0))
inv_sqrt_k ~ inv_sqrt_k_prior
k := 1.0 / (inv_sqrt_k^2 + eps(typeof(inv_sqrt_k)))
return (; k, inv_sqrt_k)
endsurveillance_dispersion_model (generic function with 2 methods)Ascertainment — partial pooling between DRC and Uganda
Two surveillance systems detect cases: DRC passive surveillance (the reported suspected-case count) and Uganda's point-of-entry / hospital surveillance (the exported-case count). Each captures only a fraction of the true cases passing through it, and each fraction is informed by essentially a single aggregate data point — the one reported-case total and the one export count — so neither is well identified on its own. We therefore centre the prior on an assumed reporting fraction of 25% and partially pool the two fractions so they share strength: treating them as identical would conflate two different systems, while treating them as independent would leave the Uganda fraction almost wholly prior-driven.
Both ascertainment fractions
with
We sample this prior in its non-centred form: draw offsets
Submodel: pooled_ascertainment_model
@model function pooled_ascertainment_model(;
mu_prior = Normal(logit(0.25), 1.0),
tau_prior = truncated(Normal(0, 0.5); lower = 1e-4))
μ_logit ~ mu_prior
τ_logit ~ tau_prior
z_drc ~ Normal(0, 1)
z_uganda ~ Normal(0, 1)
logit_p_drc = μ_logit + τ_logit * z_drc
logit_p_uganda = μ_logit + τ_logit * z_uganda
p_drc := logistic(logit_p_drc)
p_uganda := logistic(logit_p_uganda)
return (; μ_logit, τ_logit, p_drc, p_uganda)
endpooled_ascertainment_model (generic function with 2 methods)We model both the deaths and the reported cases with a negative binomial, so we define one small constructor for it and share it. It is parameterised by mean
Function: safe_nbinomial
function safe_nbinomial(k, μ)
p_raw = k / (k + max(μ, eps(typeof(μ))))
p = isfinite(p_raw) ?
clamp(p_raw, eps(typeof(k)), one(k) - eps(typeof(k))) :
eps(typeof(k))
return NegativeBinomial(k, p)
endsafe_nbinomial (generic function with 1 method)Observation submodels
With the building blocks in place, each observation submodel takes the growth state as input, introduces the forward integral it needs, and ties one data stream to the latent
Exports — Method 1 (geographic spread)
Each case in the source population travels to Uganda on any given day with probability
Splitting at
which integration by parts collapses to the at-risk person-time form using the cumulative-incidence trajectory
For exponential growth this evaluates to
Comparison with McCabe et al. / Imai 2020
McCabe et al. use the small-
Submodel: exports_model
@model function exports_model(
exported_cases::Union{Missing, Integer},
growth_state, p_uganda::Real;
source_population::Real = ITURI_POPULATION,
window = detection_window_model(),
traveller = traveller_volume_model())
cumulative = growth_state.cumulative
T = growth_state.T
window_state ~ to_submodel(window, false)
w = window_state.w
travel_state ~ to_submodel(traveller, false)
daily_travellers = travel_state.daily_travellers
q = daily_travellers / source_population
expected_exports_T := expected_exports(cumulative, p_uganda, q, T, w)
exported_cases ~ Poisson(expected_exports_T)
return (; w, daily_travellers, p_uganda,
expected_exports = expected_exports_T)
endexports_model (generic function with 2 methods)Deaths — Method 2 (back-calculation from deaths)
Expected cumulative deaths by
For a gamma delay this integral has an exact closed form carrying a
Submodel: deaths_model
@model function deaths_model(
total_deaths::Union{Missing, Integer},
growth_state, k::Real;
delay = delay_model(),
cfr = cfr_model())
C_T = growth_state.C_T
r = growth_state.r
T = growth_state.T
delay_state ~ to_submodel(delay, false)
cfr_state ~ to_submodel(cfr, false)
CFR = cfr_state.CFR
# NaN-safe clamp: extreme NUTS proposals during warmup can push
# the expected count to NaN / Inf.
raw_deaths = expected_deaths(CFR, r, T, delay_state.dist)
expected_deaths_T := isfinite(raw_deaths) ?
max(raw_deaths, eps(typeof(raw_deaths))) :
eps(typeof(raw_deaths))
total_deaths ~ safe_nbinomial(k, expected_deaths_T)
return (; CFR, delay_dist = delay_state.dist, expected_deaths_T)
enddeaths_model (generic function with 2 methods)Cases — ascertainment extension
Methods 1 and 2 use exports and deaths only. Reported suspected cases from the same passive-surveillance system carry information about
The dispersion
Submodel: cases_model
@model function cases_model(
reported_cases::Union{Missing, Integer},
growth_state, k::Real, p_drc::Real)
C_T = growth_state.C_T
raw_reports = p_drc * C_T
expected_reports := isfinite(raw_reports) ?
max(raw_reports, eps(typeof(raw_reports))) :
eps(typeof(raw_reports))
reported_cases ~ safe_nbinomial(k, expected_reports)
return (; p_drc, expected_reports)
endcases_model (generic function with 2 methods)Deaths among exports
Uganda reports a single death among its detected exports. That count is informative. If the exports happened long ago, we would expect more deaths among them by now under the onset-to-death gamma, so the observed deaths-among-exports bound how recently the exports occurred and help constrain
Equation (19) is evaluated numerically, writing
Uganda's export deaths are point-of-entry / hospital-detected, so their dates are recorded directly and carry information beyond the total count: a death seen early bounds how old the outbreak can be. We model the detected export deaths as an inhomogeneous Poisson process with cumulative intensity
The continuous term collapses the long pre-death zero stretch into a single weight, so there is no per-day vector of zeros, while the recent window is resolved per day. The number of daily bins is fixed by the earliest death's date, so the likelihood is well defined even though
Selection-bias caveat
This assumes Uganda's surveillance retains detected exports through to any subsequent death. If the system instead loses cases that progress to death, the observed deaths-among-exports count is selection-biased downward and the constraint it places on
Submodel: exports_deaths_model
@model function exports_deaths_model(
export_deaths_daily::AbstractVector,
growth_state, CFR::Real, delay_dist, p_uganda::Real;
pre_start_deaths::Union{Missing, Integer} = 0,
window::Real,
daily_travellers::Real,
source_population::Real = ITURI_POPULATION)
cumulative = growth_state.cumulative
T = growth_state.T
q = daily_travellers / source_population
n = length(export_deaths_daily) # days from earliest death to cut-off
# Precompute the onset-to-death CDF once and reuse it across every
# bin edge below (`T - s ≤ window` over the domain; see
# `ExportDeathDelay`).
delay = ExportDeathDelay(delay_dist, window)
Λ(t) = expected_exports_deaths(
cumulative, delay, CFR, p_uganda, q, t, window)
# Pre-death zero stretch as one Poisson observed at 0; `missing`
# generates it for predictive checks (see equation (20)).
pre = T - n > zero(T) ? Λ(T - n) : zero(T)
pre_start_deaths ~ Poisson(max(pre, zero(pre)))
# Carry the upper edge forward so each Λ is evaluated once.
λlo = pre
for i in 1:n
λhi = Λ(T - n + i)
μ_day = max(λhi - λlo, eps(typeof(λhi)))
export_deaths_daily[i] ~ Poisson(μ_day)
λlo = λhi
end
return (;)
endexports_deaths_model (generic function with 2 methods)First export detection — timing survival term
The same logic applies to the first detected export case (Uganda's first hospital admission), using the at-risk export person-time intensity
As with the export-death term, this is one-sided and only marginally constrains the posterior because the Uganda detections sit only days before the cut-off. Passing delta = missing makes the submodel a no-op.
Submodel: exports_detection_timing_model
@model function exports_detection_timing_model(
growth_state, p_uganda::Real;
delta::Union{Missing, Real},
pre_detection_exports::Union{Missing, Integer} = 0,
window::Real,
daily_travellers::Real,
source_population::Real = ITURI_POPULATION)
if !ismissing(delta)
cumulative = growth_state.cumulative
T = growth_state.T
t1 = T - delta
q = daily_travellers / source_population
survived_exports := t1 <= zero(T) ? zero(T) :
expected_exports(cumulative, p_uganda, q, t1, window)
# No detection before t1 as a Poisson observed at 0; `missing`
# generates it for predictive checks (see equation (22)).
pre_detection_exports ~ Poisson(max(survived_exports, zero(T)))
end
return (;)
endexports_detection_timing_model (generic function with 2 methods)Composers
These composers combine the building blocks into the full model for each analysis. McCabe et al. invert a deterministic summary formula at fixed nuisance parameters; here we sample all of them — growth, delay, CFR, detection window, traveller volume, dispersion, ascertainment — and condition on the observed counts. Each composer conditionally includes only the likelihoods for the streams it carries, so a single-stream composer never instantiates the other observation submodels (and so a discrete stream is never left sampled). Of the observation streams, the geographic-spread exports reproduce McCabe et al.'s Method 1 and the back-calculation from deaths their Method 2; the reported-cases ascertainment, the deaths-among-exports, the export-detection-timing, and the genetic seeding terms are extensions with no counterpart in McCabe et al.
The joint composer samples a single dispersion scale
We write single-stream composers for the four count-based streams only. The export-detection-timing and genetic seeding terms constrain the outbreak start
Exports-only fit — Method 1 analogue
Composer: exports-only fit
@model function exports_only_model(
exported_cases::Union{Missing, Integer};
growth = exponential_growth_model(),
exports = exports_model,
ascertainment = pooled_ascertainment_model())
growth_state ~ to_submodel(growth, false)
asc_state ~ to_submodel(ascertainment, false)
exports_state ~ to_submodel(
exports(exported_cases, growth_state, asc_state.p_uganda), false)
cumulative_cases := growth_state.C_T
endexports_only_model (generic function with 2 methods)Deaths-only fit — Method 2 analogue
Composer: deaths-only fit
@model function deaths_only_model(
total_deaths::Union{Missing, Integer};
growth = exponential_growth_model(),
deaths = deaths_model,
dispersion = surveillance_dispersion_model())
growth_state ~ to_submodel(growth, false)
dispersion_state ~ to_submodel(dispersion, false)
k = dispersion_state.k
deaths_state ~ to_submodel(
deaths(total_deaths, growth_state, k), false)
cumulative_cases := growth_state.C_T
enddeaths_only_model (generic function with 2 methods)Cases-only fit — ascertainment extension
Composer: cases-only fit
@model function cases_only_model(
reported_cases::Union{Missing, Integer};
growth = exponential_growth_model(),
cases = cases_model,
dispersion = surveillance_dispersion_model(),
ascertainment = pooled_ascertainment_model())
growth_state ~ to_submodel(growth, false)
dispersion_state ~ to_submodel(dispersion, false)
asc_state ~ to_submodel(ascertainment, false)
k = dispersion_state.k
cases_state ~ to_submodel(
cases(reported_cases, growth_state, k, asc_state.p_drc), false)
cumulative_cases := growth_state.C_T
endcases_only_model (generic function with 2 methods)Deaths-among-exports-only fit
Composer: exports-deaths-only fit
@model function exports_deaths_only_model(
export_deaths_daily::AbstractVector;
growth = exponential_growth_model(),
delay = delay_model(),
cfr = cfr_model(),
window = detection_window_model(),
traveller = traveller_volume_model(),
exports_deaths_model = exports_deaths_model,
ascertainment = pooled_ascertainment_model(),
source_population::Real = ITURI_POPULATION,
pre_start_deaths::Union{Missing, Integer} = 0)
growth_state ~ to_submodel(growth, false)
delay_state ~ to_submodel(delay, false)
cfr_state ~ to_submodel(cfr, false)
window_state ~ to_submodel(window, false)
asc_state ~ to_submodel(ascertainment, false)
travel_state ~ to_submodel(traveller, false)
daily_travellers = travel_state.daily_travellers
exports_deaths_state ~ to_submodel(
exports_deaths_model(export_deaths_daily, growth_state,
cfr_state.CFR, delay_state.dist, asc_state.p_uganda;
pre_start_deaths = pre_start_deaths,
window = window_state.w,
daily_travellers = daily_travellers,
source_population = source_population),
false)
cumulative_cases := growth_state.C_T
endexports_deaths_only_model (generic function with 2 methods)Joint fit — full posterior over from all data streams
The joint composer is the same generative process conditioned on all four observed streams simultaneously. Each stream argument may be passed as missing to drop it; the matching likelihood is then not instantiated, so the composer doubles as a generator (all streams missing) for the prior- and posterior-predictive checks.
Composer: joint fit
@model function bvd_joint(
exported_cases::Union{Missing, Integer},
total_deaths::Union{Missing, Integer},
reported_cases::Union{Missing, Integer} = missing,
export_deaths_daily::AbstractVector = Int[];
growth = exponential_growth_model(),
exports = exports_model,
deaths = deaths_model,
cases = cases_model,
exports_deaths_model = exports_deaths_model,
exports_detection_timing = exports_detection_timing_model,
dispersion = surveillance_dispersion_model(),
ascertainment = pooled_ascertainment_model(),
genetic = nothing,
source_population::Real = ITURI_POPULATION,
pre_start_deaths::Union{Missing, Integer} = 0,
pre_detection_exports::Union{Missing, Integer} = 0,
first_export_detection_delta::Union{Missing, Real} = missing)
growth_state ~ to_submodel(growth, false)
if genetic !== nothing
genetic_state ~ to_submodel(genetic(growth_state.T), false)
end
dispersion_state ~ to_submodel(dispersion, false)
asc_state ~ to_submodel(ascertainment, false)
k = dispersion_state.k
p_drc = asc_state.p_drc
p_uganda = asc_state.p_uganda
exports_state ~ to_submodel(
exports(exported_cases, growth_state, p_uganda), false)
deaths_state ~ to_submodel(
deaths(total_deaths, growth_state, k), false)
cases_state ~ to_submodel(
cases(reported_cases, growth_state, k, p_drc), false)
exports_deaths_state ~ to_submodel(
exports_deaths_model(export_deaths_daily, growth_state,
deaths_state.CFR, deaths_state.delay_dist, p_uganda;
pre_start_deaths = pre_start_deaths,
window = exports_state.w,
daily_travellers = exports_state.daily_travellers,
source_population = source_population),
false)
detection_timing_state ~ to_submodel(
exports_detection_timing(growth_state, p_uganda;
delta = first_export_detection_delta,
pre_detection_exports = pre_detection_exports,
window = exports_state.w,
daily_travellers = exports_state.daily_travellers,
source_population = source_population),
false)
cumulative_cases := growth_state.C_T
endbvd_joint (generic function with 6 methods)McCabe et al. reimplementation — exports and deaths only
McCabe et al.'s joint configuration uses exactly two data sources: the geographic-spread exports (Method 1) and the back-calculation from deaths (Method 2). It has no reported-cases ascertainment model and no deaths-among-exports likelihood. This composer wraps just those two observation submodels so the sense-check can fix the model down to exactly the McCabe et al. joint configuration. Either stream may be missing: passing missing for exports recovers a pure Method 2 (deaths-only) fit without instantiating the exports likelihood.
Composer: report reimplementation
@model function imperial_only_model(
exported_cases::Union{Missing, Integer},
total_deaths::Union{Missing, Integer};
growth = exponential_growth_model(),
exports = exports_model,
deaths = deaths_model,
dispersion = surveillance_dispersion_model(),
ascertainment = pooled_ascertainment_model())
growth_state ~ to_submodel(growth, false)
dispersion_state ~ to_submodel(dispersion, false)
asc_state ~ to_submodel(ascertainment, false)
k = dispersion_state.k
p_uganda = asc_state.p_uganda
if !ismissing(exported_cases)
exports_state ~ to_submodel(
exports(exported_cases, growth_state, p_uganda), false)
end
deaths_state ~ to_submodel(
deaths(total_deaths, growth_state, k), false)
cumulative_cases := growth_state.C_T
endimperial_only_model (generic function with 2 methods)Model fitting and evaluation
Prior predictive check
Before any observation is taken into account, what does the joint prior imply about replicated exports, deaths, reported cases and deaths among exports? Draws from the prior over the unobserved data should bracket the observed counts.
Sample the joint prior
prior_chn = sample(bvd_joint(missing, missing, missing, Int[]),
Prior(), 2_000; progress = false);
prior_C_table = summary_table(prior_chn, [:cumulative_cases]; digits = 0);| Row | Quantity | Lower 90% | Lower 60% | Lower 30% | Upper 30% | Upper 60% | Upper 90% |
|---|---|---|---|---|---|---|---|
| String | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | |
| 1 | cumulative_cases | 8.0 | 32.0 | 67.0 | 253.0 | 509.0 | 1712.0 |
Pair plot of the prior over the latent quantities — useful for spotting prior correlations before any data has been seen.
Prior pair plot
prior_pair_fig = plot_pair(prior_chn,
[:τ, :m, :cumulative_cases, :CFR, :w, :inv_sqrt_k, :k,
:p_drc, :p_uganda, :τ_logit]);Fitting the models
NUTS (Hoffman and Gelman, 2014) with Mooncake (Hancock et al., 2024) reverse-mode automatic differentiation, four chains, 1000 post-warmup draws each, with a target acceptance probability of 0.95. Chains initialise from the prior to keep the sampler away from the boundary of
Run the joint and per-stream NUTS fits
genetic_seeding = T -> genetic_seeding_model(T, obs.genetic_tmrca_days;
tmrca_days_sd = obs.genetic_tmrca_days_sd)
chn_joint = nuts_sample(
bvd_joint(obs.exported_cases, obs.total_deaths,
obs.reported_cases, obs.export_deaths_daily;
first_export_detection_delta = obs.first_export_detection_delta,
genetic = genetic_seeding));
chn_exports = nuts_sample(exports_only_model(obs.exported_cases));
chn_deaths = nuts_sample(deaths_only_model(obs.total_deaths));
chn_cases = nuts_sample(cases_only_model(obs.reported_cases));
chn_exports_deaths = nuts_sample(
exports_deaths_only_model(obs.export_deaths_daily));
posterior_C_joint = vec(Array(chn_joint[:cumulative_cases]));
posterior_C_exports = vec(Array(chn_exports[:cumulative_cases]));
posterior_C_deaths = vec(Array(chn_deaths[:cumulative_cases]));
posterior_C_cases = vec(Array(chn_cases[:cumulative_cases]));
posterior_C_exports_deaths =
vec(Array(chn_exports_deaths[:cumulative_cases]));┌ Warning: Only a single thread available: MCMC chains are not sampled in parallel
└ @ AbstractMCMC ~/.julia/packages/AbstractMCMC/C1aKp/src/sample.jl:544
┌ Info: Found initial step size
└ ϵ = 0.2
┌ Info: Found initial step size
└ ϵ = 0.05
┌ Info: Found initial step size
└ ϵ = 0.05
┌ Info: Found initial step size
└ ϵ = 0.2
┌ Warning: There were 2 divergent transitions. Consider reparameterising your model or using a smaller step size. For adaptive samplers such as NUTS and HMCDA, consider increasing `target_accept`.
└ @ Turing.Inference ~/.julia/packages/Turing/4hMHm/src/mcmc/hmc.jl:483
┌ Warning: There were 2 divergent transitions. Consider reparameterising your model or using a smaller step size. For adaptive samplers such as NUTS and HMCDA, consider increasing `target_accept`.
└ @ Turing.Inference ~/.julia/packages/Turing/4hMHm/src/mcmc/hmc.jl:483
┌ Warning: Only a single thread available: MCMC chains are not sampled in parallel
└ @ AbstractMCMC ~/.julia/packages/AbstractMCMC/C1aKp/src/sample.jl:544
┌ Info: Found initial step size
└ ϵ = 0.2
┌ Info: Found initial step size
└ ϵ = 0.4
┌ Info: Found initial step size
└ ϵ = 0.20625000000000002
┌ Info: Found initial step size
└ ϵ = 0.2
┌ Warning: Only a single thread available: MCMC chains are not sampled in parallel
└ @ AbstractMCMC ~/.julia/packages/AbstractMCMC/C1aKp/src/sample.jl:544
┌ Info: Found initial step size
└ ϵ = 0.8
┌ Warning: There were 1 divergent transitions. Consider reparameterising your model or using a smaller step size. For adaptive samplers such as NUTS and HMCDA, consider increasing `target_accept`.
└ @ Turing.Inference ~/.julia/packages/Turing/4hMHm/src/mcmc/hmc.jl:483
┌ Info: Found initial step size
└ ϵ = 0.2
┌ Warning: There were 2 divergent transitions. Consider reparameterising your model or using a smaller step size. For adaptive samplers such as NUTS and HMCDA, consider increasing `target_accept`.
└ @ Turing.Inference ~/.julia/packages/Turing/4hMHm/src/mcmc/hmc.jl:483
┌ Info: Found initial step size
└ ϵ = 0.2
┌ Warning: There were 2 divergent transitions. Consider reparameterising your model or using a smaller step size. For adaptive samplers such as NUTS and HMCDA, consider increasing `target_accept`.
└ @ Turing.Inference ~/.julia/packages/Turing/4hMHm/src/mcmc/hmc.jl:483
┌ Info: Found initial step size
└ ϵ = 0.8
┌ Warning: There were 5 divergent transitions. Consider reparameterising your model or using a smaller step size. For adaptive samplers such as NUTS and HMCDA, consider increasing `target_accept`.
└ @ Turing.Inference ~/.julia/packages/Turing/4hMHm/src/mcmc/hmc.jl:483
┌ Warning: Only a single thread available: MCMC chains are not sampled in parallel
└ @ AbstractMCMC ~/.julia/packages/AbstractMCMC/C1aKp/src/sample.jl:544
┌ Info: Found initial step size
└ ϵ = 0.4
┌ Warning: There were 1 divergent transitions. Consider reparameterising your model or using a smaller step size. For adaptive samplers such as NUTS and HMCDA, consider increasing `target_accept`.
└ @ Turing.Inference ~/.julia/packages/Turing/4hMHm/src/mcmc/hmc.jl:483
┌ Info: Found initial step size
└ ϵ = 0.05
┌ Info: Found initial step size
└ ϵ = 0.05
┌ Warning: There were 6 divergent transitions. Consider reparameterising your model or using a smaller step size. For adaptive samplers such as NUTS and HMCDA, consider increasing `target_accept`.
└ @ Turing.Inference ~/.julia/packages/Turing/4hMHm/src/mcmc/hmc.jl:483
┌ Info: Found initial step size
└ ϵ = 0.4
┌ Warning: There were 7 divergent transitions. Consider reparameterising your model or using a smaller step size. For adaptive samplers such as NUTS and HMCDA, consider increasing `target_accept`.
└ @ Turing.Inference ~/.julia/packages/Turing/4hMHm/src/mcmc/hmc.jl:483
┌ Warning: Only a single thread available: MCMC chains are not sampled in parallel
└ @ AbstractMCMC ~/.julia/packages/AbstractMCMC/C1aKp/src/sample.jl:544
┌ Info: Found initial step size
└ ϵ = 0.4
┌ Info: Found initial step size
└ ϵ = 0.4
┌ Info: Found initial step size
└ ϵ = 0.2
┌ Info: Found initial step size
└ ϵ = 0.2Fit diagnostics
Fit-quality diagnostics for the joint and per-stream fits: the worst R-hat, the smallest bulk effective sample size, and the number of divergent transitions.
Fit diagnostics
| Row | fit | max_rhat | min_ess_bulk | divergences |
|---|---|---|---|---|
| String | Float64 | Float64 | Int64 | |
| 1 | joint | 1.002 | 2367.0 | 2 |
| 2 | exports (cases) | 1.001 | 2719.0 | 0 |
| 3 | exports (deaths) | 1.004 | 2652.0 | 0 |
| 4 | deaths (DRC) | 1.001 | 1940.0 | 5 |
| 5 | cases (DRC) | 1.002 | 1571.0 | 7 |
Additional analyses
On top of the main analysis we run four supporting analyses.
Counterfactual: no-onward-transmission lower bound
Suppose every onward transmission stopped at the report date — the estimation / report-generation time, which we denote
and a lower bound on the cumulative-death endpoint of
One-week-ahead forecast
If the fitted model is taken at face value, what should the next week's situation reports show? We continue the fitted exponential growth seven days past the report date
This assumes growth continues unchanged for the week — no interventions and no saturation — so it is a "no-change" projection.
Delay sensitivity
The deaths back-calculation (equation (16)) depends on the onset-to- death delay. The baseline fit anchors the gamma shape
We refit the joint model once with the onset-to-death delay priors re-anchored on the community-only pathway, building truncated-Normal priors from those credible intervals exactly as the baseline delay priors (equation (5)) are constructed:
The comparison shows how sensitive the outbreak-size estimate is to the delay assumption.
Sensitivity only, not a preferred estimate
The community-only delay is fitted from
Report reproduction and validation
How does our joint posterior sit against what McCabe et al. reported, and how much of any difference is the method rather than the newer data? The report itself was revised: the 18 May version (McCabe and others, May 2026) used data/report-snapshot.toml for 18 May, report-snapshot-20may.toml for 20 May), so the only thing that changes between a version's fit and our headline fit is the data.
McCabe et al. Method 2 reports Poisson intervals (no overdispersion,
As a sense check we ask whether our machinery recovers McCabe et al.'s Method 2 headline when given their inputs. The 18 May reported Method 2 central estimate is Turing.fix-pins the Method 2 main-scenario values (
This sense check covers the deaths (Method 2) side. The exports (Method 1) side differs by construction: we use the exact cumulative integral
Results
Summary
For the response the question that matters is how many people have already been infected: the reported counts capture only part of the outbreak, and planning for beds, contacts and vaccine needs depends on the true total. The numbers below are our current best estimate of that total, computed from the joint posterior and refreshed on every build. For each headline number we give the equal-tailed 30%, 60% and 90% credible intervals; the same intervals appear in the tables below.
Compute the headline ranges
summary_ranges = let
med(x) = quantile(x, 0.5)
iqr(x) = quantile(x, 0.75) - quantile(x, 0.25)
# Posterior-minus-prior shift in units of the parameter's prior
# IQR, reusing the prior draws so nothing is respecified here.
shift(post, prior) =
round((med(post) - med(prior)) / iqr(prior); digits = 2)
ints_i(s) = string(
"30% ", round(Int, s.lo30), "–", round(Int, s.hi30),
", 60% ", round(Int, s.lo60), "–", round(Int, s.hi60),
", 90% ", round(Int, s.lo90), "–", round(Int, s.hi90))
ints_f(s, d) = string(
"30% ", round(s.lo30; digits = d), "–", round(s.hi30; digits = d),
", 60% ", round(s.lo60; digits = d), "–", round(s.hi60; digits = d),
", 90% ", round(s.lo90; digits = d), "–", round(s.hi90; digits = d))
C = posterior_C_joint
Td = vec(Array(chn_joint[:T]))
τd = vec(Array(chn_joint[:τ]))
rd = vec(Array(chn_joint[:r]))
sC = posterior_summary(C)
sT = posterior_summary(Td)
sτ = posterior_summary(τd)
sr = posterior_summary(rd)
start_earliest = Date(obs.as_of_date) - Day(round(Int, sT.hi90))
start_latest = Date(obs.as_of_date) - Day(round(Int, sT.lo90))
f_lo = round(sC.lo90 / obs.reported_cases; digits = 1)
f_hi = round(sC.hi90 / obs.reported_cases; digits = 1)
moves = ["cumulative case load" => shift(C, vec(Array(
prior_chn[:cumulative_cases]))),
"time since seeding" => shift(Td, vec(Array(prior_chn[:T]))),
"doubling time" => shift(τd, vec(Array(prior_chn[:τ])))]
biggest = argmax(p -> abs(p.second), moves)
Markdown.parse("""
- **Current cumulative case load:** we estimate $(ints_i(sC)) cases,
combining all four data streams (reported and as-yet-unreported).
- That is roughly $(f_lo)–$(f_hi)× the $(obs.reported_cases) cases
reported to date, so most infections are not yet reported. This
multiplier is one over the DRC reporting fraction; see
[what the reporting fraction means](#Joint-model-estimates).
- **Time since seeding:** we estimate $(ints_i(sT)) days, placing the
start of sustained transmission between $(start_earliest) and
$(start_latest).
- **Doubling time and growth rate:** we estimate a doubling time of
$(ints_f(sτ, 1)) days, and an implied growth rate of
$(ints_f(sr, 3)) per day.
- **Shift from priors:** how far the data has moved each estimate
from its prior, measured in prior interquartile ranges (IQRs) — a
value of 1 means the posterior median sits one prior IQR from the
prior median, 0 means unchanged, and the sign gives the direction.
The fit moves the cumulative case load by $(moves[1].second), the
time since seeding by $(moves[2].second) and the doubling time by
$(moves[3].second); the largest move is in the $(biggest.first).
""")
end;Current cumulative case load: we estimate 30% 765–1095, 60% 628–1378, 90% 438–2234 cases, combining all four data streams (reported and as-yet-unreported).
That is roughly 0.8–4.3× the 516 cases reported to date, so most infections are not yet reported. This multiplier is one over the DRC reporting fraction; see what the reporting fraction means.
Time since seeding: we estimate 30% 101–134, 60% 86–161, 90% 67–216 days, placing the start of sustained transmission between 2025-10-14 and 2026-03-12.
Doubling time and growth rate: we estimate a doubling time of 30% 10.2–13.8, 60% 8.7–16.6, 90% 6.7–22.1 days, and an implied growth rate of 30% 0.05–0.068, 60% 0.042–0.08, 90% 0.031–0.104 per day.
Shift from priors: how far the data has moved each estimate from its prior, measured in prior interquartile ranges (IQRs) — a value of 1 means the posterior median sits one prior IQR from the prior median, 0 means unchanged, and the sign gives the direction. The fit moves the cumulative case load by 2.18, the time since seeding by 0.3 and the doubling time by -0.29; the largest move is in the cumulative case load.
Why our estimate is higher than McCabe et al. Our central estimate sits above the McCabe et al. (McCabe and others, May 2026) report mainly because we fit more data streams and average over the uncertainty in the parameters the report fixes one scenario at a time, and because we use more recent counts. See what we do differently, the comparison with McCabe et al. and the limitations for the detail behind this.
Joint model estimates
Our main result is an estimate of the current cumulative case load — both reported and unreported cases — at the report date. It is the joint posterior over the cumulative case count, obtained by fitting all four data streams together: the cases exported to Uganda, the suspected deaths in the DRC, the reported cases in the DRC (with an ascertainment component) and the deaths among exported cases in Uganda.
We report the cumulative case count first as a credible-interval table and then as a posterior density.
Cumulative case count summary table
cumulative_cases_summary = summary_table(
chn_joint, [:cumulative_cases]; digits = 0);| Row | Quantity | Lower 90% | Lower 60% | Lower 30% | Upper 30% | Upper 60% | Upper 90% |
|---|---|---|---|---|---|---|---|
| String | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | |
| 1 | cumulative_cases | 438.0 | 628.0 | 765.0 | 1095.0 | 1378.0 | 2234.0 |
Cumulative case count density
joint_density_fig = plot_cumulative_cases(
"joint" => posterior_C_joint; scenarios = []);The cumulative case count
Outbreak start date and (τ, T) posterior
start_date_fig = plot_start_date_pair(chn_joint;
as_of_date = obs.as_of_date);The full posterior summary table reports equal-tailed 30%, 60% and 90% credible intervals on the key joint-fit parameters: growth rate
Joint posterior summary table
joint_summary = summary_table(chn_joint,
[:r, :m, :T, :CFR, :p_drc, :p_uganda, :τ_logit,
:inv_sqrt_k, :k, :cumulative_cases]; digits = 2);| Row | Quantity | Lower 90% | Lower 60% | Lower 30% | Upper 30% | Upper 60% | Upper 90% |
|---|---|---|---|---|---|---|---|
| String | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | |
| 1 | r | 0.03 | 0.04 | 0.05 | 0.07 | 0.08 | 0.1 |
| 2 | m | 8.77 | 9.29 | 9.58 | 10.1 | 10.43 | 11.13 |
| 3 | T | 67.28 | 86.18 | 100.64 | 134.44 | 161.45 | 216.06 |
| 4 | CFR | 0.18 | 0.25 | 0.29 | 0.37 | 0.42 | 0.5 |
| 5 | p_drc | 0.18 | 0.3 | 0.38 | 0.54 | 0.63 | 0.78 |
| 6 | p_uganda | 0.16 | 0.27 | 0.35 | 0.5 | 0.6 | 0.76 |
| 7 | τ_logit | 0.03 | 0.12 | 0.21 | 0.44 | 0.6 | 0.94 |
| 8 | inv_sqrt_k | 0.26 | 0.41 | 0.5 | 0.64 | 0.73 | 0.88 |
| 9 | k | 1.28 | 1.88 | 2.43 | 4.06 | 6.06 | 15.06 |
| 10 | cumulative_cases | 437.85 | 627.52 | 765.43 | 1095.49 | 1377.91 | 2233.58 |
The posterior pair plot shows the joint distribution of the key parameters, with the prior overlaid so the data's contribution to each marginal is visible.
Posterior pair plot (prior overlaid)
posterior_pair_fig = plot_pair(chn_joint,
[:τ, :m, :cumulative_cases, :CFR, :w, :inv_sqrt_k, :k,
:p_drc, :p_uganda, :τ_logit];
prior = prior_chn);The DRC reporting fraction
A posterior predictive check draws replicated observations from the fitted joint model and compares them to the observed counts. If the fit is reasonable the observed value (red line) sits inside the bulk of its replicate distribution. The four panels are the four data streams: exported cases and deaths among exports (Uganda), and deaths and reported cases (DRC).
Joint posterior predictive plot
pp_joint = predict(
bvd_joint(missing, missing, missing,
fill(missing, length(obs.export_deaths_daily));
pre_start_deaths = missing,
pre_detection_exports = missing,
first_export_detection_delta = obs.first_export_detection_delta,
genetic = genetic_seeding),
chn_joint);
pp_exports = vec(Array(pp_joint[:exported_cases]));
pp_deaths = vec(Array(pp_joint[:total_deaths]));
pp_cases = vec(Array(pp_joint[:reported_cases]));
# Export deaths are a per-day series held as one vector-valued variable
# in the FlexiChains predictive; sum each draw's daily counts for the
# total-count posterior predictive.
pp_exports_deaths = vec(sum.(pp_joint[@varname(export_deaths_daily)]));
joint_ppc_fig = plot_posterior_predictive(
pp_exports, pp_deaths,
obs.exported_cases, obs.total_deaths;
pp_cases = pp_cases,
obs_cases = obs.reported_cases,
pp_exports_deaths = pp_exports_deaths,
obs_exports_deaths = obs.exports_deaths);Counterfactual: lower bound under no further transmission
The lower bound on cumulative deaths if transmission stopped at the report date: still-expected and projected-total deaths per draw.
Project no-onward deaths and summarise
no_onward = predict_no_onward_deaths(chn_joint; obs_deaths = obs.total_deaths);
no_onward_table = streams_table(
"no-onward total" => no_onward.total_projected;
digits = 0);| Row | Stream | Lower 90% | Lower 60% | Lower 30% | Upper 30% | Upper 60% | Upper 90% |
|---|---|---|---|---|---|---|---|
| String | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | |
| 1 | no-onward total | 159.0 | 188.0 | 213.0 | 278.0 | 341.0 | 510.0 |
Two panels: the still expected deaths
No-onward projected-deaths plot
no_onward_fig = plot_no_onward_deaths(no_onward; obs_deaths = obs.total_deaths);One-week-ahead forecast
The seven-day no-change projection: cumulative and new expected counts per stream by
Generate the one-week-ahead forecast
forecast = forecast_reported(chn_joint;
horizon = 7,
daily_travellers = ITURI_DAILY_TRAVEL,
source_population = ITURI_POPULATION,
obs_cases = obs.reported_cases,
obs_deaths = obs.total_deaths,
obs_exports = EXPORTED_CASES);
forecast_summary = forecast_table(forecast);| Row | Stream | Quantity | Lower 90% | Lower 60% | Lower 30% | Upper 30% | Upper 60% | Upper 90% |
|---|---|---|---|---|---|---|---|---|
| String | String | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | |
| 1 | DRC reported cases | cumulative by T+7 | 120.0 | 289.0 | 417.0 | 711.0 | 949.0 | 1657.0 |
| 2 | DRC reported cases | new this week | 0.0 | 0.0 | 0.0 | 195.0 | 433.0 | 1141.0 |
| 3 | DRC deaths | cumulative by T+7 | 52.0 | 123.0 | 176.0 | 299.0 | 421.0 | 801.0 |
| 4 | DRC deaths | new this week | 0.0 | 0.0 | 45.0 | 168.0 | 290.0 | 670.0 |
| 5 | Uganda exports | cumulative by T+7 | 0.0 | 1.0 | 2.0 | 3.0 | 4.0 | 7.0 |
| 6 | Uganda exports | new this week | 0.0 | 0.0 | 0.0 | 1.0 | 2.0 | 5.0 |
New counts expected over the coming week, by stream:
One-week-ahead forecast plot
forecast_fig = plot_forecast(forecast);Forecast validation against later data
The one-week-ahead forecast above projects from the current fit, so it cannot yet be checked. We can instead validate the same machinery retrospectively: take our joint fit to the original McCabe et al. report's data, project each posterior draw forward to the current data cut-off, and compare the predicted cumulative counts against the counts observed since.
Fit the joint model to the original report's data and forecast it forward
obs_report = load_observations(
joinpath(pkgdir(BVDOutbreakSize), "data", "report-snapshot.toml"));
chn_joint_report = nuts_sample(
bvd_joint(obs_report.exported_cases, obs_report.total_deaths,
obs_report.reported_cases, obs_report.export_deaths_daily;
first_export_detection_delta =
obs_report.first_export_detection_delta));
posterior_C_joint_report =
vec(Array(chn_joint_report[:cumulative_cases]));
validation_horizon =
value(Date(obs.as_of_date) - Date(obs_report.as_of_date))
forecast_validation = forecast_reported(chn_joint_report;
horizon = validation_horizon,
daily_travellers = ITURI_DAILY_TRAVEL,
source_population = ITURI_POPULATION,
obs_cases = obs_report.reported_cases,
obs_deaths = obs_report.total_deaths,
obs_exports = obs_report.exported_cases);
forecast_validation_table = forecast_vs_truth(forecast_validation;
cases = obs.reported_cases,
deaths = obs.total_deaths,
exports = obs.exported_cases);┌ Warning: Only a single thread available: MCMC chains are not sampled in parallel
└ @ AbstractMCMC ~/.julia/packages/AbstractMCMC/C1aKp/src/sample.jl:544
┌ Info: Found initial step size
└ ϵ = 0.2
┌ Info: Found initial step size
└ ϵ = 0.2
┌ Warning: There were 1 divergent transitions. Consider reparameterising your model or using a smaller step size. For adaptive samplers such as NUTS and HMCDA, consider increasing `target_accept`.
└ @ Turing.Inference ~/.julia/packages/Turing/4hMHm/src/mcmc/hmc.jl:483
┌ Info: Found initial step size
└ ϵ = 0.2
┌ Info: Found initial step size
└ ϵ = 0.2
┌ Warning: There were 1 divergent transitions. Consider reparameterising your model or using a smaller step size. For adaptive samplers such as NUTS and HMCDA, consider increasing `target_accept`.
└ @ Turing.Inference ~/.julia/packages/Turing/4hMHm/src/mcmc/hmc.jl:483
┌ Warning: There were 2 divergent transitions. Consider reparameterising your model or using a smaller step size. For adaptive samplers such as NUTS and HMCDA, consider increasing `target_accept`.
└ @ Turing.Inference ~/.julia/packages/Turing/4hMHm/src/mcmc/hmc.jl:483Projecting the original-report fit 2 days forward to the current (2026-05-18) data, against the counts observed by then:
| Row | Stream | Observed | Lower 90% | Lower 60% | Lower 30% | Upper 30% | Upper 60% | Upper 90% | Within 90% PI |
|---|---|---|---|---|---|---|---|---|---|
| String | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | String | |
| 1 | DRC reported cases | 516.0 | 67.0 | 153.0 | 224.0 | 378.0 | 494.0 | 838.0 | yes |
| 2 | DRC deaths | 131.0 | 28.0 | 64.0 | 92.0 | 160.0 | 221.0 | 432.0 | yes |
| 3 | Uganda exports | 2.0 | 0.0 | 0.0 | 1.0 | 2.0 | 3.0 | 4.0 | yes |
The top row shows the cumulative forecast per stream and the bottom row the new counts over the horizon, mirroring the one-week-ahead forecast. Each panel shades the 90% predictive interval and draws the later-observed count as a dashed rule, so coverage can be read off directly.
Forecast-validation plot
forecast_validation_fig = plot_forecast_vs_truth(forecast_validation;
cases = obs.reported_cases,
deaths = obs.total_deaths,
exports = obs.exported_cases,
baseline_cases = obs_report.reported_cases,
baseline_deaths = obs_report.total_deaths,
baseline_exports = obs_report.exported_cases);Delay sensitivity
Refit under the community-only onset-to-death delay: the baseline and re-anchored
Refit the joint model with the community-only delay
community_delay = delay_model(;
alpha_prior = truncated(Normal(5.6, (25.9 - 1.0) / 3.92); lower = 0),
theta_prior = truncated(Normal(1.4, (9.5 - 0.3) / 3.92); lower = 0))
chn_joint_community = nuts_sample(
bvd_joint(obs.exported_cases, obs.total_deaths,
obs.reported_cases, obs.export_deaths_daily;
first_export_detection_delta = obs.first_export_detection_delta,
genetic = genetic_seeding,
deaths = (total_deaths, growth_state, k) ->
deaths_model(total_deaths, growth_state, k;
delay = community_delay)));
posterior_C_community = vec(Array(chn_joint_community[:cumulative_cases]));┌ Warning: Only a single thread available: MCMC chains are not sampled in parallel
└ @ AbstractMCMC ~/.julia/packages/AbstractMCMC/C1aKp/src/sample.jl:544
┌ Info: Found initial step size
└ ϵ = 0.4
┌ Info: Found initial step size
└ ϵ = 0.05
┌ Info: Found initial step size
└ ϵ = 0.05
┌ Warning: There were 1 divergent transitions. Consider reparameterising your model or using a smaller step size. For adaptive samplers such as NUTS and HMCDA, consider increasing `target_accept`.
└ @ Turing.Inference ~/.julia/packages/Turing/4hMHm/src/mcmc/hmc.jl:483
┌ Info: Found initial step size
└ ϵ = 0.2
┌ Warning: There were 1 divergent transitions. Consider reparameterising your model or using a smaller step size. For adaptive samplers such as NUTS and HMCDA, consider increasing `target_accept`.
└ @ Turing.Inference ~/.julia/packages/Turing/4hMHm/src/mcmc/hmc.jl:483Fit diagnostics for the community-only delay refit.
Fit diagnostics
| Row | fit | max_rhat | min_ess_bulk | divergences |
|---|---|---|---|---|
| String | Float64 | Float64 | Int64 | |
| 1 | joint (community-only delay) | 1.001 | 2321.0 | 1 |
Baseline versus community-only delay, side by side:
Delay-sensitivity C_T table
delay_sensitivity_table = streams_table(
"joint (baseline delay)" => posterior_C_joint,
"joint (community-only delay)" => posterior_C_community);| Row | Stream | Lower 90% | Lower 60% | Lower 30% | Upper 30% | Upper 60% | Upper 90% |
|---|---|---|---|---|---|---|---|
| String | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | |
| 1 | joint (baseline delay) | 438.0 | 628.0 | 765.0 | 1095.0 | 1378.0 | 2234.0 |
| 2 | joint (community-only delay) | 412.0 | 575.0 | 704.0 | 1014.0 | 1273.0 | 2027.0 |
Overlaid posterior densities of
Delay-sensitivity C_T density plot
delay_sensitivity_fig = plot_cumulative_cases(
"baseline delay" => posterior_C_joint,
"community-only delay" => posterior_C_community;
scenarios = []);Clock-rate sensitivity
The main analysis fixes the molecular clock to the
Refit the joint model under the 1.9e-3 clock
genetic_seeding_clock19 = T -> genetic_seeding_model(T,
obs.genetic_tmrca_alt_days;
tmrca_days_sd = obs.genetic_tmrca_alt_days_sd)
chn_joint_clock19 = nuts_sample(
bvd_joint(obs.exported_cases, obs.total_deaths,
obs.reported_cases, obs.export_deaths_daily;
first_export_detection_delta = obs.first_export_detection_delta,
genetic = genetic_seeding_clock19));
posterior_C_clock19 = vec(Array(chn_joint_clock19[:cumulative_cases]));┌ Warning: Only a single thread available: MCMC chains are not sampled in parallel
└ @ AbstractMCMC ~/.julia/packages/AbstractMCMC/C1aKp/src/sample.jl:544
┌ Info: Found initial step size
└ ϵ = 0.2
┌ Warning: There were 2 divergent transitions. Consider reparameterising your model or using a smaller step size. For adaptive samplers such as NUTS and HMCDA, consider increasing `target_accept`.
└ @ Turing.Inference ~/.julia/packages/Turing/4hMHm/src/mcmc/hmc.jl:483
┌ Info: Found initial step size
└ ϵ = 0.05
┌ Warning: There were 32 divergent transitions. Consider reparameterising your model or using a smaller step size. For adaptive samplers such as NUTS and HMCDA, consider increasing `target_accept`.
└ @ Turing.Inference ~/.julia/packages/Turing/4hMHm/src/mcmc/hmc.jl:483
┌ Info: Found initial step size
└ ϵ = 0.05
┌ Info: Found initial step size
└ ϵ = 0.2
┌ Warning: There were 3 divergent transitions. Consider reparameterising your model or using a smaller step size. For adaptive samplers such as NUTS and HMCDA, consider increasing `target_accept`.
└ @ Turing.Inference ~/.julia/packages/Turing/4hMHm/src/mcmc/hmc.jl:483
┌ Warning: There were 37 divergent transitions. Consider reparameterising your model or using a smaller step size. For adaptive samplers such as NUTS and HMCDA, consider increasing `target_accept`.
└ @ Turing.Inference ~/.julia/packages/Turing/4hMHm/src/mcmc/hmc.jl:483Fit diagnostics for the 1.9e-3 clock-rate refit.
Fit diagnostics
| Row | fit | max_rhat | min_ess_bulk | divergences |
|---|---|---|---|---|
| String | Float64 | Float64 | Int64 | |
| 1 | joint (1.9e-3 clock) | 1.006 | 552.0 | 37 |
Each quantity is shown as a side-by-side table followed by overlaid posterior densities under the two clock rates.
Outbreak size
Clock-rate C_T table
clock_sensitivity_C_table = streams_table(
"joint (1.2e-3 clock)" => posterior_C_joint,
"joint (1.9e-3 clock)" => posterior_C_clock19);| Row | Stream | Lower 90% | Lower 60% | Lower 30% | Upper 30% | Upper 60% | Upper 90% |
|---|---|---|---|---|---|---|---|
| String | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | |
| 1 | joint (1.2e-3 clock) | 438.0 | 628.0 | 765.0 | 1095.0 | 1378.0 | 2234.0 |
| 2 | joint (1.9e-3 clock) | 442.0 | 623.0 | 756.0 | 1070.0 | 1341.0 | 2069.0 |
Clock-rate C_T density plot
clock_sensitivity_C_fig = plot_cumulative_cases(
"1.2e-3 clock" => posterior_C_joint,
"1.9e-3 clock" => posterior_C_clock19;
scenarios = []);Seeding time
T_clock12 = vec(Array(chn_joint[:T]));
T_clock19 = vec(Array(chn_joint_clock19[:T]));Clock-rate seeding-time table
clock_sensitivity_T_table = streams_table(
"joint (1.2e-3 clock)" => T_clock12,
"joint (1.9e-3 clock)" => T_clock19;
digits = 0);| Row | Stream | Lower 90% | Lower 60% | Lower 30% | Upper 30% | Upper 60% | Upper 90% |
|---|---|---|---|---|---|---|---|
| String | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | |
| 1 | joint (1.2e-3 clock) | 67.0 | 86.0 | 101.0 | 134.0 | 161.0 | 216.0 |
| 2 | joint (1.9e-3 clock) | 61.0 | 82.0 | 97.0 | 130.0 | 156.0 | 216.0 |
Clock-rate seeding-time density plot
clock_sensitivity_T_fig = plot_density_overlay(
"1.2e-3 clock" => T_clock12,
"1.9e-3 clock" => T_clock19;
xlabel = "Seeding time T (days before cut-off)",
title = "Posterior seeding time by clock rate");Growth rate
r_clock12 = vec(Array(chn_joint[:r]));
r_clock19 = vec(Array(chn_joint_clock19[:r]));Clock-rate growth-rate table
clock_sensitivity_r_table = streams_table(
"joint (1.2e-3 clock)" => r_clock12,
"joint (1.9e-3 clock)" => r_clock19;
digits = 3);| Row | Stream | Lower 90% | Lower 60% | Lower 30% | Upper 30% | Upper 60% | Upper 90% |
|---|---|---|---|---|---|---|---|
| String | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | |
| 1 | joint (1.2e-3 clock) | 0.031 | 0.042 | 0.05 | 0.068 | 0.08 | 0.104 |
| 2 | joint (1.9e-3 clock) | 0.031 | 0.043 | 0.052 | 0.07 | 0.084 | 0.113 |
Clock-rate growth-rate density plot
clock_sensitivity_r_fig = plot_density_overlay(
"1.2e-3 clock" => r_clock12,
"1.9e-3 clock" => r_clock19;
xlabel = "Growth rate r (per day)",
title = "Posterior growth rate by clock rate");How the data streams compare
Each data stream constrains the latent outbreak size differently. The table below puts the posteriors over
Per-stream C_T table
streams_C_table = streams_table(
"exports (cases)" => posterior_C_exports,
"exports (deaths)" => posterior_C_exports_deaths,
"deaths (DRC)" => posterior_C_deaths,
"cases (DRC)" => posterior_C_cases,
"joint" => posterior_C_joint);| Row | Stream | Lower 90% | Lower 60% | Lower 30% | Upper 30% | Upper 60% | Upper 90% |
|---|---|---|---|---|---|---|---|
| String | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | |
| 1 | exports (cases) | 145.0 | 335.0 | 516.0 | 1077.0 | 1655.0 | 3314.0 |
| 2 | exports (deaths) | 86.0 | 277.0 | 527.0 | 1482.0 | 2477.0 | 5415.0 |
| 3 | deaths (DRC) | 219.0 | 351.0 | 471.0 | 758.0 | 1049.0 | 1984.0 |
| 4 | cases (DRC) | 435.0 | 713.0 | 985.0 | 1691.0 | 2455.0 | 4421.0 |
| 5 | joint | 438.0 | 628.0 | 765.0 | 1095.0 | 1378.0 | 2234.0 |
The 2×4 grid below has replicates from the per-stream fits on the top row and the joint fit on the bottom row, comparable column-wise so it is easy to see what each per-stream fit constrains and how the joint combination shifts the predictives.
Per-stream vs joint posterior predictive grid
pp_exports_only = vec(Array(predict(
exports_only_model(missing), chn_exports)[:exported_cases]));
pp_deaths_only = vec(Array(predict(
deaths_only_model(missing), chn_deaths )[:total_deaths]));
pp_cases_only = vec(Array(predict(
cases_only_model(missing), chn_cases )[:reported_cases]));
pp_exports_deaths_only = vec(sum.(predict(
exports_deaths_only_model(fill(missing, length(obs.export_deaths_daily));
pre_start_deaths = missing),
chn_exports_deaths)[@varname(export_deaths_daily)]));
ppc_grid_fig = plot_posterior_predictive_grid(;
individual = (; exports = pp_exports_only,
exports_deaths = pp_exports_deaths_only,
deaths = pp_deaths_only,
cases = pp_cases_only),
joint = (; exports = pp_exports,
exports_deaths = pp_exports_deaths,
deaths = pp_deaths,
cases = pp_cases),
observed = (; exports = obs.exported_cases,
exports_deaths = obs.exports_deaths,
deaths = obs.total_deaths,
cases = obs.reported_cases),
);Overlaid posterior densities of
Overlaid C_T density plot
cumulative_density_fig = plot_cumulative_cases(
"exports (cases)" => posterior_C_exports,
"exports (deaths)" => posterior_C_exports_deaths,
"deaths (DRC)" => posterior_C_deaths,
"cases (DRC)" => posterior_C_cases,
"joint" => posterior_C_joint;
scenarios = []);Comparison with McCabe et al.
Our joint fit against the McCabe et al. estimates and our Method 2 reproduction: point estimates with 90% intervals. We step through both report versions in turn — the 18 May version (McCabe and others, May 2026) and the 20 May version (McCabe and others, May 2026) — and end with our joint fit to the current data.
Fit our model to each report version's data, and run the Method 2 reproductions
# `obs_report` and `chn_joint_report` (the 18 May report's data) are
# loaded and fitted earlier, in the forecast-validation section.
obs_report_20may = load_observations(
joinpath(pkgdir(BVDOutbreakSize), "data",
"report-snapshot-20may.toml"));
chn_joint_report_20may = nuts_sample(
bvd_joint(obs_report_20may.exported_cases,
obs_report_20may.total_deaths,
obs_report_20may.reported_cases,
obs_report_20may.export_deaths_daily;
first_export_detection_delta =
obs_report_20may.first_export_detection_delta));
posterior_C_joint_report_20may =
vec(Array(chn_joint_report_20may[:cumulative_cases]));
imperial_fixed = Turing.fix(
imperial_only_model(missing, 88), # exports missing → pure Method 2
(τ = 14.0, CFR = 0.30, α = 4.42, θ = 1/0.388,
inv_sqrt_k = 1e-3),
)
chn_imperial = nuts_sample(imperial_fixed);
posterior_C_imperial = vec(Array(chn_imperial[:cumulative_cases]));
imperial_fixed_20may = Turing.fix(
imperial_only_model(missing, 131), # exports missing → pure Method 2
(τ = 14.0, CFR = 0.33, α = 4.42, θ = 1/0.388,
inv_sqrt_k = 1e-3),
)
chn_imperial_20may = nuts_sample(imperial_fixed_20may);
posterior_C_imperial_20may =
vec(Array(chn_imperial_20may[:cumulative_cases]));┌ Warning: Only a single thread available: MCMC chains are not sampled in parallel
└ @ AbstractMCMC ~/.julia/packages/AbstractMCMC/C1aKp/src/sample.jl:544
┌ Info: Found initial step size
└ ϵ = 0.2
┌ Info: Found initial step size
└ ϵ = 0.05
┌ Info: Found initial step size
└ ϵ = 0.05
┌ Info: Found initial step size
└ ϵ = 0.2
┌ Warning: Only a single thread available: MCMC chains are not sampled in parallel
└ @ AbstractMCMC ~/.julia/packages/AbstractMCMC/C1aKp/src/sample.jl:544
┌ Info: Found initial step size
└ ϵ = 0.2
┌ Info: Found initial step size
└ ϵ = 0.2
┌ Info: Found initial step size
└ ϵ = 0.2
┌ Info: Found initial step size
└ ϵ = 0.2
┌ Warning: Only a single thread available: MCMC chains are not sampled in parallel
└ @ AbstractMCMC ~/.julia/packages/AbstractMCMC/C1aKp/src/sample.jl:544
┌ Info: Found initial step size
└ ϵ = 0.2
┌ Info: Found initial step size
└ ϵ = 0.2
┌ Info: Found initial step size
└ ϵ = 0.2
┌ Info: Found initial step size
└ ϵ = 0.2The plot places each estimate of
Build the comparison
joint_C_credibles = posterior_summary(posterior_C_joint)
joint_report_C_credibles = posterior_summary(posterior_C_joint_report)
joint_report_20may_C_credibles =
posterior_summary(posterior_C_joint_report_20may)
imperial_C_credibles = posterior_summary(posterior_C_imperial)
imperial_20may_C_credibles =
posterior_summary(posterior_C_imperial_20may)
comparison_rows = [
("18 May: McCabe Method 1 (Ituri, w=15 d)", 313, 39, 870),
("18 May: McCabe Method 2 (τ=14 d, CFR 30%)", 501, 402, 612),
("18 May: Our Method 2 reproduction",
round(Int, quantile(posterior_C_imperial, 0.5)),
round(Int, imperial_C_credibles.lo90),
round(Int, imperial_C_credibles.hi90)),
("18 May: Our joint (report data)",
round(Int, quantile(posterior_C_joint_report, 0.5)),
round(Int, joint_report_C_credibles.lo90),
round(Int, joint_report_C_credibles.hi90)),
("20 May: McCabe Method 1 (Ituri, w=15 d)", 313, 39, 870),
("20 May: McCabe Method 2 (τ=14 d, CFR 33%)", 678, 568, 800),
("20 May: Our Method 2 reproduction",
round(Int, quantile(posterior_C_imperial_20may, 0.5)),
round(Int, imperial_20may_C_credibles.lo90),
round(Int, imperial_20may_C_credibles.hi90)),
("20 May: Our joint (report data)",
round(Int, quantile(posterior_C_joint_report_20may, 0.5)),
round(Int, joint_report_20may_C_credibles.lo90),
round(Int, joint_report_20may_C_credibles.hi90)),
("Our joint (current data)",
round(Int, quantile(posterior_C_joint, 0.5)),
round(Int, joint_C_credibles.lo90),
round(Int, joint_C_credibles.hi90)),
]
comparison_fig = plot_estimate_comparison(comparison_rows);Fit diagnostics for the two report-data joint fits and the two Method 2 reproductions.
Fit diagnostics
| Row | fit | max_rhat | min_ess_bulk | divergences |
|---|---|---|---|---|
| String | Float64 | Float64 | Int64 | |
| 1 | joint (18 May report) | 1.004 | 2245.0 | 2 |
| 2 | joint (20 May report) | 1.002 | 2504.0 | 0 |
| 3 | Method 2 reproduction (18 May) | NaN | NaN | 0 |
| 4 | Method 2 reproduction (20 May) | NaN | NaN | 0 |
The same comparison as a table, with a column for the report version:
Comparison table
comparison_version = [
"18 May", "18 May", "18 May", "18 May",
"20 May", "20 May", "20 May", "20 May",
"current",
]
main_comparison = DataFrame(
"Report version" => comparison_version,
"Source" => [r[1] for r in comparison_rows],
"Central estimate" => [r[2] for r in comparison_rows],
"Lower 90%" => [r[3] for r in comparison_rows],
"Upper 90%" => [r[4] for r in comparison_rows],
);| Row | Report version | Source | Central estimate | Lower 90% | Upper 90% |
|---|---|---|---|---|---|
| String | String | Int64 | Int64 | Int64 | |
| 1 | 18 May | 18 May: McCabe Method 1 (Ituri, w=15 d) | 313 | 39 | 870 |
| 2 | 18 May | 18 May: McCabe Method 2 (τ=14 d, CFR 30%) | 501 | 402 | 612 |
| 3 | 18 May | 18 May: Our Method 2 reproduction | 493 | 411 | 584 |
| 4 | 18 May | 18 May: Our joint (report data) | 662 | 314 | 1697 |
| 5 | 20 May | 20 May: McCabe Method 1 (Ituri, w=15 d) | 313 | 39 | 870 |
| 6 | 20 May | 20 May: McCabe Method 2 (τ=14 d, CFR 33%) | 678 | 568 | 800 |
| 7 | 20 May | 20 May: Our Method 2 reproduction | 670 | 580 | 770 |
| 8 | 20 May | 20 May: Our joint (report data) | 904 | 446 | 2142 |
| 9 | current | Our joint (current data) | 921 | 438 | 2234 |
Joint posterior coverage of all 15 published McCabe et al. scenarios — for each scenario, the narrowest joint credible interval that contains it:
Joint coverage table
coverage_table = comparison_table(posterior_C_joint);| Row | Scenario | Reported cases | Narrowest interval |
|---|---|---|---|
| String | Int64 | String | |
| 1 | Method 1 Ituri, w=10 d | 470 | 90% |
| 2 | Method 1 Ituri, w=15 d | 313 | outside 90% |
| 3 | Method 1 Ituri, w=20 d | 235 | outside 90% |
| 4 | Method 1 +N. Kivu, w=10 | 617 | 90% |
| 5 | Method 1 +N. Kivu, w=15 | 412 | outside 90% |
| 6 | Method 1 +N. Kivu, w=20 | 309 | outside 90% |
| 7 | Method 2 τ=14 d, CFR 26% | 860 | 30% |
| 8 | Method 2 τ=14 d, CFR 33% | 678 | 60% |
| 9 | Method 2 τ=14 d, CFR 40% | 559 | 90% |
| 10 | Method 2 τ= 7 d, CFR 26% | 1386 | 90% |
| 11 | Method 2 τ= 7 d, CFR 33% | 1092 | 30% |
| 12 | Method 2 τ= 7 d, CFR 40% | 901 | 30% |
| 13 | Method 2 τ=21 d, CFR 26% | 730 | 60% |
| 14 | Method 2 τ=21 d, CFR 33% | 575 | 90% |
| 15 | Method 2 τ=21 d, CFR 40% | 474 | 90% |
The joint
Joint C_T density with published scenarios
imperial_density_fig = plot_cumulative_cases(
"joint (current data)" => posterior_C_joint,
"joint (18 May report)" => posterior_C_joint_report,
"joint (20 May report)" => posterior_C_joint_report_20may);McCabe et al. report sense check
Whether each reproduction lands on McCabe et al.'s reported Method 2 central estimate: the 18 May reproduction against their reported
Reproductions vs McCabe et al. Method 2
imperial_sense_check = let
rep = round(Int, quantile(posterior_C_imperial, 0.5))
lo = round(Int, imperial_C_credibles.lo90)
hi = round(Int, imperial_C_credibles.hi90)
delta = round(100 * (rep - 501) / 501; digits = 1)
rep2 = round(Int, quantile(posterior_C_imperial_20may, 0.5))
lo2 = round(Int, imperial_20may_C_credibles.lo90)
hi2 = round(Int, imperial_20may_C_credibles.hi90)
delta2 = round(100 * (rep2 - 678) / 678; digits = 1)
Markdown.parse("""
18 May reproduction: **$(rep) cases** (90% CrI $(lo)–$(hi)) against
McCabe et al.'s reported **501** — a difference of $(delta)%.
20 May reproduction: **$(rep2) cases** (90% CrI $(lo2)–$(hi2))
against McCabe et al.'s reported **678** — a difference of
$(delta2)%.
""")
end;18 May reproduction: 493 cases (90% CrI 411–584) against McCabe et al.'s reported 501 — a difference of -1.6%.
20 May reproduction: 670 cases (90% CrI 580–770) against McCabe et al.'s reported 678 — a difference of -1.2%.
Sense-check summary tables
imperial_summary = summary_table(chn_imperial,
[:m, :T, :cumulative_cases]; digits = 1);
imperial_summary_20may = summary_table(chn_imperial_20may,
[:m, :T, :cumulative_cases]; digits = 1);| Row | Quantity | Lower 90% | Lower 60% | Lower 30% | Upper 30% | Upper 60% | Upper 90% |
|---|---|---|---|---|---|---|---|
| String | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | |
| 1 | m | 9.2 | 9.3 | 9.3 | 9.4 | 9.5 | 9.6 |
| 2 | T | 128.5 | 129.9 | 130.7 | 132.1 | 132.9 | 134.2 |
| 3 | cumulative_cases | 579.6 | 622.0 | 647.4 | 692.5 | 720.0 | 769.8 |
Saving results
The tables above are written to an output/ directory at the repo root so they can be archived and shared. On every push to main a GitHub Actions workflow regenerates these files and publishes them as a GitHub Release, downloadable from the repository's releases page (https://github.com/epiforecasts/BVDOutbreakSize/releases). The release bundles the four summary tables, a thinned set of posterior draws, and a copy of the input observations.toml so the exact data that produced each result is recorded alongside it.
Write outputs to output/
# Outputs default to `output/` in the package directory (where the
# docs build and Release workflow expect them). Set `BVD_OUTPUT_DIR`
# to redirect them, e.g. when running from a read-only package
# install.
output_dir = get(ENV, "BVD_OUTPUT_DIR",
joinpath(pkgdir(BVDOutbreakSize), "output"))
mkpath(output_dir)
CSV.write(joinpath(output_dir, "posterior_summary.csv"), joint_summary)
CSV.write(joinpath(output_dir, "cumulative_cases_by_stream.csv"),
streams_C_table)
CSV.write(joinpath(output_dir, "imperial_comparison.csv"), main_comparison)
CSV.write(joinpath(output_dir, "scenario_coverage.csv"), coverage_table)
CSV.write(joinpath(output_dir, "forecast_validation.csv"),
forecast_validation_table)
# Copy the input data so the release records what produced these
# results.
cp(joinpath(pkgdir(BVDOutbreakSize), "data", "observations.toml"),
joinpath(output_dir, "observations.toml"); force = true)
# Thinned posterior draws of the key joint parameters (every 10th
# draw) so downstream users can recompute their own summaries.
posterior_draws = DataFrame(
τ = vec(Array(chn_joint[:τ])),
r = vec(Array(chn_joint[:r])),
m = vec(Array(chn_joint[:m])),
T = vec(Array(chn_joint[:T])),
CFR = vec(Array(chn_joint[:CFR])),
p_drc = vec(Array(chn_joint[:p_drc])),
p_uganda = vec(Array(chn_joint[:p_uganda])),
cumulative_cases = vec(Array(chn_joint[:cumulative_cases])),
)[1:10:end, :]
CSV.write(joinpath(output_dir, "posterior_draws.csv"), posterior_draws);The full analysis code, data and model definitions are in the epiforecasts/BVDOutbreakSize repository. Issues, corrections and suggestions are welcome there. Maintained by Sam Abbott, Kath Sherratt, Samuel Brand and Sebastian Funk.