# using Random, Plots
# Random.seed!(42)
# include("../src/Simulations.jl")
# Y = simulateSimpleEpidemic()
# plot(plot(Y.Rt, ylabel="Reproduction number", label=false), plot(Y.It, ylabel="Infections", label=false), layout=(2,1))
Appendix B — Simulated data
It is useful to test our models on simulated data where we know what the true value of \(R_t\) is. Our simulations are broken into two stages:
- Simulating the underlying epidemic
- Simulating the observation process
Unsurprisingly, these correspond to the state-space model and observation model in the SMC algorithm.
B.1 Simulating the underlying epidemic
All our simulations leverage the Poisson renewal model:
\[ I_t \sim \text{Poisson}(R_t \Lambda_t) \]
where the force-of-infection is defined as:
\[ \Lambda_t = \sum_{u = 0}^{t-1} I_{t-u} g_u \]
for some generation time distriubtion \(g_u\).
Unless otherwise stated, we assume a Gaussian random walk for \(\log R_t\):
\[ \log R_t \sim \text{Normal}(\log R_{t-1}, \sigma) \]
where \(\sigma\) determines how quickly \(R_t\) varies. This matches the state-space model assumed by our default methods.
Simulations are initialised with \(I_1\) infections (default 10) and an initial value of \(R_1\) (default 2). We then iteratively sample \(R_t\) and \(I_t\) from the above model for \(t = 2, ..., T\) (default \(T = 100\)). The core script looks like:
# Initialise vectors
= zeros(T)
logRt 1] = log(R1)
logRt[= zeros(T)
It 1] = log(I1)
It[
for tt = 2:T
# Sample logRt from a normal distribution
= rand(Normal(logRt[tt], σ))
logRt[tt]
# Calculate the force of infection
= sum(It[tt-1:-1:1] .* g[1:tt-1])
Λt
# Sample the number of new infections
= rand(Poisson(exp(logRt[tt]) * Λt))
It[tt] end
# Combine into a single dataframe
= DataFrame(t=1:T, It=It, Rt=Rt) Y
Although additional code is included to ensure that total infections are between 100 and 100,000 (by default) and that \(R_t\) is between 0.2 and 5 (by default). The resulting function simulateSimpleEpidemic()
is provided in /src/Simulations.jl
. We call it here:
This is also the script that was used to generate “simulated_simple.csv”. We save it instead of calling it each time as the Random seed is not consistent between environments.
B.2 Simulating the observation process
If we are simulating the simple model then we set \(C_t = I_t\) (reported cases = infections) and we are done. Or we can simulate from any number of observation processes. We outline some here.
All functions created here are included in src/Simulations.jl
.
B.2.1 Binomial reporting
In the simple underreporting case, we might have: \[ C_t \sim \text{Binomial}(I_t, \rho) \]
for some underreporting fraction \(\rho\). We can implement this as:
using Distributions
# function simulateBinomialReporting(It, ρ)
# Ct = rand.(Binomial.(It, ρ))
# return(Ct)
# end
# Y.Ct = simulateBinomialReporting(Y.It, 0.5)
# plot(Y.It, ylabel="Infections", label="True infections", title="Binomial reporting example", size=(800,400), dpi=300)
# plot!(Y.Ct, label="Reported cases")
B.2.2 Delayed reporting
Or we might assume that cases are underreported and delayed: \[ C_t \sim \text{Binomial}\left(\sum_{u=0}^{u_{max}} I_{t-u} d_u, \ \rho\right) \]
If we assume \(\rho = 80\%\) of cases are reported, and the delay distribution is negative binomial with mean 5.6 and standard deviation 3.2, we can implement:
# function simulateBinomialUnderreportNegBinDelay(It, ρ, r, p)
# d = pdf.(NegativeBinomial(r, p), 0:length(It))
# DelayedIt = round.([sum(It[tt:-1:1] .* d[1:tt]) for tt = 1:length(It)])
# Ct = rand.(Binomial.(DelayedIt, ρ))
# return(Ct)
# end
# Y.Ct = simulateBinomialUnderreportNegBinDelay(Y.It, 0.8, 7, 5/9)
# plot(Y.It, ylabel="Infections", label="True infections", title="Binomial reporting and negative binomial delay", size=(800,400), dpi=300)
# plot!(Y.Ct, label="Reported cases")