-
Notifications
You must be signed in to change notification settings - Fork 0
/
README.Rmd
275 lines (205 loc) · 9.57 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
---
output: github_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
The name of this R package, `reem`, stands for:
**R**enewal **E**quation based **E**pidemic **M**odel
This package simulates and fits infectious disease epidemic to
clinical and wastewater data.
To install: `devtools::install_github("phac-nml-phrsd/reem")`
**Note:** This package implements the model as a __Reference Class__. This is a way to access the Object Oriented Programming functionalities in R.
Developers, if you are not familiar with Reference Classes in R, please see this [very short introduction](https://www.datamentor.io/r-programming/reference-class) and [Hadley Wickham's](http://adv-r.had.co.nz/R5.html) course for more details.
## Model description
The epidemic model is a traditional SIR model but implemented as a
renewal equation instead of the more populate ordinary differential
equations (ODE).
It has been shown that the renewal equation implementation is
\emph{mathematically} equivalent to the ODE one.
See for example: D. Fargue, Reducibilite des systemes hereditaires, Int. J. Nonlinear Mech., 9 (1974), pp. 331--338, D. Breda et al On the formulation of epidemic models (an appraisal of Kermack and McKendrick), J. Biol. Dyn.,6 (2012), pp. 103--117, and Champredon et al. Equivalence of the Erlang-Distributed SEIR Epidemic Model and the Renewal Equation, SIAM J. Appl. Math., 78 (2018).
The renewal equation of the pathogen transmission process is as follows:
$$i(t) = \left(\frac{S(t-1)}{N}\right)^{\exp(\alpha)}\, \mathcal{R}_0 \, B(t) \sum_{k=1}^\ell g(k)i(t-k)$$
$$S(t) = \max(0, S(t-1) - i(t))$$
where $i(t)$ is the incidence at time $t$, $\mathcal{R}_0$ is the basic reproduction number, $S_t$ are the number of susceptible individuals at time $t$, $N$ is the total population size, $g$ is the intrinsic generation interval distribution and $B$ a function to change the transmission rate with respect to time (e.g., to implement public health measures, vaccination campaigns, etc.).
## Simulation example
The code below simulates an epidemic, tracking the spread in the population
and the pathogen concentration in the wastewater.
```{r libraries, warning=FALSE, message=FALSE}
library(reem)
library(snowfall)
library(lubridate)
library(dplyr)
library(tidyr)
library(ggplot2)
```
```{r define_model, warning=FALSE, echo=TRUE}
# Define model parameters
date.start = ymd('2022-01-01')
hz = 120
# Behavior change,
# time dependent multiplicative factor for transmission
B.date = date.start + c(-20:(hz+20)) # define beyond date.start +/- delta.start to avoid warnings
date.break = ymd('2022-03-01')
B = data.frame(date = B.date, mult = rep(1,length(B.date))) |>
mutate(mult = if_else(date >= date.break, 0.8, mult))
prms = list(
horizon = hz, # horizon of the simulation
last.obs = hz-1, # last observation time (must be < horizon)
B = B, #
i0prop = 1e-3, # initial proportion of the population infected
date.start = date.start, # start date of the epidemic
date.obs.cl = date.start + seq(7,hz-2, by = 7),
date.obs.ha = date.start + seq(12,hz-2, by = 10),
date.obs.ww = date.start + seq(3,hz-2, by=14),
# NOTE:
# `start.delta` (below) is used only during fitting
# (it is ignored for simple simulation).
# `start.delta` is created to conveniently fit the
# start date of an epidemic.
# If the user wants to change the date for a simulation,
# this must be done by updating `obj$prms$date.start` directly.
start.delta = 0,
R0 = 1.5, # Basic reproduction number
N = 1e4, # population size
alpha = 0.3, # transmission heterogeneity (alpha=0: homogeneous)
I.init = c(1,1,3,5), # initial incidence (overwritten in fit ABC)
lag = 7, # Aggregation window for clinical reports
rho = 0.1, # mean reporting ratio
g = get_gi(), # Generation interval distribution
fec = get_fecalshed(), # fecal shedding kinetics
h.prop = 0.05, # total proportion hospitalized for one cohort
h.lags = c(rep(0,3), 1, 2, 2, 1, 0), # Lag infection-hospitalization
kappa = 0.18, # pathogen decay in ww
psi = get_psi(), # plug flow simulation,
shed.mult = 0.2 # deposited fecal shedding multiplier
)
# Create the model object instance
obj = new('reem',
name = 'foo',
prms = prms,
is.fitted = FALSE)
# Print (formatted) the model parameters
obj$print_prms()
# Simulate an epidemic based on the parameters
simepi = obj$simulate_epi(deterministic = FALSE)
```
We can use the built-in function `plot_epi()` to plot the epidemic
```{r plot_epi, fig.width=12, fid.height=24, warning=FALSE}
g = plot_epi(simepi)
plot(patchwork::wrap_plots(g, ncol = 1))
```
We can also check the behavior change parameterization is working as expected:
```{r behavior check}
plot(g$populations + geom_vline(xintercept = date.break, linetype = 'dashed'))
```
## Fit example
If we attach observation data (from clinical and/or wastewater surveillance),
to a `reem` object, then it is possible to fit its model parameters to these data. The example below shows how to do this.
First, let's create synthetic observation data from the simulation example above. We use synthetic data (as opposed to real data) because we to assess the accuracy of the fit. The synthetic data are generated from the very same model `reem` where we know all the parameters.
```{r simulate data}
# we use the example above and make a copy of its
# simulated epidemics. The time series will be the "observations".
sim.data = simepi
# Set the date for "today"
asof = ymd('2022-03-01')
# Retrieve the simulated observations
obs.cl = dplyr::filter(sim.data$obs.cl, date <= asof)
obs.ww = dplyr::filter(simepi$obs.ww, date <= asof)
obs.ha = dplyr::filter(simepi$obs.ha, date <= asof)
# shift the dates such that they
# do not perfectly align with the simulation
obs.cl$t <- obs.cl$t + 1
obs.cl$date <- obs.cl$date + 1
# Attached (simulated) data to new `reem` object:
prms$t.obs.cl <- NULL
prms$t.obs.ha <- NULL
prms$t.obs.ww <- NULL
obj = new('reem',
name = 'foo2',
prms = prms,
obs.cl = obs.cl,
obs.ha = obs.ha,
obs.ww = obs.ww,
is.fitted = FALSE)
```
Now, we set up the fitting algorithm. This algorithm is an Approximate Bayesian Computation (ABC), which is a very straightforward and robust Bayesian method but not very efficient.
```{r define ABC parameters}
# Define the parameters for the ABC fitting algorithm
prm.abc = list(
n.abc = 500, # Total number of ABC iterations (the larger the better)
n.sim = 0, # Number of simulation for a given set of prior parameters. `0` for deterministic, else`8` should be enough
p.abc = 0.02, # Acceptance probability (the lower the better)
n.cores = 1, # number of cores used for parallel computing
use.cl = TRUE, # use clinical observations in the fit?
use.ha = TRUE, # use hospital admissions in the fit?
use.ww = TRUE, # use wastewater observation in the fit?
err.type = 'normlarge'# Type of error calculated during the fit.
)
# Define the priors of the parameters to be fitted:
prms.to.fit = list(
# Gamma distrib. with mean 2.5 and variance 0.25:
R0 = list('gamma', 2.5, 0.251),
# Other prior distributions definition self-explanatory:
alpha = list('norm', 2, 3),
i0prop = list('unif', -5, -2),
start.delta = list('unif_int', -7, 7) # `unif_int` = uniform with integers
)
```
Now that all the parameters for the ABC fit have been defined, we can launch the actual fit. Note that we use the object `obj` already built and we call its _member function_ named `fit_abc()`:
```{r fit ABC}
# Launch the fit
thefit = obj$fit_abc(prm.abc, prms.to.fit)
```
Once the ABC fit is completed, we can call built-in functions to assess the goodness of fit.
First, lets visualize the posterior time series trajectories for the the clinical cases and wastewater concentrations.
```{r plot_fit_results, fig.width=11}
gg = obj$plot_fit()
plot(gg$traj.cl)
plot(gg$traj.ha)
plot(gg$traj.ww)
```
We can also plot the posterior distributions of the fitted parameters (with their prior distribution overlayed in grey):
```{r plot posteriors, fig.width = 11}
plot(gg$post.prms)
```
This is plotting the pairwise 2D posterior densities, to check for any strong correlations between fitted parameters
```{r plot 2D posteriors, fig.width=11}
plot(gg$post.prms.2d)
```
And finally, we can assess the efficiency of the ABC algorithm by plotting the posterior distance values as a function of the sorted iterations
```{r plot fitted distances, fig.width=11}
plot(gg$dist)
plot(gg$dist.source)
```
## Forecasting
We can use this fitted object to forecast the epidemic trajectory.
```{r}
prm.fcst = list(
asof = asof,
horizon.fcst = ymd('2022-06-01'),
use.fit.post = TRUE,
n.resample = 20,
vars.to.fcst = c('Y', 'Wr', 'H'),
ci = seq(0.1,0.9, by = 0.1)
)
fcst = obj$forecast(prm.fcst = prm.fcst, verbose = 1)
g.fcst = obj$plot_forecast(date_breaks = '1 month')
g = patchwork::wrap_plots(g.fcst, nrow=1)
plot(g)
#saveRDS(obj, file = "obj_fitted.rds")
```
We can add a new hypothetical transmission scenario where transmission shoots up
shortly after asof date and then forecast
```{r scenario forecast}
B.date = date.start + c(-20:(hz+200))
date.break = ymd('2022-03-09')
B = data.frame(date = B.date, mult = rep(1,length(B.date))) |>
mutate(mult = if_else(date >= date.break, 10, mult))
# update
obj$prms[['B']] <- B
fcst = obj$forecast(prm = prm.fcst, verbose = 1)
g.fcst = obj$plot_forecast(date_breaks = '1 month')
g = patchwork::wrap_plots(g.fcst, nrow=1)
plot(g)
```