\documentclass{article} \usepackage{verbatim,sober} \usepackage{Sweave} \title{Continuous-time stochastic simulation of epidemics in R} \author{Ben Bolker} \date{\today} \begin{document} \maketitle \section{Introduction/basic code} Using the \emph{Gillespie algorithm}, which assumes that all the possible events that can occur (death of an individual, birth, infection, etc.) occur \emph{independently} and (given the current state of the system --- number of infectives, susceptibles, etc.) \emph{with constant probability per unit time}. Given that this is true, we can pick an exponential deviate (using R's {\tt rexp()} function) that tells us how long it is until the next event occurs, and then pick from among the possible transitions with probabilities proportional to their individual rates ({\tt sample()}). Set the random-number seed so that results are reproducible every time we run this code (you can really use any integer you want here). <<>>= set.seed(1001) @ An R function that implements the Gillespie algorithm. The user must specify \begin{itemize} \item{{\tt start}: starting values} \item{{\tt ratefun}: a function of the current state variables, parameters, and time that returns a numeric vector of the rates (probabilities per unit time) at which each kind of event is occurring} \item{{\tt trans}: a matrix indicating the changes in each state variable (column) that occur when a particular event (row) takes place} \item{{\tt pars}: a (named) numeric vector of parametesr} \item{{\tt times}: a vector of times at which to report output} \end{itemize} <>= gillesp <- function(start,ratefun,trans,pars, times=0:20) { t0 <- times[1] ## set time to starting time ntimes <- length(times) X <- start ## set state to starting state res <- matrix(nrow=length(times),ncol=length(start), ## matrix for results dimnames=list(times,names(start))) for (ctr in 1:(ntimes-1)) { ## loop over reporting times res[ctr,] <- X ## record current state while (t0>= ratefun.SIR <- function(X,pars,time) { vals <- c(as.list(pars),as.list(X)) ## attach state and pars as lists rates <- with(vals, ## black magic to allow ## reference to states and ## parameters by name c(infection=beta*S*I, death=alpha*I, recovery=gamma*I)) } @ \verbatiminput{gillespie.2.R} Defining names for the state variables and transitions isn't absolutely necessary, but will make your code \emph{much} easier to read later on:n <<>>= statenames.SIR <- c("S","I","R") ## state variable names transnames.SIR <- c("infection","death","recovery") @ Define the matrix of transitions (not the same as the transition matrix of a Markov chain) and the function : <>= trans.SIR <- matrix(c(-1,1,0, 0,-1,0, 0,-1,1), byrow=TRUE, ## default is by column ncol=3, ## number of columns = number ## of state variables dimnames=list(transnames.SIR,statenames.SIR)) @ \verbatiminput{gillespie.3.R} The transition matrix ends up looking like this: <<>>= trans.SIR @ Define parameters (numeric vector with names): <<>>= pars.SIR <- c(beta=0.1,alpha=1,gamma=1) @ ($R_0$ in this case is equal to $\beta N/(\alpha+\gamma)$; if $N=100$ then $R_0=5$). Run stochastic simulation: <<>>= G.SIR <- gillesp(start=c(S=97,I=3,R=0),times=seq(0,5,by=0.05), ratefun=ratefun.SIR,trans=trans.SIR,pars=pars.SIR) @ Plot it ({\tt matplot} plots the columns of a matrix as separate lines against a single $x$ variable. {\tt type="l"} specifies lines, {\tt lty=1} specifies solid lines, {\tt xlab} and {\tt ylab} specify $x$- and $y$-axis labels. {\tt G.SIR[,"times"]} picks out the column labeled {\tt times}, {\tt G.SIR[,-1]} picks out all but the first column). <>= matplot(x=G.SIR[,"times"],y=G.SIR[,-1],type="l",lty=1, xlab="Time",ylab="Number") @ Fancy: use {\tt replicate} run 100 stochastic simulations, using {\tt [,"I"]} to save just the number of infectives \ldots <<>>= G.SIR.mult <- replicate(100, gillesp(start=c(S=100,I=3,R=0),times=seq(0,5,by=0.05), ratefun=ratefun.SIR,trans=trans.SIR,pars=pars.SIR)[,"I"]) @ Plot it, adding a line for the mean ({\tt lines} adds a line to the existing plot, {\tt lwd=2} sets the line width to 2): <>= matplot(G.SIR[,"times"],G.SIR.mult,type="l",col="gray",lty=1, xlab="Time",ylab="Number infective") lines(G.SIR[,"times"],rowMeans(G.SIR.mult),lwd=2) @ \section{SIR with vital dynamics (constant population size)} <<>>= ratefun.vSIR <- function(X,pars,time) { vals <- c(as.list(pars),as.list(X)) rates <- with(vals, c(birth=mu*K,infection=beta*S*I, Sdeath=(mu*S), Ideath=(alpha+mu)*I, Rdeath=(mu*R),recovery=gamma*I)) } statenames.vSIR <- statenames.SIR transnames.vSIR <- c("birth","infection","Sdeath","Ideath","Rdeath","recovery") trans.vSIR <- matrix(c(1,0,0, -1,1,0, -1,0,0, 0,-1,0, 0,0,-1, 0,-1,1),byrow=TRUE,ncol=3, dimnames=list(transnames.vSIR,statenames.vSIR)) @ Here I'm going to make $\beta$ quite a bit larger ($R_0=\beta N/(\alpha+\gamma+\mu) = 150/2.2=68$) (!), in order to avoid having the epidemic go extinct during the crash after the first epidemic peak. There is a general epidemiological puzzle here about how new epidemics get started from low numbers without going extinct in the first epidemic trough \ldots <<>>= pars.vSIR <- c(beta=1.5,alpha=1,gamma=1,mu=0.2,K=100) @ Run it: <<>>= G.vSIR <- gillesp(start=c(S=100,I=3,R=0),times=seq(0,50,by=0.05), ratefun=ratefun.vSIR,trans=trans.vSIR,pars=pars.vSIR) @ Plot it: <>= matplot(G.vSIR[,"times"],G.vSIR[,-1],type="l",lty=1, xlab="Time",ylab="Number") @ \section{Extensions} \begin{itemize} \item{vertical transmission: split {\tt birth} into {\tt Sbirth} and {\tt Ibirth}} \item{vaccination: e.g. <>= rates <- c(Sbirth=ifelse(time>= beta.seas <- beta.0*(1+beta.ampl*cos(2*pi*time)) @ or step-function: <>= beta <- ifelse(time %% 1 < seas.frac, beta.0,0) @ } \item{multiple classes of individuals with differential mixing/susceptibility/etc. (S1,I1,R1,S2,I2,R2, etc., although this can get tedious)} \end{itemize} \section{Other challenges} \begin{itemize} \item{Write {\tt odesolve}/{\tt lsoda} code to check against these results \ldots} \end{itemize} \end{document}