Parameter estimation for seasonal SEIR model with time-varying transmission rates

Dear all,
I’m new to Stan and I’m currently following the tutorial on “Bayesian workflow for disease transmission modeling in Stan” to develop an SEIR model for the transmission of hand, foot, and mouth disease (HFMD).
To begin, I generated simulated data using the deSolve package in R and then attempted to estimate the model parameters using the simulated data in Stan. However, I have encountered three issues during this process:

  1. The transmission rate of HFMD exhibits a seasonal pattern, and I have attempted to incorporate this using a cosine function in both R and Stan. I am uncertain whether I have implemented this correctly in Stan and would appreciate guidance on the proper approach.
  2. Despite obtaining a good fit for the “Infections” variable, I consistently observe negative values for the “Susceptibles” variable. The parameters I utilized to generate the simulated data in R did not result in negative “Susceptibles”. I am unsure of the cause behind this issue and would appreciate assistance in identifying a potential solution.
  3. Initially, I employed the negative binomial distribution for the case incidence, but I consistently encountered initialization failures. Consequently, I switched to using the Poisson distribution. I am uncertain about the principles governing the selection of priors for the parameter “phi” or “phi_inv” in a negative binomial distribution. Additionally, I am unsure whether it is appropriate to utilize the Poisson distribution in my particular situation.
    Could anyone please help me with the above issues? Thanks so much.

The following is the R code that I used to generate the simulated data and the Stan code for estimation.

library(deSolve)
library(tidyr)
library(dplyr)
library(rstan)
library(gridExtra)
rstan_options (auto_write = TRUE)
options (mc.cores = parallel::detectCores ())

seasonal.sir.model <- function (t, x, params) { 
    with(
        as.list(c(x, params)),
        {
            beta <- beta0*(1+beta1*cos(2*pi*t/180))
            dS <- mu*(N-S)-beta*S*I/N
            dI <- beta*S*I/N -(mu+gamma)*I
            dR <- gamma*I-mu*R
            dI_hat <- beta*S*I/N
            res <- c(dS,dI,dR,dI_hat)
            list(res) 
        }
    ) }

times <- seq(0,365*4,by=1)

params <- c(mu=1/700,N=8000000,beta0=0.25, 
            beta1=0.5, gamma=1/14)

xstart <- c(S=8000000-10,I=10,R=0,I_hat=10)

out <- as.data.frame(ode(xstart,times,seasonal.sir.model,params,rtol=1e-12,hmax=1/120))

cases=as.integer(lead(out$I_hat,n=1L)-out$I_hat) 

cases=cases[which(is.na(cases)==F)]

# times
n_days <- length(cases) 
t <- seq(0, n_days, by = 1)
t0 = 0 
t <- t[-1]

#initial conditions
N=8000000
i0 <- 10
s0 <- N-i0
r0 <- 0
y0 = c(S = s0,  I = i0, R = r0,I_hat=i0)

# data for Stan
data_seir <- list(n_days = n_days, y0 = y0, t0 = t0, ts = t, N = N, cases = cases)

stan model:


    real[] seir(real t, real[] y, real[] theta, 
               real[] x_r, int[] x_i) {

        real N = x_i[1];
        
        real beta0 = theta[1];
        real beta1 = theta[2];

        real S = y[1];
        real I = y[2];
        real R = y[3];

        real dS_dt = - beta0 *(1+ beta1 *cos(2*3.14*t/180)) * I * S / N  +  0.0014 * (N-S);
        real dI_dt =   beta0 *(1+ beta1 *cos(2*3.14*t/180)) * I * S / N  -  0.07 * I - 0.0014*I;
        real dR_dt =   0.07 * I - 0.0014 * R;
        real dI_hat=   beta0 *(1+ beta1 *cos(2*3.14*t/180)) * I * S / N;

        return {dS_dt, dI_dt, dR_dt, dI_hat};
    }
}

data {
  int<lower=1> n_days;
  real y0[4];
  real t0;
  real ts[n_days];
  int N;
  int cases[n_days];

}

transformed data {
  real x_r[0]; //  x_r being of length 0 because we have nothing to put in it
  int x_i[1] = { N }; 
}

parameters {
  real <lower=0> beta0;
  real <lower=0> beta1;

}

transformed parameters{
  real y[n_days, 4];
  real theta[2] = {beta0,beta1};
  real incidence [n_days-1];
  y = integrate_ode_rk45(seir, y0, t0, ts, theta, x_r, x_i);

 for (i in 1:(n_days-1)) {
     incidence[i]= y[i+1, 4] - y[i, 4] ;
   }
}

model {
  //priors
  beta0 ~ normal(0.25, 0.1);
  beta1 ~ normal(0.5, 0.1); 
  cases[1:(n_days-1)]  ~ poisson(incidence);
  

}

generated quantities {
  real pred_cases1[n_days-1];
  real ysir[n_days, 4]=y;
  pred_cases1 = poisson_rng(incidence);

}

1 Like

This is one for @charlesm93, who was one of the authors of the tutorial.

I am uncertain whether I have implemented this correctly in Stan and would appreciate guidance on the proper approach.

You can run your Stan code with a fixed parameter value and check that the Stan code returns the same output as the R code. In particular you’ll want to examine the transformed parameter y. If you’re using cmdstanr, I can point you to some features to do that.

Despite obtaining a good fit for the “Infections” variable, I consistently observe negative values for the “Susceptibles” variable.

This won’t fix the problem but we can constrain all the states to be positive:

real<lower = 0> y[n_days, 4];

What values did you get for y[:, 2]? Are they negative and close to 0 or do you get large negative values?

Initially, I employed the negative binomial distribution for the case incidence, but I consistently encountered initialization failures. Consequently, I switched to using the Poisson distribution.

How do you initialize your Markov chains? I strongly recommend not relying on Stan’s default initialization and instead specifying an initialization function in R. A good place to start is to initialize your chains using draws from your priors.

I am uncertain about the principles governing the selection of priors for the parameter “phi” or “phi_inv” in a negative binomial distribution.

This is discussed in the paper and we also conducted some prior predictive checks (Section 4.4). As for comparing the likelihoods, which ones works best depends on the insights you’re trying to get from your model. Is your goal to make good predictions? Good inference? Etc. But let’s first handle the other issues.

1 Like