Fitting a SIR with beta as function of time. Chain got stuck


I’m trying to fit incidence data to a SIR model. To verify if the process works, I generated synthetic data from the following model where y is the incidence. Here, I assume the incidence is C_t - C_{t-1}

\frac{dS}{dt} = - \frac{\beta(t) S(t)I(t)}{N}

\frac{dI}{dt} = \frac{\beta(t) S(t)I(t)}{N} - \gamma I(t)

\frac{dR}{dt} = \gamma I(t)

\frac{d\beta}{dt} = \rho(\phi - \beta(t))

\frac{dC}{dt} = \frac{\beta(t) S(t)I(t)}{N}

\hat{y} = \Delta C

The values I try to fit are \rho, \phi, and the initial value of \beta(t), namely \beta_0. Their actual values are 0.1, 0.2, and 1.5, respectively.

I ran 8 chains for the calibration process. Seven of them estimate the actual values; whereas the other one gets stuck. This behaviour is similar to other more complex variants I’ve tried to fit. My intuition is that if I can solve it for this simple example, I would able to do so for the other ones.

Here is the Stan code:

functions {
  real[] smooth_1(real time,
              real[] y,
              real[] params,
              real[] x_r,
              int[] x_i) {
  real dydt[5];
  real RR;
  real population;
  real nf_B;
  real lambda;
  real IR;
  real CC;
  RR = y[2]*0.1666666667;
  population = y[2]+y[3]+y[1];
  nf_B = params[1]*(params[2]-y[5]);
  lambda = y[2]*y[5]/population;
  IR = y[1]*lambda;
  CC = IR;
  dydt[1] = -IR;
  dydt[2] = IR-RR;
  dydt[3] = RR;
  dydt[4] = CC;
  dydt[5] = nf_B;
  return dydt;
data {
  int<lower = 1> n_obs;
  int<lower = 1> n_difeq; // Number of differential equations in the system
  int<lower = 1> n_params; // Number of model parameters
  int y[n_obs];
  real t0; // Initial time point (zero)
  real ts[n_obs]; // Time points that were sampled
transformed data {
  real x_r[0];
  int  x_i[0];
parameters {
  real<lower = 0, upper = 1> rho;
  real<lower = 0, upper = 1> phi;
  real<lower = 0> B0;
transformed parameters{
  real y_hat[n_obs, n_difeq]; // Output from the ODE solver
  real y0[n_difeq]; // Initial conditions
  real params[n_params];
  real reported_cases[n_obs];
  params[1] = rho;
  params[2] = phi;
  y0[1] = 999999;
  y0[2] = 1;
  y0[3] = 0;
  y0[4] = 1;
  y0[5] = B0;
  y_hat = integrate_ode_rk45(smooth_1, y0, t0, ts, params, x_r, x_i);
  reported_cases[1] = y_hat[1, 4]  - y0[4];
  for (i in 1:n_obs-1) {
    reported_cases[i + 1] = y_hat[i + 1, 4]  - y_hat[i, 4]  + 0.000001;
model {
  rho    ~ normal(0.5, 0.5);
  phi    ~ normal(0.5, 0.5);
  B0     ~ lognormal(log(1), 0.5);
  y      ~ poisson(reported_cases);

This is how I call the process:

stan_d <- list(n_obs     = length(y_df$y),
               n_params  = length(constants),
               n_difeq   = length(stocks), # number of differential equations
               y         = y_df$y,
               t0        = 0,
               ts        = 1:length(y_df$y))

# Test / debug the model:
test <- stan(filename, data = stan_d, chains = 1, iter = 10,
               verbose = FALSE, refresh = 0)

# Fit and sample from the posterior
sf_smooth_1 <- stan(fit = test, data = stan_d, chains = 8, 
                    warmup = 1000, iter = 2000, cores  = 4, 
                    seed = 90465, refresh = 5)

Here is the trace plot for the calibration parameters:

In this graph, we can see that the 8th chain gets stuck at \rho \sim 1. I also increased adapt_delta and change the likelihood from Poisson to negative binomial to no avail.

Consequently, my question is, how could I improve the model specification to avoid the above behaviour?


1 Like

I’m inclined to say this is a fundamental identifiability problem. Kind of a bummer that most chains to “the right place” and only one doesn’t, but this does point to multimodality in the posterior. One way to check this with the SIR model is to plug the modal values into the expression for R_0 and see if it yields similar values to the true data-generating one.

1 Like

Hi Max,

I followed your advice & estimated R0 from the posterior.

It seems that it is not multimodality.

Non-linear models like this can be very sensitive to the initials… so its a lot better to provide sensible initials and lower the default step size to 0.1 (or even 0.01).


Thanks for the answer. How do I change the default step size?

y_hat = integrate_ode_bdf(sho, y0, t0, ts, theta, x_r, x_i,
                          rel_tol, abs_tol, max_steps);

Is it the max_step arg?

You cannot change it, it will be automatically adjusted. You only could rescale the problem if that’s applicable. Another implicit way is to change the tolareances.

What is the issue with the automatic step size?

Max steps controls the maximal admissible number of steps between time points.

I think I got your first answer wrong.

so its a lot better to provide sensible initials and lower the default step size to 0.1 (or even 0.01)

Here, you meant step size of the HMC algorithm, not the ODE integrator, right?

1 Like


1 Like

Is this equivalent to setting adapt_delta to 0.9 and 0.99, respectively?

adapt delta closer to 1 will push down the stepsize during sampling. Setting the stepwise will control the initial stepsize (but it will be adapted).

1 Like

Hi Jandraor,

Although this topic has been marked as closed, I have a question that you might answer. What is in your data (y_df$y) fed into the model? Does it contain β or only reported cases?

I am asking this question because I need to model a system where not all variables are being recorded, that is equivalent to your solution where there is a parameter that varies with time if it’s totally unknown. Hopefully you are help me in that sense.

y_df$y only contains reported cases.

Thanks for your reply. This clear my misunderstanding that all data (y) in user defined function need to be supplied when calling stan/sampling function, but I should only need the initial value for the integrator to work and anything I want to fit to the model.