Hi,
I’m trying to fit incidence data to an SIR model. To verify that the process works, I generated synthetic data from the following model, where y is the incidence. Here, I define the incidence as C_t - C_{t-1}, the difference in cumulative cases between consecutive time steps:
\frac{dS}{dt} = - \frac{\beta(t) S(t)I(t)}{N}
\frac{dI}{dt} = \frac{\beta(t) S(t)I(t)}{N} - \gamma I(t)
\frac{dR}{dt} = \gamma I(t)
\frac{d\beta}{dt} = \rho(\phi - \beta(t))
\frac{dC}{dt} = \frac{\beta(t) S(t)I(t)}{N}
\hat{y} = \Delta C
The parameters I try to fit are \rho, \phi, and the initial value of \beta(t), namely \beta_0. Their true values are 0.1, 0.2, and 1.5, respectively.
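For reference, the synthetic data can be generated with something along the following lines (a minimal sketch in R using deSolve; the 60-day horizon and the Poisson noise are illustrative, while gamma = 1/6 and the initial conditions match the Stan model below):

# Minimal sketch of the data-generating process (illustrative).
library(deSolve)

sir_smooth_beta <- function(t, state, parms) {
  with(as.list(c(state, parms)), {
    N      <- S + I + R
    lambda <- beta * I / N
    dS     <- -lambda * S
    dI     <-  lambda * S - gamma * I
    dR     <-  gamma * I
    dC     <-  lambda * S             # cumulative incidence
    dbeta  <-  rho * (phi - beta)     # smoothed transmission rate
    list(c(dS, dI, dR, dC, dbeta))
  })
}

parms  <- c(rho = 0.1, phi = 0.2, gamma = 1/6)
state0 <- c(S = 999999, I = 1, R = 0, C = 1, beta = 1.5)
out    <- ode(y = state0, times = 0:60, func = sir_smooth_beta, parms = parms)

incidence <- diff(out[, "C"])                                     # C_t - C_{t-1}
y_df      <- data.frame(y = rpois(length(incidence), incidence))  # Poisson noise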
I ran 8 chains for the calibration process. Seven of them recover the true values, whereas the eighth gets stuck. The same behaviour appears in more complex variants I’ve tried to fit, so my intuition is that if I can solve it for this simple example, I should be able to solve it for the other ones as well.
Here is the Stan code:
functions {
  real[] smooth_1(real time,
                  real[] y,
                  real[] params,
                  real[] x_r,
                  int[] x_i) {
    // States: y[1] = S, y[2] = I, y[3] = R, y[4] = C (cumulative cases), y[5] = beta
    real dydt[5];
    real RR;         // recovery flow
    real population;
    real nf_B;       // net flow of beta
    real lambda;     // force of infection
    real IR;         // incidence rate
    real CC;
    RR         = y[2] * 0.1666666667;             // gamma = 1/6
    population = y[1] + y[2] + y[3];
    nf_B       = params[1] * (params[2] - y[5]);  // rho * (phi - beta)
    lambda     = y[2] * y[5] / population;        // beta * I / N
    IR         = y[1] * lambda;
    CC         = IR;
    dydt[1] = -IR;
    dydt[2] =  IR - RR;
    dydt[3] =  RR;
    dydt[4] =  CC;
    dydt[5] =  nf_B;
    return dydt;
  }
}
data {
  int<lower = 1> n_obs;    // Number of observations (time points)
  int<lower = 1> n_difeq;  // Number of differential equations in the system
  int<lower = 1> n_params; // Number of model parameters
  int y[n_obs];            // Observed incidence
  real t0;                 // Initial time point (zero)
  real ts[n_obs];          // Time points that were sampled
}
transformed data {
  real x_r[0];
  int x_i[0];
}
parameters {
  real<lower = 0, upper = 1> rho;
  real<lower = 0, upper = 1> phi;
  real<lower = 0> B0;
}
transformed parameters {
  real y_hat[n_obs, n_difeq]; // Output from the ODE solver
  real y0[n_difeq];           // Initial conditions
  real params[n_params];
  real reported_cases[n_obs]; // Modelled incidence (Delta C)
  params[1] = rho;
  params[2] = phi;
  y0[1] = 999999; // S
  y0[2] = 1;      // I
  y0[3] = 0;      // R
  y0[4] = 1;      // C
  y0[5] = B0;     // beta
  y_hat = integrate_ode_rk45(smooth_1, y0, t0, ts, params, x_r, x_i);
  reported_cases[1] = y_hat[1, 4] - y0[4];
  for (i in 1:(n_obs - 1)) {
    // Small offset keeps the Poisson rate strictly positive
    reported_cases[i + 1] = y_hat[i + 1, 4] - y_hat[i, 4] + 0.000001;
  }
}
model {
  rho ~ normal(0.5, 0.5);
  phi ~ normal(0.5, 0.5);
  B0 ~ lognormal(log(1), 0.5);
  y ~ poisson(reported_cases);
}
This is how I call the process:
stan_d <- list(n_obs = length(y_df$y),
               n_params = length(constants),
               n_difeq = length(stocks), # number of differential equations
               y = y_df$y,
               t0 = 0,
               ts = 1:length(y_df$y))

# Test / debug the model:
test <- stan(filename, data = stan_d, chains = 1, iter = 10,
             verbose = FALSE, refresh = 0)

# Fit and sample from the posterior
sf_smooth_1 <- stan(fit = test, data = stan_d, chains = 8,
                    warmup = 1000, iter = 2000, cores = 4,
                    seed = 90465, refresh = 5)
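For context, stocks and constants are simply the vectors of state names and ODE parameters from the data-generation step; only their lengths are used in stan_d (the contents below are illustrative):

stocks    <- c("S", "I", "R", "C", "beta")  # 5 states      -> n_difeq = 5
constants <- c("rho", "phi")                # 2 parameters   -> n_params = 2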
Here is the trace plot for the calibration parameters:
In this graph, we can see that the 8th chain gets stuck at \rho \approx 1. I also increased adapt_delta and changed the likelihood from Poisson to negative binomial, to no avail (see the sketch below).
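For reference, the adapt_delta change and the trace plot were along these lines (the adapt_delta value shown is illustrative):

# Re-run with a higher adapt_delta and inspect the chains;
# neither change removed the stuck chain.
sf_smooth_1 <- stan(fit = test, data = stan_d, chains = 8,
                    warmup = 1000, iter = 2000, cores = 4,
                    seed = 90465, refresh = 5,
                    control = list(adapt_delta = 0.99))
traceplot(sf_smooth_1, pars = c("rho", "phi", "B0"))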
My question, then, is: how can I improve the model specification to avoid this behaviour?
Thanks.