# Investigating divergent chains

Hi, I am working on fitting time series data to the outputs of an ODE system. The model is as follows:

functions {
real[] ode(real t, real[] x, real[] theta, real[] x_star, int[] x_i) {
real dx_dt[3];
dx_dt[1] = theta*x + theta*x - theta*x; // immune response
dx_dt[2] = -theta*x + theta*x; // pathogen in the lung
dx_dt[3] = -theta*x + theta*x; // pathogen in the nose
return dx_dt;
}
}

data {
int nShedding; // num Shedding time points
int nNeutro; // num neutrophils/Igs time points
int nCFU; // num CFU time points

//time pointers
int dpiShedding[nShedding];
int dpiNeutro[nNeutro];
int dpiCFU[nCFU];
real CFUEndPoints;
int isFittingCFUEndPoints;

// data vectors
real neutroCounts[nNeutro];
real shedding[nShedding];
real CFU[nCFU, 2];

// ODE setup
real t0;
int odeEndpoint; // end point when the rabbit was sacrificed
real tf[odeEndpoint-1]; // integration time, tf = 0.0
}

parameters {
real<lower = 0> theta[7];
real<lower = 0> x0[3]; // ODE ICs
real<lower = 0> sigmaX; // error
real<lower = 0> sigmaY; // error CFULung and Nose
}

transformed parameters {
real x[odeEndpoint,3];
real log_x[odeEndpoint,3];
x[1] = x0;
x[2:odeEndpoint] = integrate_ode_rk45(ode, x0, t0, tf, theta,
rep_array(0.0, 0), rep_array(0, 0),
1e-4, 1e-5, 1e3);
// make sure that ODE solutions are positive
for (k in 1:3){
for (i in 1:odeEndpoint){
if(x[i, k] <= 1e-15){
log_x[i,k] = 0.0;
} else {
log_x[i,k] = log(x[i,k]+1.0);
}
}
}
}

model {

// priors
theta[1] ~ normal(1, 0.5); // a_lung
theta[2] ~ normal(1.4, 0.5); // b
theta[3] ~ normal(0.5, 0.5); // c_Lung
theta[4] ~ normal(0.5, 0.5); // r_Lung
theta[5] ~ normal(1, 0.5); // c_Nose
theta[6] ~ normal(0.15, 0.15); // r_Nose
theta[7] ~ normal(0.15, 0.3); // a_Nose

sigmaX ~ lognormal(0, 1);
sigmaY ~ lognormal(0, 1);

x0[1] ~ lognormal(log(neutroCounts[1]), 0.5);
x0[2] ~ lognormal(CFU[1,1]+0.2, 0.5);
x0[3] ~ lognormal(CFU[1,2]+0.1, 0.5);

// likelihood
for (k in 1:nNeutro){
neutroCounts[k] ~ lognormal(log_x[dpiNeutro[k],1], sigmaX);
}

if(isFittingCFUEndPoints == 1){
// only fit end points
CFUEndPoints ~ normal(log_x[dpiCFU[nCFU],2], sigmaY);
CFUEndPoints ~ normal(log_x[dpiCFU[nCFU],3], sigmaY);
} else {
// fit both CFU Nose and Lung
for (j in 1:2){
for (i in 1:nCFU){
CFU[i,j] ~ normal(log_x[dpiCFU[i],j+1], sigmaY[j]);
}
}
}

}

generated quantities {
real x_rep[nNeutro, 3];
for (m in 1:nNeutro){
x_rep[m,1] = lognormal_rng(log_x[dpiNeutro[m],1], sigmaX);
x_rep[m,2] = normal_rng(log_x[dpiNeutro[m],2], sigmaY);
x_rep[m,3] = normal_rng(log_x[dpiNeutro[m],3], sigmaY);
}
}

The time series are not very long; each has about 10-18 points. I set iter = 15000 and warmup = 10000. My question is: why is there such a big difference in computation between the chains? Some chains took forever to finish. I did some post-sampling investigation and found that some chains have 20 divergences while others have 0. Can you suggest a way to investigate what happened in a particular chain that caused so many divergences? And what is the best way to overcome this? Thank you :)

SAMPLING FOR MODEL ‘SingleRabbitSheddingNeutrophilsVer2B’ NOW (CHAIN 1).
Chain 1:
Chain 1: Gradient evaluation took 0.000642 seconds
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 6.42 seconds.
Chain 1:
Chain 1:
Chain 1: Iteration: 1 / 20000 [ 0%] (Warmup)

SAMPLING FOR MODEL ‘SingleRabbitSheddingNeutrophilsVer2B’ NOW (CHAIN 2).
Chain 2:
Chain 2: Gradient evaluation took 0.000898 seconds
Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 8.98 seconds.
Chain 2:
Chain 2:
Chain 2: Iteration: 1 / 20000 [ 0%] (Warmup)

SAMPLING FOR MODEL ‘SingleRabbitSheddingNeutrophilsVer2B’ NOW (CHAIN 3).
Chain 3:
Chain 3: Gradient evaluation took 0.00249 seconds
Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 24.9 seconds.
Chain 3:
Chain 3:
Chain 3: Iteration: 1 / 20000 [ 0%] (Warmup)

SAMPLING FOR MODEL ‘SingleRabbitSheddingNeutrophilsVer2B’ NOW (CHAIN 4).
Chain 4:
Chain 4: Gradient evaluation took 0.001521 seconds
Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 15.21 seconds.
Chain 4:

Unfortunately, there are many possibilities. Some things in your code look suspicious, for example:

```
// make sure that ODE solutions are positive
for (k in 1:3){
for (i in 1:odeEndpoint){
if(x[i, k] <= 1e-15){
log_x[i,k] = 0.0;
} else {
log_x[i,k] = log(x[i,k]+1.0);
}
}
}
```

This is problematic: assigning zero to `log_x` erases any gradient information and makes the posterior non-differentiable (as noted e.g. at https://mc-stan.org/docs/2_20/functions-reference/step-functions.html).
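A minimal numerical sketch of what that thresholding does to the gradient (plain Python, mirroring the `log_x` logic above):

```python
import math

def log_x(x):
    # mirrors the Stan code: values at or below the threshold are hard-set to 0.0
    if x <= 1e-15:
        return 0.0
    return math.log(x + 1.0)

def fd_grad(f, x, h=1e-6):
    # central finite-difference approximation of the derivative
    return (f(x + h) - f(x - h)) / (2.0 * h)

# away from the threshold the gradient is the expected 1 / (x + 1)
print(round(fd_grad(log_x, 1.0), 4))  # 0.5
# at or below the threshold the function is flat, so the sampler sees zero gradient
print(fd_grad(log_x, -0.5))           # 0.0
```

The sampler relies on that gradient to move, so the flat region (plus the jump at the threshold) is exactly the kind of geometry that produces divergences.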

If you want to make sure the values are always positive, a good approach is to work on the log scale directly and transform with `exp` to get strictly positive numbers.
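For illustration, here is a sketch of that idea on a hypothetical one-state system (dx/dt = a - b*x, which is not your model); solving for u = log(x) keeps x = exp(u) positive by construction:

```stan
functions {
  real[] ode_log(real t, real[] u, real[] theta, real[] x_r, int[] x_i) {
    real du_dt[1];
    // for dx/dt = a - b * x and u = log(x):
    // du/dt = (dx/dt) / x = a * exp(-u) - b
    du_dt[1] = theta[1] * exp(-u[1]) - theta[2];
    return du_dt;
  }
}
```

You can then put the likelihood directly on the `u` scale (or on `exp(u)`) and drop the thresholding entirely.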

There are also some general guidelines on understanding and debugging divergences; I wrote about them on my blog here: https://www.martinmodrak.cz/2018/02/19/taming-divergences-in-stan-models/
and here: https://www.martinmodrak.cz/2018/05/14/identifying-non-identifiability/
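As a first step, look at the per-chain divergence counts and then at where the divergent iterations sit in parameter space. In rstan the indicators are in `get_sampler_params(fit, inc_warmup = FALSE)` (the `divergent__` column); once extracted, counting them is trivial (a sketch with made-up indicator arrays):

```python
# hypothetical per-chain divergence indicators ("divergent__"), one list per
# chain, as you would extract them from e.g. rstan's
# get_sampler_params(fit, inc_warmup = FALSE)
divergent = {
    "chain_1": [0, 0, 1, 0, 1, 1, 0],
    "chain_2": [0, 0, 0, 0, 0, 0, 0],
}

# count divergent transitions per chain to find the problematic one(s)
counts = {chain: sum(flags) for chain, flags in divergent.items()}
print(counts)  # {'chain_1': 3, 'chain_2': 0}
```

Once you know which chain is affected, a pairs plot with divergences highlighted (e.g. `pairs(fit)` in rstan, or `mcmc_pairs` from bayesplot) often shows where in parameter space the divergences concentrate.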

Also note that you can use triple backticks (```) for start and end of code blocks to make your post more readable.

Hope that helps!
