Strange behavior of traces from NUTS

I am trying to sample parameters in a system of ODEs using NUTS, and would see strange traces about half of the times as shown below.

What look strange to me is that, at some point, the trace would become almost flat. It happens with different NUTS parameters I tested: warmup=1000,1500; adapt_delta > 0.8; max_treedepth=10, 15.

See below for my Stan code. The priors are \mathcal{N} (1, 1) for all thetas. We only have values for y_4 at all time points for input but not the other three variables.

functions {
    real[] sho(real t, real[] y, real[] theta, real[] x_r, int[] x_i) {
        real dydt[4];
        real beta;
        real m_inf;

        dydt[1] = theta[1] * theta[2] * exp(-theta[3] * t) - theta[4] * y[1];
        dydt[2] = (theta[5] * y[1] * y[1])
            / (theta[6] * theta[6] + y[1] * y[1]) - theta[7] * y[2];
        dydt[3] = theta[8] * (y[4] + theta[9])
            * (theta[9] / (y[4] * theta[9]) - y[3]);
        beta = 1 + theta[10] * theta[11] / pow(theta[10] + y[4], 2);
        m_inf = y[2] * y[4] / ((theta[12] + y[2]) * (theta[13] + y[4]));
        dydt[4] = 1 / beta * (
                * (theta[15] * pow(m_inf, 3) * pow(y[3], 3) + theta[16])
                * (theta[18] - (1 + theta[14]) * y[4])
            - theta[17] * pow(y[4], 2) / (pow(theta[19], 2) + pow(y[4], 2))

        return dydt;
data {
    int<lower=1> N;        // number of variables
    int<lower=1> T;        // number of time steps
    real y0[N];            // initial values
    real y[T];             // values at all time points
    real t0;               // initial time point
    real ts[T];            // all time points
    real mu_prior[19];     // mean of prior
    real sigma_prior[19];  // standard deviation of prior
transformed data {
    real x_r[0];
    int x_i[0];
parameters {
    real<lower=0> sigma;
    real<lower=0> theta[19];
model {
    real y_hat[T, N];
    sigma ~ cauchy(0, 0.05);
    for (j in 1:19) {
        theta[j] ~ normal(mu_prior[j], sigma_prior[j]);
    y_hat = integrate_ode_rk45(sho, y0, t0, ts, theta, x_r, x_i);
    for (t in 1:T) {
        y[t] ~ normal(y_hat[t, 4], sigma);

I would appreciate it if anyone can provide some advice for diagnosing and fixing this issue. Thanks!

1 Like

Are there any warnings getting printed as this runs?

This could be the ODE solver is failing somewhere, or there is a bug in the specification of the ODE. Maybe use rstan’s expose_stan_functions function to try to debug the ODE function external to Stan. There might be a bug in it or something.

1 Like


I always get the following warnings when sampling finishes:

WARNING:pystan:583 of 1000 iterations ended with a divergence (58.3 %).
WARNING:pystan:Try running with adapt_delta larger than 1.0 to remove the divergences.

Number divergent iterations can range from 1 to over 600. According to documentation though, adapt_delta cannot be larger than 1.0.

I also got these lines below during sampling:

Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Exception: Max number of iterations exceeded (1000000).

Both types of messages would appear whether the traces look normal or like the example what I posted above. For that specific example though, I also get warnings about bad values of Rhat and E-BFMI, which do not appear in most runs.

Regarding debugging ODEs externally, I actually could recover our input data, y_4, using Scipy’s ODE solver with sampled thetas sometimes (1 out of every 4 chains roughly).

Sounds like the ODE is stiff. Try ode_integrate_bdf?

1 Like

You might also get trajectories marked divergent when the ODE solver fails. Not 100% sure on this, but I wouldn’t be surprised if an adapt_delta closer to 1.0 doesn’t work.