Initialization Error with ODE Model (SIR)


I am trying to perform parameter estimation on a simple SIR model, but for some reason I keep getting an initialisation error. I have checked my code multiple times but can’t see what the problem is. Any ideas?

Here is my code (with simulated data):

    import pystan
    import numpy as np
    import time
    import matplotlib.pyplot as plt
    from scipy.integrate import odeint

    model = """
    real[] dz_dt(real t, real[] z, real[] theta, 
                  real[] x_r, int[] x_i){
        real S = z[1];
        real I = z[2];
        real R = z[3];
        real alpha = theta[1];
        real beta = theta[2];
        real dS_dt = -alpha * S * I;
        real dI_dt = alpha * S * I - beta * I;
        real dR_dt = beta * I;
        return {dS_dt, dI_dt, dR_dt};

         int<lower = 0> N;          //Number of measurements
         real ts[N];                //Measurement times > 0
         real y_init[3];            //Initial measured population
         real<lower = 0> y[N, 3];    //Measured population at measurement times

        real<lower = 0> theta[2];   //theta = {alpha, beta}
        real<lower = 0> z_init[3];  //True initial population
        real<lower = 0> sigma[3];   //error scale

    transformed parameters{
        real z[N, 3]
        = integrate_ode_rk45(dz_dt, z_init, 0.0, ts, theta,
                             rep_array(0.0,0), rep_array(0,0),
                             1e-6, 1e-5, 1e3);

         theta[{1,2}] ~ normal(0.005, 1);
         sigma ~ lognormal(-1, 1);
         z_init ~ lognormal(log(10), 1);
         for (k in 1:3) {
                 y_init[k] ~ lognormal(log(z_init[k]), sigma[k]);
                 y[ ,k] ~ lognormal(log(z[, k]), sigma[k]);
    sm = pystan.StanModel(model_code = model)

    #############Data Creation######################################
    num_meas = 50 #Number of measurements
    N = num_meas - 1 #Number of measurements minus initial condition
    t = np.arange(1,N+1)

    alpha_true = 0.001
    beta_true = 0.09

    initial_inf = 1
    pop = 1000

    y0 = [pop - initial_inf ,initial_inf,0]

    S,I, R, tot = np.zeros(num_meas), np.zeros(num_meas), np.zeros(num_meas), np.zeros(num_meas)
    S[0], I[0], R[0] = y0

    for i in range(N):
        S[i+1] = S[i] - alpha_true * S[i] * I[i]
        I[i+1] = I[i] + alpha_true * S[i] * I[i] - beta_true * I[i]
        R[i+1] = R[i] + beta_true * I[i]
    for i in range(N): #Add some noise due to measurement error
        S[i] = max(0.0,  S[i]+np.random.normal(0,7))
        I[i] = max(0.0, I[i]+np.random.normal(0,10))
        R[i] = max(0.0, R[i]+np.random.normal(0,5))

    data = np.array([S[1:num_meas],I[1:num_meas],R[1:num_meas]])
    data_tr = data.transpose()

    stan_data = {'N': N, 'ts':t, 'y_init': y0, 'y':data_tr}
    fit = sm.sampling(data = stan_data, chains = 1, iter = 500, n_jobs=1)

I get the following error message when I try to run the final line:

fit = sm.sampling(data = stan_data, chains = 1, iter = 500, n_jobs=1)
Traceback (most recent call last):

File “”, line 2, in
fit = sm.sampling(data = stan_data, chains = 1, iter = 500, n_jobs=1)

File “C:\Users\dms228\AppData\Local\Continuum\anaconda3\envs\stan_env\lib\site-packages\pystan\”, line 955, in sampling
ret_and_samples = _map_parallel(call_sampler_star, call_sampler_args, n_jobs)

File “C:\Users\dms228\AppData\Local\Continuum\anaconda3\envs\stan_env\lib\site-packages\pystan\”, line 151, in _map_parallel
map_result = list(map(function, args))

File “AppData\Local\Temp\pystan_twed_zqi\stanfit4anon_model_ba0d6a770254e988d0c970c9137ffa1b_5906826466520084123.pyx”, line 373, in stanfit4anon_model_ba0d6a770254e988d0c970c9137ffa1b_5906826466520084123._call_sampler_star

File “AppData\Local\Temp\pystan_twed_zqi\stanfit4anon_model_ba0d6a770254e988d0c970c9137ffa1b_5906826466520084123.pyx”, line 406, in stanfit4anon_model_ba0d6a770254e988d0c970c9137ffa1b_5906826466520084123._call_sampler

RuntimeError: Initialization failed.

How are you running it? Stan is supposed to print more information about the error. Unfortunately that info always goes directly to “standard output stream”–if you’re working in an IDE it may be invisible. I recommend you save the whole script in a file and run it from command line, at least when there are initialization problems.

In this case the last lognormal complains that the location parameter is invalid. Adding a print statement tells me that some zs are negative, around -0.000001 . The exact solution to the differential equation is of course positive but the integrator is only approximate; when the solution goes very close to zero, small negative values are within the tolerance. It’s difficult to force the integrator to yield only positive values. The easiest fix is to just add a small constant correction to all z values.

Another problem is that some of the measured ys are exact zeros and that’s not allowed by lognormal.

The lognormal might not be the right distribution for a SIR model because it gives values near zero too much weight. A simple normal distribution (which is what your data simulation uses anyway) may be adequate and does not have strict positivity requirement. Alternatively, I’d consider poisson since this is count data.

1 Like

Just tried it. It runs now. Thanks so much!