Issue with Prior Distributions in ODE Model

Hello,

I am trying to run a PyStan model to perform parameter estimation of an SIR model. However, I am having some issues with specifying the prior distributions.

If I set the prior for z_init[1] (true initial susceptible population) as: z_init[1] ~ normal(10,1), then the model runs quite happily in about 50 seconds - but the parameter estimations are way off (as can be expected, as the initial susceptible population given by the data is 999, so my prior isn’t allowing for the true value).

However, if I change the prior to a more realistic z_init[1] ~ normal(999,1), then the model is almost unusable as even this simplified model takes over 6 hours to run. Does anyone have any idea why this is?

import pystan
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint
import pandas as pd

model = """
functions{
real[] dz_dt(real t, real[] z, real[] theta, 
              real[] x_r, int[] x_i){
    real S = z[1];
    real I = z[2];
    real R = z[3];
        
    real alpha = theta[1];
    real beta = theta[2];
        
    real dS_dt = -alpha * S * I;
    real dI_dt = alpha * S * I - beta * I;
    real dR_dt = beta * I;
    return {dS_dt, dI_dt, dR_dt};
}
}

              
data{
     int<lower = 0> N;          //Number of measurements
     real ts[N];                //Measurement times > 0
     real y_init[3];            //Initial measured population
     real<lower = 0> y[N, 3];    //Measured population at measurement times
}

parameters{
    real<lower = 0> theta[2];   //theta = {alpha, beta}
    real<lower = 0> z_init[3];  //True initial population
    real<lower = 0> sigma[3];   //error scale
}

transformed parameters{
    real z[N, 3]
    = integrate_ode_rk45(dz_dt, z_init, 0.0, ts, theta,
                         rep_array(0.0,0), rep_array(0,0),
                         1e-6, 1e-5, 1e3);
}

model{
     theta[{1,2}] ~ normal(0.005, 1); //(alpha, beta)
     sigma ~ normal(1, 1);           //Error scale
     z_init[1] ~ normal(999, 1);         //True initial susceptible population
     z_init[2] ~ normal(10, 1);           //True initial infected population
     z_init[3] ~ normal(10, 1);           //True initial recovered population
     for (k in 1:3) {
             y_init[k] ~ normal(z_init[k], sigma[k]);
             y[ ,k] ~ normal(z[, k], sigma[k]);
     }
}
"""
sm = pystan.StanModel(model_code = model)

#############Data Creation######################################
num_meas = 200 #Number of measurements
N = num_meas - 1 #Number of measurements minus initial condition
t = np.arange(1,N+1)

alpha_true = 0.001
beta_true = 0.09

initial_inf = 1
pop = 1000

y0 = [pop - initial_inf ,initial_inf,0]

S,I, R = np.zeros(num_meas), np.zeros(num_meas), np.zeros(num_meas)
S[0], I[0], R[0] = y0

for i in range(N):
    S[i+1] = S[i] - alpha_true * S[i] * I[i]
    I[i+1] = I[i] + alpha_true * S[i] * I[i] - beta_true * I[i]
    R[i+1] = R[i] + beta_true * I[i]
    
for i in range(N):
    S[i] = max(0.0,  S[i]+np.random.normal(0,7)) + 0.001
    I[i] = max(0.0, I[i]+np.random.normal(0,10)) + 0.001
    R[i] = max(0.0, R[i]+np.random.normal(0,5)) + 0.001
    
data = np.array([S[1:num_meas],I[1:num_meas],R[1:num_meas]])
data_tr = data.transpose()

stan_data = {'N': N, 'ts':t, 'y_init': y0, 'y':data_tr}

fit = sm.sampling(data = stan_data, chains = 4, iter = 2000, n_jobs=1)

I get the following traceback:

(stan_env) C:\>python C:\Users\dms228\SIR2.py
INFO:pystan:COMPILING THE C++ CODE FOR MODEL anon_model_abf0fa570267eb1863c3ec1f
0f31c4c1 NOW.
C:/Users/dms228/AppData/Local/Continuum/anaconda3/envs/stan_env/lib/site-package
s/pystan/stan/lib/stan_math/lib/sundials_4.1.0/src/nvector/serial/nvector_serial
.c: In function 'N_VPrintFile_Serial':
C:/Users/dms228/AppData/Local/Continuum/anaconda3/envs/stan_env/lib/site-package
s/pystan/stan/lib/stan_math/lib/sundials_4.1.0/src/nvector/serial/nvector_serial
.c:289:13: warning: variable 'xd' set but not used [-Wunused-but-set-variable]
   realtype *xd;
             ^
C:/Users/dms228/AppData/Local/Continuum/anaconda3/envs/stan_env/lib/site-package
s/pystan/stan/lib/stan_math/lib/sundials_4.1.0/src/sundials/sundials_direct.c: I
n function 'PrintMat':
C:/Users/dms228/AppData/Local/Continuum/anaconda3/envs/stan_env/lib/site-package
s/pystan/stan/lib/stan_math/lib/sundials_4.1.0/src/sundials/sundials_direct.c:30
5:14: warning: variable 'a' set but not used [-Wunused-but-set-variable]
   realtype **a;
              ^
C:/Users/dms228/AppData/Local/Continuum/anaconda3/envs/stan_env/lib/site-package
s/pystan/stan/lib/stan_math/lib/sundials_4.1.0/src/sundials/sundials_sparse.c: I
n function 'SparsePrintMat':
C:/Users/dms228/AppData/Local/Continuum/anaconda3/envs/stan_env/lib/site-package
s/pystan/stan/lib/stan_math/lib/sundials_4.1.0/src/sundials/sundials_sparse.c:73
7:9: warning: variable 'indexname' set but not used [-Wunused-but-set-variable]
   char *indexname;
         ^
C:/Users/dms228/AppData/Local/Continuum/anaconda3/envs/stan_env/lib/site-package
s/pystan/stan/lib/stan_math/lib/sundials_4.1.0/src/sundials/sundials_sparse.c:73
6:9: warning: variable 'matrixtype' set but not used [-Wunused-but-set-variable]

   char *matrixtype;
         ^
C:/Users/dms228/AppData/Local/Continuum/anaconda3/envs/stan_env/lib/site-package
s/pystan/stan/lib/stan_math/lib/sundials_4.1.0/src/sundials/sundials_sparse.c:73
5:12: warning: variable 'NNZ' set but not used [-Wunused-but-set-variable]
   int i,j, NNZ;
            ^
C:/Users/dms228/AppData/Local/Continuum/anaconda3/envs/stan_env/lib/site-package
s/pystan/stan/lib/stan_math/lib/sundials_4.1.0/src/sunmatrix/sparse/sunmatrix_sp
arse.c: In function 'SUNSparseMatrix_Print':
C:/Users/dms228/AppData/Local/Continuum/anaconda3/envs/stan_env/lib/site-package
s/pystan/stan/lib/stan_math/lib/sundials_4.1.0/src/sunmatrix/sparse/sunmatrix_sp
arse.c:330:9: warning: variable 'indexname' set but not used [-Wunused-but-set-v
ariable]
   char *indexname;
         ^
C:/Users/dms228/AppData/Local/Continuum/anaconda3/envs/stan_env/lib/site-package
s/pystan/stan/lib/stan_math/lib/sundials_4.1.0/src/sunmatrix/sparse/sunmatrix_sp
arse.c:329:9: warning: variable 'matrixtype' set but not used [-Wunused-but-set-
variable]
   char *matrixtype;
         ^
Time taken to compile:  54.6 secs

Gradient evaluation took 0.001 seconds
1000 transitions using 10 leapfrog steps per transition would take 10 seconds.
Adjust your expectations accordingly!


Informational Message: The current Metropolis proposal is about to be rejected b
ecause of the following issue:
Exception: integrate_ode_rk45: initial state[1] is inf, but must be finite!  (in
 'unknown file name' at line 34)

If this warning occurs sporadically, such as for highly constrained variable typ
es like covariance matrices, then the sampler is fine,
but if this warning occurs often then your model may be either severely ill-cond
itioned or misspecified.

Informational Message: The current Metropolis proposal is about to be rejected b
ecause of the following issue:
Exception: integrate_ode_rk45: initial state[1] is inf, but must be finite!  (in
 'unknown file name' at line 34)

If this warning occurs sporadically, such as for highly constrained variable typ
es like covariance matrices, then the sampler is fine,
but if this warning occurs often then your model may be either severely ill-cond
itioned or misspecified.

Informational Message: The current Metropolis proposal is about to be rejected b
ecause of the following issue:
Exception: integrate_ode_rk45: initial state[2] is inf, but must be finite!  (in
 'unknown file name' at line 34)

If this warning occurs sporadically, such as for highly constrained variable typ
es like covariance matrices, then the sampler is fine,
but if this warning occurs often then your model may be either severely ill-cond
itioned or misspecified.

Informational Message: The current Metropolis proposal is about to be rejected b
ecause of the following issue:
Exception: integrate_ode_rk45: initial state[3] is inf, but must be finite!  (in
 'unknown file name' at line 34)

If this warning occurs sporadically, such as for highly constrained variable typ
es like covariance matrices, then the sampler is fine,
but if this warning occurs often then your model may be either severely ill-cond
itioned or misspecified.

Informational Message: The current Metropolis proposal is about to be rejected b
ecause of the following issue:
Exception: normal_lpdf: Location parameter[1] is nan, but must be finite!  (in '
unknown file name' at line 48)

If this warning occurs sporadically, such as for highly constrained variable typ
es like covariance matrices, then the sampler is fine,
but if this warning occurs often then your model may be either severely ill-cond
itioned or misspecified.

Informational Message: The current Metropolis proposal is about to be rejected b
ecause of the following issue:
Exception: Max number of iterations exceeded (1000).  (in 'unknown file name' at
 line 34)

If this warning occurs sporadically, such as for highly constrained variable typ
es like covariance matrices, then the sampler is fine,
but if this warning occurs often then your model may be either severely ill-cond
itioned or misspecified.

Iteration:   1 / 600 [  0%]  (Warmup)
Informational Message: The current Metropolis proposal is about to be rejected b
ecause of the following issue:
Exception: integrate_ode_rk45: initial state[1] is inf, but must be finite!  (in
 'unknown file name' at line 34)

If this warning occurs sporadically, such as for highly constrained variable typ
es like covariance matrices, then the sampler is fine,
but if this warning occurs often then your model may be either severely ill-cond
itioned or misspecified.

Informational Message: The current Metropolis proposal is about to be rejected b
ecause of the following issue:
Exception: integrate_ode_rk45: initial state[1] is inf, but must be finite!  (in
 'unknown file name' at line 34)

If this warning occurs sporadically, such as for highly constrained variable typ
es like covariance matrices, then the sampler is fine,
but if this warning occurs often then your model may be either severely ill-cond
itioned or misspecified.

Informational Message: The current Metropolis proposal is about to be rejected b
ecause of the following issue:
Exception: Max number of iterations exceeded (1000).  (in 'unknown file name' at
 line 34)

If this warning occurs sporadically, such as for highly constrained variable typ
es like covariance matrices, then the sampler is fine,
but if this warning occurs often then your model may be either severely ill-cond
itioned or misspecified.

I suspect that it is because the initial values from which stan starts sampling are really far from 999 and your prior is very tight around it. It then is a long and slow road of doing bad ode solves during warmup. I suggest you do a reparametrization or give the starting point of sampling yourself so that for z_init[1] it is closer to 999

2 Likes

Thanks jtimonen, I will try to reparameterize