SIR model has converging issues for initial values although lower bound is defined

ZM122 · August 25, 2022, 3:29pm

Dear Stan-community,

I have an ODE with some fixed parameters. I actually have to parametrize three parameters (beta and kappa, p_reported, below code) from the fitting by using real infectious disease data. I am getting errors in the initial value although I defined the lower bound for them.

Rejecting initial value:
Error evaluating the log probability at the initial value.
Exception: neg_binomial_2_lpmf: Location parameter[…] is -40725.6, but must be > 0!

Any hint will be appreciated.

PS. Please don’t recommend using distribution for these fixed parameters. It is tried and will not solve the problem. Also, I prefer to keep the number of unknown parameters low.

functions {
  real[] sir(real t, real[] y, real[] theta, 
             real[] x_r, int[] x_i) {

      int N = x_i[1];
      real beta = theta[1];
      real kappa =theta[2];
      real ia0 = theta[3];
      real ip0 = theta[4];
      real ic0 = theta[5];
      real e0 = theta[6];
      real a = x_r[1];
      real ca = x_r[2];
      real alpha= x_r[3];
      real init[6] = {N - e0- ia0 -ip0 -ic0, e0, ia0, ip0, ic0, 0}; // initial values

      real S = y[1] + init[1];
      real E = y[2] + init[2];
      real Ia = y[3] + init[3];
      real Ip = y[4] + init[4];
      real Ic = y[5] + init[5];
      real R = y[6] + init[6];
      
      real dS_dt = -beta * (ca*Ia + Ip + Ic) * S / N;
      real dE_dt =  beta * (ca*Ia + Ip + Ic) * S / N - a * E;
      real dIa_dt = alpha* a * E - ca* Ia + kappa *Ia;
      real dIp_dt = (1 -alpha)* a *E -  Ip;
      real dIc_dt =  Ip - Ic;
      real dR_dt = ca* Ia + Ic ;
      
      return {dS_dt, dE_dt, dIa_dt, dIp_dt, dIc_dt, dR_dt};
  }
}
data {
  int<lower=1> n_days;
  real t0;
  real ts[n_days];
  int N; 
  int cases[n_days];
}
transformed data {
  int x_i[1] = {N};
  real a = 0.30;
  real ca = 0.66;
  real alpha = 0.33;
  real x_r[3];
  x_r[1] = a;
  x_r[2] = ca;
  x_r[3] = alpha;
}
parameters {
  real<lower=0> beta;
  real<lower=0> kappa;
  real<lower=0> phi_inv;
  real<lower=0, upper=1> p_reported;
  real<lower=0> ia0; 
  real<lower=0> ip0;
  real<lower=0> ic0;
  real<lower=0> e0;
}
transformed parameters{
  real y[n_days, 6];
  real incidence[n_days - 1];
  real phi = 1. / phi_inv;
  real theta[6];
  theta = {beta,kappa,ia0,ip0,ic0,e0};
  y = integrate_ode_rk45(sir, rep_array(0.0, 6), t0, ts, theta, x_r, x_i);
  for (i in 1:n_days-1){
    //incidence[i] = -(y[i+1, 4] - y[i, 4]+ y[i+1, 3] - y[i, 3]+ y[i+1, 2] - y[i, 2] + y[i+1, 1] - y[i, 1])* p_reported; 
    incidence[i] = (y[i+1, 5] - y[i, 5]) * p_reported;
  }
}
model {
  //priors
  beta ~ normal(2, 0.5);
  phi_inv ~ exponential(3);
  p_reported ~ beta(1, 2);
  ia0 ~ normal(2, 0.1);
  ip0 ~ normal(2,0.1);
  ic0 ~ normal(0.4,0.5);
  e0 ~ normal(0.4,0.5);
  kappa ~ normal(10, 0.1);
  //sampling distribution
  //col(matrix x, int n) - The n-th column of matrix x. Here the number of infected people 
  cases[1:(n_days-1)] ~ neg_binomial_2(incidence, phi);
}
generated quantities {
  real pred_cases[n_days-1];
  pred_cases = neg_binomial_2_rng(incidence, phi); 
}

Bob_Carpenter · August 26, 2022, 5:14pm

The two location parameters you have for neg binomial both have the same parameters. The error message should say which one caused the problem, but here that’ll be the first density eval.

neg_binomial_2(incidence, phi);
neg_binomial_2_rng(incidence, phi);

For the names of the parameters, see the doc: 17.2 Negative binomial distribution (alternative parameterization) | Stan Functions Reference

incidence is defined as

incidence[i] = (y[i+1, 5] - y[i, 5]) * p_reported;

p_reported is constrained to (0, 1), which is good, but unless the y[ , 5] is sorted into ascending order, y[i + 1, 5] - y[i, 5] will be negative, and you’ll get the error message you reported.

So I would recommend declaring incidence as

vector<lower=0>[n_days - 1] incidence;

I suggest vector because I don’t know if you’re using RStan, which doesn’t support our new array syntax yet.

The usual thing to do in building a time series of positive-constrained parameters is to work on the log scale and take exp() to “link” everything back to a positive value.

ZM122 · August 29, 2022, 12:43pm

Thanks, @Bob_Carpenter.
Yes, I am using RStan, so defining a vector for “incidence” will not work in RStan. Is it true?

Sorry, I am new in Stan and did not get your last recommendation. Can you explain it more about which parameters and how? Thanks
( The usual thing to do in building a time series of positive-constrained parameters is to work on the log scale and take exp() to “link” everything back to a positive value.)

Topic		Replies	Views
Simple SIR model has converging issues for selection of initial values Modeling	8	737	August 23, 2022
Initialization error in multi-season SIR model Modeling	1	288	February 28, 2023
Initialization Error with ODE Model (SIR) Modeling	2	868	February 6, 2020
Error evaluating the log probability at the initial value Modeling	5	2279	December 14, 2019
Initialization between (-2, 2) failed after 100 attempts Modeling fitting-issues	3	604	January 30, 2021

SIR model has converging issues for initial values although lower bound is defined

Related topics