Moving to Incidence in Bayesian Workflow for Disease Transmission - Negative Values for Incidence Out of Solver

I am trying to attempt the model described in the Bayesian Workflow for Disease Transmission described here, but I keep getting errors thrown for logs of negative numbers/ zeros. I tried to set a lower bound, but that also gets errors. I know it must be coming from the outputs from the ODE and then passing them to the negative binomial, but I don’t know the right fix.

I am still using the basic boarding school data from the outbreak package and treating it as new cases per day rather than number of hospitalised children and all code from the examples save for the below.

I updated the transformed parameters as recommended and tried

transformed parameters{
  real y[n_days, 3];
  real phi = 1. / phi_inv;
  real<lower=0> incidence[n_days];
   //S(t) - S(t + 1)
  
  {
    real theta[2];
    theta[1] = beta;
    theta[2] = gamma;

    y = integrate_ode_rk45(sir, y0, t0, ts, theta, x_r, x_i);
  }
  
  for (i in 1:n_days-1) {
  incidence[i] = y[i, 1] - y[i + 1, 1];
  }
  
  
}

For the model block I did the following:

model {
  //priors
  beta ~ normal(2, 1);
  gamma ~ normal(0.4, 0.5);
  phi_inv ~ exponential(5);
  //sampling distribution  
  cases ~ neg_binomial_2(incidence, phi);
  
}

I know there must be a trick that I am missing.

Exception: validate transformed params: incidence[i_0__] is -1.23041e-07
Exception: validate transformed params: incidence[i_0__] is nan

Full Stan and R Code

// From https://mc-stan.org/users/documentation/case-studies/boarding_school_case_study.html
functions {
  real[] sir(real t, real[] y, real[] theta, 
             real[] x_r, int[] x_i) {

      real S = y[1];
      real I = y[2];
      real R = y[3];
      real N = x_i[1];
      
      real beta = theta[1];
      real gamma = theta[2];
      
      real dS_dt = -beta * I * S / N;
      real dI_dt =  beta * I * S / N - gamma * I;
      real dR_dt =  gamma * I;
      
      return {dS_dt, dI_dt, dR_dt};
  }
}
data {
  int<lower=1> n_days;
  real y0[3];
  real t0;
  real ts[n_days];
  int N;
  int cases[n_days];
}
transformed data {
  real x_r[0];
  int x_i[1] = { N };
  
}
parameters {
  real<lower=0> gamma;
  real<lower=0> beta;
  real<lower=0> phi_inv;
  
}
transformed parameters{
  real y[n_days, 3];
  real phi = 1. / phi_inv;
  real<lower=0> incidence[n_days];
   //S(t) - S(t + 1)
  
  {
    real theta[2];
    theta[1] = beta;
    theta[2] = gamma;

    y = integrate_ode_rk45(sir, y0, t0, ts, theta, x_r, x_i);
  }
  
  for (i in 1:n_days-1) {
  incidence[i] = y[i, 1] - y[i + 1, 1];
  }
  
  
}
model {
  //priors
  beta ~ normal(2, 1);
  gamma ~ normal(0.4, 0.5);
  phi_inv ~ exponential(5);
  //sampling distribution
  //col(matrix x, int n) - The n-th column of matrix x. Here the number of infected people 
  //cases ~ neg_binomial_2(col(to_matrix(y), 2), phi);
  
  cases ~ neg_binomial_2(incidence, phi);
  
}
generated quantities {
  real R0 = beta / gamma;
  real recovery_time = 1 / gamma;
  real pred_cases[n_days];
  
  //pred_cases = neg_binomial_2_rng(col(to_matrix(y), 2), phi);
  
}


library(rstan)
library(gridExtra)
library(outbreaks)
rstan_options (auto_write = TRUE)
options (mc.cores = parallel::detectCores ())
set.seed(336) # for reproductibility

# time series of cases
cases <- influenza_england_1978_school$in_bed  # Number of students in bed

# total count
N <- 763;

# times
n_days <- length(cases) 
t <- seq(0, n_days, by = 1)
t0 = 0 
t <- t[-1]

#initial conditions
i0 <- 1
s0 <- N - i0
r0 <- 0
y0 = c(S = s0, I = i0, R = r0)

# data for Stan
data_sir <- list(n_days = n_days, y0 = y0, 
								 t0 = t0, ts = t, N = N, cases = cases)

# number of MCMC steps
niter <- 2000

model <- stan_model("stan/sir.stan")

fit_sir_negbin <- sampling(model,
data = data_sir,
iter = niter,
 chains = 2)

Try the following changes and let us know whether it’s helped:

  • change for (i in 1:n_days-1) to for (i in 1: (n_days-1))
  • change real phi to real<lower=0> phi for rigor
1 Like

Hi @Elizaveta_Semenova thank you very much for your response. I made those changes, but still the same set of errors.

I think I pinpointed it to the incidence loop. The solver must return values that are negative or NA, so when calculating incidence I either pass negative values or Nas to neg_binomial_2.

  • I tried to add a check within the transformed parameters block that return either 1e-6 or the value after checking, but I still get issues.
  • I also added the compute_likelihood argument to try and sample from the prior only but that fails as well on the incidence parameter.
  • I tried to alter the incidence loop to count from 2 and set the first value of incidence to the first value for cases, but that resulted in the same error.

Something must be happening when calculating the incidence, but I don’t know how to see what’s happening.

// From https://mc-stan.org/users/documentation/case-studies/boarding_school_case_study.html
functions {
  real[] sir(real t, real[] y, real[] theta, 
             real[] x_r, int[] x_i) {

      real S = y[1];
      real I = y[2];
      real R = y[3];
      real N = x_i[1];
      
      real beta = theta[1];
      real gamma = theta[2];
      
      real dS_dt = -beta * I * S / N;
      real dI_dt =  beta * I * S / N - gamma * I;
      real dR_dt =  gamma * I;
      
      return {dS_dt, dI_dt, dR_dt};
  }
}
data {
  int<lower=1> n_days;
  real y0[3];
  real t0;
  real ts[n_days];
  int N;
  int cases[n_days];
  int compute_likelihood;
}
transformed data {
  real x_r[0];
  int x_i[1] = { N };
  
}
parameters {
  real<lower=0> gamma;
  real<lower=0> beta;
  real<lower=0> phi_inv;
  
}
transformed parameters{
  real y[n_days, 3];
  real<lower=0> phi = 1. / phi_inv;
  real<lower=0> incidence[n_days];
  real incidence_init;
  
   //S(t) - S(t + 1)
  //incidence[1] = cases[1];
  {
    real theta[2];
    theta[1] = beta;
    theta[2] = gamma;

    y = integrate_ode_rk45(sir, y0, t0, ts, theta, x_r, x_i);
  }
  
  //for (i in 2:(n_days-1)) {
  for (i in 1:(n_days-1)) {
  	
  	incidence_init = y[i, 1] - y[i + 1, 1];
  	// catch bad values 
  	if(incidence_init>=0){
  		incidence[i] = incidence_init;
  	} else {
  		incidence[i] = 0.0001;
  	}
  	
  
  }
  
}
model {
  //priors
  beta ~ normal(2, 1);
  gamma ~ normal(0.4, 0.5);
  phi_inv ~ exponential(5);
  //sampling distribution
  //col(matrix x, int n) - The n-th column of matrix x. Here the number of infected people 
  //cases ~ neg_binomial_2(col(to_matrix(y), 2), phi);
  if (compute_likelihood == 1)
  cases ~ neg_binomial_2(incidence, phi);
  
}
generated quantities {
  real R0 = beta / gamma;
  real recovery_time = 1 / gamma;
  real pred_cases[n_days];
  
  //pred_cases = neg_binomial_2_rng(col(to_matrix(y), 2), phi);
  
}


Errors:

Chain 2: Rejecting initial value:
Chain 2:   Error evaluating the log probability at the initial value.
Chain 2: Exception: validate transformed params: incidence[i_0__] is nan, but must be greater than or equal to 0  (in 'model1080a608eaa1d_sir' at line 44)

Chain 2: Initialization between (-2, 2) failed after 100 attempts. 
Chain 2:  Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.

1 Like

I thought too this was due to the ODE solver outputting negative y values too often, but in fact it’s an initialisation issue. You declare

real indidence[n_days]

and then assign

 for (i in 1:(n_days-1))
  	incidence[i] = y[i, 1] - y[i + 1, 1];

Thus you don’t initialise the last coefficient of incidence, which is nan by default. To solve your problem, you should declare incidence, cases and pred_cases to have size n_days - 1. When fitting the model, you should also change the data n_days fed to Stan to length(cases) + 1.

This will be clarified in the next version of the tutorial, thanks!

1 Like

There is a print statement in Stan (print(incidence);) which can be quite useful for debugging such issues.

@LeoGrin brilliant. I had just started down that path when I was just running the ODE that the final value was NaN. Thank you! Working properly now with the below! Thanks again @LeoGrin and @Elizaveta_Semenova

// From https://mc-stan.org/users/documentation/case-studies/boarding_school_case_study.html
functions {
  real[] sir(real t, real[] y, real[] theta, 
             real[] x_r, int[] x_i) {

      real S = y[1];
      real I = y[2];
      real R = y[3];
      real N = x_i[1];
      
      real beta = theta[1];
      real gamma = theta[2];
      
      real dS_dt = -beta * I * S / N;
      real dI_dt =  beta * I * S / N - gamma * I;
      real dR_dt =  gamma * I;
      
      return {dS_dt, dI_dt, dR_dt};
  }
}
data {
  int<lower=1> n_days;
  real y0[3];
  real t0;
  real ts[n_days];
  int N;
  int cases[n_days-1];
  int compute_likelihood;
}
transformed data {
  real x_r[0];
  int x_i[1] = { N };
  
}
parameters {
  real<lower=0> gamma;
  real<lower=0> beta;
  real<lower=0> phi_inv;
  
}
transformed parameters{
  real y[n_days, 3];
  real<lower=0> phi = 1. / phi_inv;
  real incidence[n_days-1];
  
   //S(t) - S(t + 1)
  //incidence[1] = cases[1];
  {
    real theta[2];
    theta[1] = beta;
    theta[2] = gamma;

    y = integrate_ode_rk45(sir, y0, t0, ts, theta, x_r, x_i);
  }
  
  //for (i in 2:(n_days-1)) {
   for (i in 1:(n_days-1))
  	incidence[i] = y[i, 1] - y[i + 1, 1];
 
  
}
model {
  //priors
  beta ~ normal(2, 1);
  gamma ~ normal(0.4, 0.5);
  phi_inv ~ exponential(5);
  //sampling distribution
  //col(matrix x, int n) - The n-th column of matrix x. Here the number of infected people 
  //cases ~ neg_binomial_2(col(to_matrix(y), 2), phi);
  //if (compute_likelihood == 1)
  cases ~ neg_binomial_2(incidence, phi);
  
}
generated quantities {
  real R0 = beta / gamma;
  real recovery_time = 1 / gamma;
  real pred_cases[n_days-1];
  
  pred_cases = neg_binomial_2_rng(incidence, phi);
  
}

2 Likes