Diagnosing cause of ODE exception

Hello all,

I am trying to recreate someone else’s SEIR model in Stan for the purposes of verification and validation. As a first step, I want to generate test data from the model to put back into the original to see how well it recovers the parameter values. I have created SEIR ODE and populated (I think) the data. The parameters are being drawn from the priors in the generated quantities block. The model compiles but on running it generates the following exception and no output results:

Chain 1 Exception: ode_rk45: ode parameters and data[1] is nan, but must be finite! (in 'C:/Users/xconi/AppData/Local/Temp/Rtmp6nevLE/model-29825987421.stan', line 204, column 0 to line 218, column 6) 

If I understand the message, it is saying that the first parameter passed to the ODE function is NAN, but this should be from the real array ts that is passed to the ODE solver. This leads me to wonder a couple of things:

  1. How does one check to see if Stan is reading in the data correctly at runtime?
  2. What is the order of execution of things in the blocks, in particular the generated quantities block? Is it possible that the declaration of the ODE output vector y_hat coming before the population of some of the required quantities is causing a problem? I don’t think this should be so, but?
  3. Have I completely misunderstood something fundamental? PROBABLE CORRECT ANSWER

I am working with R 4.1.1 and CmdStan 2.27.0 via CmdStanR 0.4.0. The Stan model “prevalence_SEIR_pps_3.stan” and the calling R script are below.

I thank-you in advance for any help or insight you can give me.

Gregory Hunter


functions {
  int get_time_bin_index (
    real t,
    int n_bins,
    vector t_k
  ){
      
    int k;          // returned time bin index
    real sum_t_k;   // Current sum of times
    k = 1;
    sum_t_k = 0;
    for (i in 1:n_bins){
      sum_t_k += t_k[i];
      if (t <= sum_t_k) {
        k = i;
        break;
      };
    }
    return k;
  }
  
  real get_lambda_t (
    real t,         // time at which to evaluate lambda_t
    int n_bins,     // number of time bins
    vector t_k,     // vector of time bin *intervals* - sums to T, length n_bins
    real[] lambda_k,// vector of basic lambda values, length n_bins
    real lambda_0,  // lambda_0 - lambda at t == 0 
    real t_tr       // width of sigmoid 
  ){
    int k;          // time bin
    real s_k;       // sigmoid function - for clearness of code only
    real lambda_t;  // return value of function
    real lambda_kprev; // lambda_k-1
    
    k = get_time_bin_index(t, n_bins, t_k);
    
    s_k = 1 + exp(-4 * (t - t_k[k])/t_tr);
    
    if (k == 1) 
      lambda_kprev = lambda_0;
    else
      lambda_kprev = lambda_k[k-1];
    
    lambda_t = s_k * (lambda_k[k] - lambda_kprev) + lambda_k[k];
    
    return lambda_t;
  }
  
  vector ODE_func (
    real t,       // time
    vector y,     // system state {susceptible, exposed, infected, recovered}
    real alpha,   // infectiousness of asymptomatic infected
                  // [1/asymptomatic case/day] (relative to symptomatic)
    real gamma,   // Incubation rate [1/days]
    real mu,      // Recovery rate (?)
    real tau,     // symptomatic presentation rate [1/days]
    real p_asym,  // Proportion of cases that are asymptomatic
    real p_undet, // Proportion of symptomatic cases thare are undetected
    int n_bins,   // Number of time bins of lambda_k and t_k
    vector t_k,   // vector of time bin *intervals* - sums to T, length n_bins.
    real[] lambda_k, // vector of base lambda values, length n_bins
    real lambda_0,// lambda at time 0
    real t_tr,    // width of lambda transition sigmoid [days]
    real[] E_in   // Imported infections per day (from other regions)
    ) {
      vector[5] dy_dt;
      real S;       // Susceptible 
      real E;       // Exposed
      real IU;      // Infected, asymptomatic or undetected
      real IS;      // Infected, symptomatic
      real R;       // resolved + isolated 
      
      real N;       // Population = S + E + IA + IS + R
      real lambda_t;// spreading rate [1/days]
      real p_numdet;// Proportion of exposed that are symptomatic and detected

      // Extract system state
      S = y[1];
      E = y[2];
      IU = y[3];
      IS = y[4];
      R = y[5];

      // parameters
      lambda_t = get_lambda_t(t, n_bins, t_k, lambda_k, lambda_0, t_tr);

      // equations
      N = S + E + IU + IS + R;
      p_numdet = (1-p_asym)*(1-p_undet); // Proportion of exposed that are
      // symptomatic and detected
      
      // dS/dt 
      dy_dt[1] = -lambda_t * (alpha * IU + IS) * S / N;
      
      // dE/dt
      // dy_dt[2] = lambda_t * (alpha * IU + IS) * S / N - gamma * E + E_in;
      dy_dt[2] = lambda_t * (alpha * IU + IS) * S / N - gamma * E;
      
      // dIU/dt
      dy_dt[3] = gamma * E * (1-p_numdet) - mu * IU;
      
      // dIS/dt
      dy_dt[4] = gamma * E * p_numdet - tau * IS;
      
      // dR/dt
      dy_dt[5] = mu * IU + tau * IS;
      
      return dy_dt;
    }
}

//-----------------------------------------------------------------------------+
// Stan model proper starts here                                               |
//-----------------------------------------------------------------------------+

data {
  real rel_tol;     // control parameter for ode solver
  real abs_tol;     // control parameter for ode solver
  int max_iter;     // control parameter for ode solver
  int<lower = 1> n_obs;   // number of days observed
  int<lower = 1> n_bins;  // number of (approximately) 7 day time bins for 
                    // estimation of lambda_t from lambda_k
  // int<lower = 1> n_pop;   // population
  // int y[n_obs];  // data; total number of infected each day
  real t0;          // initial time point (zero)
  real ts[n_obs];   // time points observed

  vector[5] y_init;  // initial condition vector

  // data for priors
  real<lower = 0> log_mu_case_reporting_delay;
  real<lower = 0> log_sigma_case_reporting_delay;
  
  real<lower = 0> log_mu_weekly_intensity_variation;
  real<lower = 0> log_sigma_weekly_intensity_variation;
  
  real<lower = 0> mu_weekly_phase_variation;
  real<lower = 0> kappa_weekly_phase_variation;
}

generated quantities {

  // vector[2] y_hat; // solution from ODE solver

  real<lower=0> S[n_obs]; // Number of susceptibles over time
  real<lower = 0> E[n_obs];// Number of exposed over time
  real<lower = 0> IU[n_obs]; // Number of infected asymptomatics and undetected
  real<lower = 0> IS[n_obs]; // Number of infected symptomatics
  real<lower = 0> R[n_obs];// Number of resolved (recuperated, dead, isolated)
  
  real<lower = 0> E_in[n_obs];

  vector[n_obs] Cases_base;
  vector[n_obs] Cases_modulated;
  
  // priors
  
  simplex[n_bins] t_k_s;
  vector<lower = 0>[n_bins] t_k;
  vector[n_bins] alpha_k;
  real<lower = 0> t_tr = normal_rng(2, 0.5);
  real<lower = 0> lambda_0 = normal_rng(1, 0.3);
  real<lower = 0> lambda_k[n_bins];
  real<lower = 0, upper = 1> alpha = 0.5;
  real<lower = 0> gamma = lognormal_rng(log(1/5.4), 0.2);
  real<lower = 0> mu = lognormal_rng(log(1./10.), 0.2);
  real<lower = 0, upper = 1> tau = lognormal_rng(log(1./7.),0.2);
  real<lower = 0, upper = 1> p_asym = beta_rng(8, 4);
  real<lower = 0, upper = 1> p_undet = beta_rng(20, 5);
  real<lower = 0> weekly_variation_intensity = 
    lognormal_rng(log_mu_weekly_intensity_variation, 
                  log_sigma_weekly_intensity_variation);
  real phase_weekly = (3.5/pi()) * von_mises_rng(mu_weekly_phase_variation,
                                    kappa_weekly_phase_variation);
  int<lower = 0> case_reporting_delay = poisson_rng(2);


vector[5] y_hat[n_obs] = ode_rk45(ODE_func, y_init, t0, ts,
    alpha,   // infectiousness of asymptomatic infected
                  // [1/asymptomatic case/day] (relative to symptomatic)
    gamma,   // Incubation rate [1/days]
    mu,      // Recovery rate (?)
    tau,     // symptomatic presentation rate [1/days]
    p_asym,  // Proportion of cases that are asymptomatic
    p_undet, // Proportion of symptomatic cases thare are undetected
    n_bins,   // Number of time bins of lambda_k and t_k
    t_k,   // vector of time bin *intervals* - sums to T, length n_bins.
    lambda_k, // vector of base lambda values, length n_bins
    lambda_0,// lambda at time 0
    t_tr,    // width of lambda transition sigmoid [days]
    E_in   // Imported infections per day (from other regions)
    );

// Populate remaining input

  for (i in 1:n_obs){
    E_in[i]= lognormal_rng(log(2), log(2));
  }
  
  for (i in 1:n_bins) {
    alpha_k[i] = 10;
  };
  
  t_k_s = dirichlet_rng(alpha_k);
  t_k = n_obs * t_k_s;
  
  lambda_k[1] = normal_rng(lambda_0, 0.2);
  for (i in 2:n_bins){
    lambda_k[i] = normal_rng(lambda_k[i-1], 0.2);
  }

  for (i in 1:n_bins){
    lambda_k[i] = gamma_rng(64, 80);
  }

// Convert to observed cases

  for (t in 1:n_obs) {
    if (t > case_reporting_delay)
      Cases_base[t] = tau * IS[t - case_reporting_delay];
    else
      Cases_base[t] = 0;
    Cases_modulated[t]= Cases_base[t] * (1 - weekly_variation_intensity *
    fabs(sin(pi()*t/7) - 0.5 * phase_weekly) );
  }

}
}
****
```r
library(magrittr)
library(tidyverse)
library(here)
library(cmdstanr)

###############################################################################
# Prior predictive check                                                      #
###############################################################################

test_dat <- list(
  n_obs = 30, # number of days observed
  n_bins = ceiling(30/7), # Number of 7-day (ish) bins for lambda_k
  t0 = 1,     # initial time point (zero)
  ts = seq(from = 1, 
           to = 30, 
           by = 1),     # time points observed
  y_init = c(9994, 5, 1, 0 ,0),
  rel_tol = 1e-3,
  abs_tol = 1e-1,
  max_iter = 1e6,
  log_mu_case_reporting_delay = log(2), 
  log_sigma_case_reporting_delay = 0.2,
  log_mu_weekly_intensity_variation = log(1),
  log_sigma_weekly_intensity_variation = 2,
  mu_weekly_phase_variation = 0,
  kappa_weekly_phase_variation = 0.01
)

pps_file <- here('prevalence_SEIR_pps_3.stan')
pps_model <- cmdstan_model(pps_file)

pps <- pps_model$sample(data = test_dat,
                        chains = 1,
                        iter_warmup = 0,
                        iter_sampling = 10)

Could this be related to HOW I call the model? SHould I be using $sample or $generate_quantities? When Ii try the latter, it save that I need to pass fitted parameters, but I want to have they drawn in the model.

The easiest way to check is to use the print function to print the values of the parameters being passed to the function (or at other points of the model).

Additionally, you don’t have any parameters or priors in the model. When this is the case, you need to tell Stan that only the generated quantities will change at each iteration. To do this, you just need to add fixed_param=T to the $sample call.

Thanks for that. Both helped. I also realized that I was out of date in array declarations since Stan 2.26.

I have determined that I cannot declare the ODE in the generated quantities block with

  array[n_obs] vector[5]   y_hat = ode_bdf(ODE_func, y_init, t0, ts,
    alpha, ...

because the generated quantities block has not populated all of the necessary inputs. Instead , I have to call it afterwards.

 array[n_obs] vector[5]   y_hat;

...

  y_hat = ode_bdf(ODE_func, y_init, t0, ts,
    alpha, ...