Binomial_lpmf: Probability parameter[1] is 1, but must be in the interval [0, 1]

Unless I am misunderstanding how Stan works, I was under the impression that cases[i] is actually a series of integers (length agrps) , as is n[i]? This is how it is defined in R. It is also defined in Stan as:

int<lower=0>cases[agrps];
int<lower=0>n[agrps];

I have now changed the likelihood function to sample estimates from comp_pI at the max value of time (t)

target += binomial_lpmf(cases[i] | n[i], comp_pI[t,i]);

This runs and gives reasonably model output when printed but for fitting throws the error:

There were 5 divergent transitions after warmup

int<lower=0>cases[agrps];
int<lower=0>n[agrps];

cases and n are both length agrps arrays of integers. cases[i] and n[i] are the ith entry in each array (so they’re both integers)

target += binomial_lpmf(cases[i] | n[i], comp_pI[t,i]);

Yeah that looks reasonable.

With a few divergences, try bumping up adapt delta to like 0.95 and see if they go away. It means the sampler will use a smaller timestep in the internal integrator which will be slower but can make these divergences disappear.

Divergences mean there’s something a bit numerically difficult about the posterior. If they go away with a little adapt delta it’s not worth worrying about, but if they don’t disappear then there’s something going on that needs investigated.

cases and n are both length agrps arrays of integers. cases[i] and n[i] are the ith entry in each array (so they’re both integers)

OK thanks, as I thought. The likelihood function samples comp_pI at corresponding indexes too. So that looks all fine?

With a few divergences, try bumping up adapt delta to like 0.95 and see if they go away. It means the sampler will use a smaller timestep in the internal integrator which will be slower but can make these divergences disappear.

Yeah so I have already increased adapt delta but still getting divergences errors. Could this just be due to the fact that the model function & parameters don’t do a good job of estimating the data? This an over-simplified version of the full R model that I’m gradually translating into Stan.

Thanks again, Ben.

Yeah sometimes it can be that a more complicated model samples better than a simpler one. It would be a fair thing to try the more complicated model.

You could also try tightening up the prior on lambda0.

Thanks Ben. I went for this strategy: I’ve slowly introduced more complexity into the model and it’s working well, with an exception.

My x_r array needs to include real numbers of different lengths (some 1, others 400). I’ve had little success with just listing them as an array in the transformed data block, like x_r[n] = {variable1, variable2, variable3}. Is there a way Stan can accept an x_r array containing variables of different lengths?

There is the option of creating a different variable for each of the 400 elements of the array, e.g.:

Instead of real x[400] I could define real x1, real x2, real x3...real x400. But ideally wouldn’t want to do this. Is there a more elegant solution?

Cheers.

Good to hear!

Not yet. It’s something we want, but it’s not in the language yet.

The way to do this is if you have arrays of different length, concatenate them all into one big array, and then have another array of start indices. For instance:

real big_array[N]; // N total values
int starts[S + 1]; // Start index of S arrays (and the last index is N + 1)

big_array[starts[1]:(starts[2] - 1)] // first array
big_array[starts[s]:(starts[s + 1] - 1)] // sth array

More doc is here: https://mc-stan.org/docs/2_24/stan-users-guide/ragged-data-structs-section.html

Ah amazing, that’s just the trick. Just to clarify, is this all done within the data block?

And would this all be defined within the same block, e.g. within a for loop, and then big_array could be saved to x_r? Apologies if daft question.

Yeah usually you’d build the big_array and starts array outside in something like R or Python (where you are allowed to have ragged arrays – which is the name for arrays of arrays with different lengths) and pass it in through the data block.

Ooo, you’re using this in the ODE solver. We recently changed the interface on the ODE solver stuff to make it easier to pass things in (so you don’t have to pack everything into x_r and x_i any more).

There’s some info on that here: https://mc-stan.org/users/documentation/case-studies/convert_odes.html

That isn’t available in rstan yet, but it is available in cmdstanr: https://mc-stan.org/cmdstanr/index.html

That’s very useful to know, thank you. Unfortunately, in translating my model to the new ODE interface, I have encountered an issue:

The new ode_rk45_tol function isn’t recognising my ODE function, which is now defined as

functions{
  vector mod_si(
    real t, vector y, ...){

and called as

  y = ode_rk45_tol(
    mod_si,      //model fucntion
    init,    //vector initial values
    t0,      //real initial time
    ts,      //real[] times
    1.0E-10, //real rel_tol 
    1.0E-10, //real abs_tol 
    1.0E3, //int max_num_steps
    lambda0,  //pass parameters directly into model
    agrps, K, //pass int data directly into model
    r, da, mctr, d, propfert //pass real data directly into model
    );  

When I compile the model I get the following error message:

SYNTAX ERROR, MESSAGE(S) FROM PARSER:
Variable “mod_si” does not exist.
error in ‘model6bc72ba6fdfb_stan_mod_simple’ at line 180, column 10

Huh strange. Can you post the model and I’ll try to get it to compile?

Thanks Ben. The error message is only thrown when I click ‘Check’ in R studio in the stan file. When I compile (using cmdstan_model function) & fit the model in my R script, I get reasonable looking output except some nan values. Specifically, while I should be getting vectors of length 400 returned and filled with values, I get 181 elements of nan, followed by 219 elements containing (reasonable)numbers. I’m unsure why this is happening. The full Stan model code is below. Thanks again!

functions{
  vector mod_si(
    real t, vector y,  //y is a vector in new ODE interface 
    real lambda0,  //pass parameters directly into model
    // real lambda1,
    //real gradient, real shape, 
    int agrps, int K, //pass int data directly into model
    // vector age, 
    real r, real da, real[] mctr, real[] d, real[] propfert //pass real data directly into model
    ) {
      
      //define variables to calculate within model 
      real deaths[agrps];
      vector[agrps] Na;
      real births_age[agrps];
      real births;
      real dprev[agrps];
      real seroconv1[agrps];
      real seroconv2[agrps];
      real seroconv3[agrps];
      real c1[agrps];
      real c2[agrps];
      real c3[agrps];
      real ct1[agrps];
      real ct2[agrps];
      real ct3[agrps];
      real matAb1[agrps];
      real matAb2[agrps];
      real matAb3[agrps];
      vector[agrps] pI;
      real matAbt;
      real ctt;
      
      //define derivative length
      vector[(agrps*K)] dydt;
      
      // define states
      vector[agrps] S = y[1:agrps];
      vector[agrps] I = y[(agrps+1):(2*agrps)];
      vector[agrps] Im = y[((2*agrps)+1):(3*agrps)];
      
      //define foi
      real foi[agrps];
      foi = rep_array(lambda0, agrps);
      // for(i in 1:agrps){
        //foi[i] = (lambda0 + lambda1*(age[i]^2) * (age[i] * exp(-gradient*age[i])))*shape;
      //   foi[i] = lambda0 + lambda1*(pow(age[i], 2));
      // }
      
      //total modelled population size
      for(i in 1:agrps){
        Na[i] = S[i] + I[i] + Im[i];
      }
      
      //age-specific total deaths
      for(i in 1:agrps){
        deaths[i] = d[i] * Na[i];
      }
      
      //age-specific total births
      for(i in 1:agrps){
        births_age[i] = deaths[i] * propfert[i];
      }
      
      //total births (all ages)
      births = sum(births_age);
      
      // conception distribution
      //move age back 3, 6 or 9 mo to calculate conception distribution for 3 trimesters
      //e.g. c3 = conceived ~9 months ago (more accurately, 7.5 months ago)
      for(i in 1:(agrps-3)){
        c1[i] = births_age[i+1];
        c2[i] = births_age[i+2];
        c3[i] = births_age[i+3];
      }
      
      //seroprevalence
      for(i in 1:agrps){
        pI[i] = (I[i] + Im[i])/Na[i];
      }
      
      //calculating seroconversions in pregnancy and cases of congenital disease
      for(i in 1:(agrps-3)){
        if(i==1){
          dprev[i] = 0;
          seroconv1[i] = 0;
          seroconv2[i] = 0;
          seroconv3[i] = 0;
          ct1[i] = 0;
          ct2[i] = 0;
          ct3[i] = 0;
          matAb1[i] = 0;
          matAb2[i] = 0;
          matAb3[i] = 0;
          
        } else {
          dprev[i] = pI[i]-pI[i-1];                //change in prevalence (must be positive)
          seroconv1[i] = dprev[i]*c1[i];           //pregnant women seroconverting in trimester 1
          seroconv2[i] = dprev[i]*c2[i];           //pregnant women seroconverting in trimester 2
          seroconv3[i] = dprev[i]*c3[i];           //pregnant women seroconverting in trimester 3
          ct1[i+3] = seroconv1[i]*mctr[1];         //likelihood of transmission trimester 1
          ct2[i+2] = seroconv2[i]*mctr[2];         //likelihood of transmission trimester 2
          ct3[i+1] = seroconv3[i]*mctr[3];         //likelihood of transmission trimester 3
          matAb1[i+3] = seroconv1[i]*(1-mctr[1]);  //maternal Ab trimester 1
          matAb2[i+2] = seroconv2[i]*(1-mctr[2]);  //maternal Ab trimester 2
          matAb3[i+1] = seroconv3[i]*(1-mctr[3]);  //maternal Ab trimester 3
        }
      }
      
      //total number of antibody positive and congenitally diseased births
      matAbt = sum(matAb1) + sum(matAb2) + sum(matAb3);
      ctt = sum(ct1) + sum(ct2) + sum(ct3);
      
      //model ODEs
      for(i in 1:agrps){
        if(i==1){
          //S
          dydt[i] = (births - matAbt - ctt) + r*Im[i] - foi[i]*S[i] - d[i]*S[i] - da*S[i];
          //I
          dydt[agrps+i] = ctt + foi[i]*(Na[i]-I[i]) - d[i]*I[i] - da*I[i];
          //Im
          dydt[2*agrps+i] = matAbt - (foi[i] + r + d[i] + da) * Im[i];
          
        } else if(i>1){
          //S
          dydt[i] = da*S[i-1] + r*Im[i] - foi[i]*S[i] - d[i]*S[i] - da*S[i];
          //I
          dydt[agrps+i] = da*I[i-1] + foi[i]*(Na[i]-I[i]) - d[i]*I[i] - da*I[i];
          //Im
          dydt[2*agrps+i] = da*Im[i-1] - (foi[i] + r + d[i] + da) * Im[i];
        } else{
          //S
          dydt[i] = da*S[i-1] + r*Im[i] - foi[i] * S[i] - d[i]*S[i];
          //I
          dydt[agrps+i] = da*I[i-1] + foi[i]*(Na[i]-I[i]) - d[i]*I[i];
          //Im
          dydt[2*agrps+i] = da*Im[i-1] - (foi[i] + r + d[i]) * Im[i];
        }
      }
      return(dydt);
    }
}

data {
  //for model calculations
  int K; //no. state variables
  int agrps;  //no. age groups
  vector[agrps] age_prop;  //proportion of population in each age group
  int tot_pop;  //total population size
  
  vector[agrps] age;
  real r;
  real da;
  real mctr[3];
  real propfert[agrps];
  real d[agrps];
  
  //data to fit
  int<lower=0>cases[agrps];  //no. positive cases
  int<lower=0>n[agrps];  //denominator for each age group

  //simulation
  real t0;  //starting time
  int t;  //no. years to run model
  real ts[t];  //time bins
  real rel_tol;
  real abs_tol;
  int max_num_steps;
  int inference;  //simulate w/o data (inference==0) or with data (inference==1)
  int doprint;
  
  //formatting ode results
  int data_agrps;
  int data_rows[data_agrps*K];
}

// transformed data {
//   real x_r[2+agrps*2] = {r_array};   //r, da, d, propfert
//   int x_i[2] = {agrps, K};   // agrps, K (no. state variables), S_r (no. real arrays), starts_r
// }

//parameters accepted by the model [that we want values for]
parameters {
  real<lower=0, upper=0.2>lambda0;
  real<lower=0, upper=0.2>lambda1;
  // real<lower=0, upper=0.2>gradient;
  // real<lower=0, upper=0.01>shape;
}

transformed parameters {
  // change of format for integrate_ode_rk45
  vector[agrps*K] init;  //initial values
  
  vector[agrps*K] y[t];   //raw ODE outputs
  vector[agrps] comp_S;
  vector[agrps] comp_I;
  vector[agrps] comp_Im;
  vector[agrps] comp_pI;
  
  for(i in 1:agrps){
    init[i] = age_prop[i];  //proportion in S0
    init[agrps+i] = 0;      //proportion in I0
    init[2*agrps+i] = 0;    //proportion in Im0
  }
  
  //run solver
  y = ode_rk45_tol(
    mod_si,  //model function
    init,    //vector initial values
    t0,      //real initial time
    ts,      //real[] times
    rel_tol,
    abs_tol,
    max_num_steps,
    lambda0,  //pass parameters directly into model
    // lambda1, gradient, shape, 
    // age, 
    agrps, K, //pass int data directly into model
    r, da, mctr, d, propfert //pass real data directly into model
    );  
    
    //extract and format ODE results
    comp_S  = (to_vector(y[t,1:agrps])) * tot_pop;  //total no. (age-stratified) in S
    comp_I  = (to_vector(y[t,(agrps+1):(2*agrps)])) * tot_pop;  //total no. (age-stratified) in I
    comp_Im = (to_vector(y[t,(2*agrps+1):(3*agrps)])) * tot_pop;   //total no. (age-stratified) in Im
    
    //extract particular age groups specified in the data
    // comp_S  = (to_vector(y[t,data_rows[1:16]])) * tot_pop;  //total no. (age-stratified) in S
    // comp_I  = (to_vector(y[t,data_rows[17:32]])) * tot_pop;  //total no. (age-stratified) in I
    // comp_Im = (to_vector(y[t,data_rows[33:48]])) * tot_pop;   //total no. (age-stratified) in Im
    
    //compute seroprevalence
    for(j in 1:agrps){
      comp_pI[j] = (comp_I[j] + comp_Im[j]) / (comp_S[j] + comp_I[j] + comp_Im[j]);
    }
}

model {
  lambda0 ~ lognormal(log(.1), .01);
  // lambda1 ~ lognormal(log(.1), .01);
  // gradient ~ lognormal(log(.1), .01);
  // shape ~ lognormal(log(.1), .01);
  
  //debug
  if(doprint==1) {
    print("lambda0: ", lambda0);
    print("comp_S: ", comp_S);
    print("comp_I: ", comp_I);
    print("comp_Im: ", comp_Im);
    print("comp_pI: ", comp_pI);
  }
  
  // likelihood 
  if (inference==1) {
    for(i in 1:agrps) { 
    target += binomial_lpmf(cases[i] | n[i], comp_pI[i]);
    }
  }
}

This is the rather confusing error message you get in Stan versions that don’t support the new ODE interface. You need to install CmdStanR because the latest RStan is a bit behind.

Which variable?

Stan saves variables in the parameters, transformed parameters, and generated quantities blocks. NaN parameters would be unusual, and you don’t have a generated quantities block, so I’m guessing this is something in transformed parameters.

The lingo with transformed parameters is that by default all the variables there are initialized to NaN, so seeing a bunch of NaNs on the output makes me think that one of the transformed parameters isn’t having anything written to it. But then looking at the code it seems to me like they should all be initialized. But I’m probably overlooking something so what variable is it that has NaNs?

Apologies, I wasn’t too clear in my last message. The variables with some NaNs are comp_S, comp_I, comp_Im and comp_pI, which I print in the transformed parameters block using this code:

  if(doprint==1) {
    print("lambda0: ", lambda0);
    print("comp_S: ", comp_S);
    print("comp_I: ", comp_I);
    print("comp_Im: ", comp_Im);
    print("comp_pI: ", comp_pI);
  }

I used this approach with RStan and didn’t get these NaNs. I would put this code in generated quantities, except that I need comp_pI for my likelihood function.

Ah, I see. The odd thing is that I have CmdStanR installed and up-to-date. The error message only occurs if I compile the stan code by clicking the Check button in RStudio. If I instead compile the code from an R script using the standard code (mod <- cmdstan_model(file)), it doesn’t throw an error…

That is strange. Those variables are being assigned from y (the output of the ODE solver). I wonder if the NaNs are coming from y? Or there is something weird about the assignments we are not noticing?

Can you check if there are NaNs in y? There is an is_nan function if you want to do this programmatically: https://mc-stan.org/docs/2_25/functions-reference/logical-functions.html

Yep so I wanted to take the output from the integrator and select certain values to use in the likelihood function. This is something I was doing in the model recently, with no NaNs produced, so it’s very bizarre that they’re suddenly appearing.

Yep good idea. So it turns out if I print y at the first time step (t=1), the vector is comprised of 3 lots of [37 NaNs followed by 363 non-NaNs]. I feed in initial values (init) as I always have done, using the variable age_prop for the first 400 elements, and setting the remaining 800 elements to be 0 (you can see this in the model code I posted).

(With increasing time of integration, a larger proportion of the y array becomes NaN, likely because my ODEs posit that different states depend on each other)

If I adjust the initial values so that the first 400 elements are larger numbers (0.5 instead of between ~0.002 and 8.5 \cdot 10^{-5}), then at t=1, the vector is comprised of 3 lots of [79 NaNs followed by 321 non-NaNs].
Perhaps for some reason it’s not liking the initial values provided?

Code for producing array containing elements equal to 1 if NaN, and equal to 0 if non-NaN (printing values at t=1):

    real na_y[t, K*agrps];
    for(i in 1:t){
      for(j in 1:(K*agrps)){
        na_y[i,j] = is_nan(y[i,j]);
      }
    }

if(doprint==1) {
    print("na_y: ", na_y[1, ]);
  }

Sounds like somewhere in the ODE function we’re ending up with NaNs.

A couple things to try:

  1. Compute your ODE right hand side at the initial conditions and see if you have any NaNs

  2. Add checks in your ODE right hand side to search for the NaNs (if the initial condition doesn’t show you where they’re coming from)

Maybe define a helper function:

functions {
  int count_nans_vec(vector v) {
    int out = 0;
    for(i in 1:cols(v)) {
      if(is_nan(v[i])) {
        out += 1;
      }
    }
    return out;
  }
}

And then you can add checks more easily and maybe hunt down better what is happening:

if(count_nans_vec(to_vector(A) > 0) {
  reject("Found NaNs in variable A:", to_vector(A));
}

Thanks Ben you’ve been very helpful!

Do you mean to do this outside of Stan?

In this case , what is vector A representing?

Well since we’re looking for bugs in the Stan model, so just do it in the Stan model. Might re-write this in R or something and accidentally get rid of the bug in translation!

Just anything you want to check for NaNs.

For instance here’s an opportunity for a divide by zero (which would make a NaN):

for(i in 1:agrps){
  pI[i] = (I[i] + Im[i])/Na[i];
}

SO maybe you do:

if(count_nans_vec(to_vector(pI)) > 0) {
  reject("Found NaNs in variable pI:", pI);
}

The vector type is just so you only have to write one NaN checking function.