Binomial_lpmf: Probability parameter[1] is 1, but must be in the interval [0, 1]

Ah, I see. The odd thing is that I have CmdStanR installed and up-to-date. The error message only occurs if I compile the stan code by clicking the Check button in RStudio. If I instead compile the code from an R script using the standard code (mod <- cmdstan_model(file)), it doesn’t throw an error…

That is strange. Those variables are being assigned from y (the output of the ODE solver). I wonder if the NaNs are coming from y? Or is there something weird about the assignments that we're not noticing?

Can you check if there are NaNs in y? There is an is_nan function if you want to do this programmatically: https://mc-stan.org/docs/2_25/functions-reference/logical-functions.html

Yep so I wanted to take the output from the integrator and select certain values to use in the likelihood function. This is something I was doing in the model recently, with no NaNs produced, so it’s very bizarre that they’re suddenly appearing.

Yep good idea. So it turns out if I print y at the first time step (t=1), the vector is comprised of 3 lots of [37 NaNs followed by 363 non-NaNs]. I feed in initial values (init) as I always have done, using the variable age_prop for the first 400 elements, and setting the remaining 800 elements to be 0 (you can see this in the model code I posted).

(With increasing time of integration, a larger proportion of the y array becomes NaN, likely because my ODEs posit that different states depend on each other)

If I adjust the initial values so that the first 400 elements are larger numbers (0.5 instead of values between ~8.5e-5 and ~0.002), then at t=1, the vector is comprised of 3 lots of [79 NaNs followed by 321 non-NaNs].
Perhaps for some reason it’s not liking the initial values provided?

Code for producing array containing elements equal to 1 if NaN, and equal to 0 if non-NaN (printing values at t=1):

    real na_y[t, K*agrps];
    for(i in 1:t){
      for(j in 1:(K*agrps)){
        na_y[i,j] = is_nan(y[i,j]);
      }
    }

    if(doprint==1) {
      print("na_y: ", na_y[1]);
    }

Sounds like somewhere in the ODE function we’re ending up with NaNs.

A couple things to try:

  1. Compute your ODE right hand side at the initial conditions and see if you have any NaNs

  2. Add checks in your ODE right hand side to search for the NaNs (if the initial condition doesn’t show you where they’re coming from)
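For (1), a quick way is to call the right-hand-side function directly, before the solver ever runs. A sketch (assuming the RHS function is called derivs and takes only the time and the state; add whatever extra arguments your actual signature has):

```stan
transformed data {
  // evaluate the ODE right-hand side once at the initial condition;
  // any NaNs produced by the RHS itself will show up here immediately
  vector[K * agrps] dydt0 = derivs(0.0, init);
  print("RHS at init: ", dydt0);
}
```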

Maybe define a helper function:

functions {
  int count_nans_vec(vector v) {
    int out = 0;
    for(i in 1:num_elements(v)) {
      if(is_nan(v[i])) {
        out += 1;
      }
    }
    return out;
  }
}

And then you can add checks more easily and maybe hunt down better what is happening:

if(count_nans_vec(to_vector(A)) > 0) {
  reject("Found NaNs in variable A: ", to_vector(A));
}

Thanks Ben you’ve been very helpful!

Do you mean to do this outside of Stan?

In this case, what is vector A representing?

Well, since we're looking for bugs in the Stan model, just do it in the Stan model. If you re-write this in R or something you might accidentally get rid of the bug in translation!

Just anything you want to check for NaNs.

For instance here’s an opportunity for a divide by zero (which would make a NaN):

for(i in 1:agrps){
  pI[i] = (I[i] + Im[i])/Na[i];
}

So maybe you do:

if(count_nans_vec(to_vector(pI)) > 0) {
  reject("Found NaNs in variable pI:", pI);
}

The vector type is just so you only have to write one NaN checking function.

Your nifty NaN checking function identifies NaNs in the state variables (most of which are in S), which explains NaNs in the ODE output.

The problem now is that I’m unclear how NaNs are appearing in the states in the first place. A few Qs that might help clarify:

  • Is it wrong to redefine the states y as different variables (in this case, S, I and Im) in the function that calculates derivatives? I find it much easier to read this way, but wondered if that interfered with any calculations being made.

  • When initial values are specified for ode_rk45_tol, does each element of the init vector take the corresponding element of the state vector y which, in turn, is fed into the function that calculates derivatives?

I’m unclear how there could be opportunities for NaNs to be produced even before the state variables are fed into any for loops or functions.

Thanks again Ben.

This looks correct to me:

vector[agrps] S = y[1:agrps];
vector[agrps] I = y[(agrps+1):(2*agrps)];
vector[agrps] Im = y[((2*agrps)+1):(3*agrps)];

Yeah y(t = 0) = init

Yeah me too it’s not clear what is happening. I wonder if the solver is failing somewhere? At this point can you simplify the problem as much as possible (so that you’re still getting NaNs) and post code for a reproducible example? In particular, it would be convenient if that 400 dimension was like 2 so it’s easier to debug lol.

You can write the data you supply to the Stan model to a JSON file with write_stan_json (see the cmdstanr documentation), so then you just need to attach a .json and a .stan file here.

This is because the RStudio check still uses RStan to check the syntax.


Ah this makes sense, thank you.

Hard to say, because if I run the MCMC it just throws errors disliking the initial value (presumably because of the NaNs):

Chain 1 Rejecting initial value:
Chain 1 Gradient evaluated at the initial value is not finite.
Chain 1 Stan can't start sampling from this initial value.

Thanks, yep I’ll give that a go!


Quick update:

I had one last stab at trying to figure out the source of the NaNs before I would do a reproducible example, and I’ve figured it out. They’re getting produced in the for loop as shown in the commented code:

      for(i in 1:(agrps-3)){
        if(i==1){
          dprev[i] = 0;
          seroconv1[i] = 0;
          seroconv2[i] = 0;
          seroconv3[i] = 0;
          ct1[i] = 0;
          ct2[i] = 0;
          ct3[i] = 0;
          matAb1[i] = 0;
          matAb2[i] = 0;
          matAb3[i] = 0;
          
        } else{
          dprev[i] = pI[i]-pI[i-1];                //change in prevalence (must be positive)
          seroconv1[i] = dprev[i]*c1[i];           //pregnant women seroconverting in trimester 1
          seroconv2[i] = dprev[i]*c2[i];           //pregnant women seroconverting in trimester 2
          seroconv3[i] = dprev[i]*c3[i];           //pregnant women seroconverting in trimester 3
          
          /*   !!  These generate the NaNs    !! */
          //generates 3 NaNs at indices 2:4
          ct1[i+3] = seroconv1[i]*mctr[1];         //likelihood of transmission trimester 1
          //generates 2 NaNs at indices 2:3
          ct2[i+2] = seroconv2[i]*mctr[2];         //likelihood of transmission trimester 2
          //generates 1 NaN at index 2
          ct3[i+1] = seroconv3[i]*mctr[3];         //likelihood of transmission trimester 3
          //generates 3 NaNs at indices 2:4
          matAb1[i+3] = seroconv1[i]*(1-mctr[1]);  //maternal Ab trimester 1
          //generates 2 NaNs at indices 2:3
          matAb2[i+2] = seroconv2[i]*(1-mctr[2]);  //maternal Ab trimester 2
          //generates 1 NaN at index 2
          matAb3[i+1] = seroconv3[i]*(1-mctr[3]);  //maternal Ab trimester 3
        }
      }

Is there a way to index with i+n in a Stan for-loop without generating NaNs? In R, indices that aren’t looped over stay 0 (if the vector is pre-allocated with zeros), but I suppose this isn’t the case in Stan?

Hi Ben, I wonder if you could help with something?

My model now returns no NaN values, but when I run it through a short MCMC chain (iter_warmup = 5 & iter_sampling = 10) it just runs eternally with no read-out (even though I’ve set refresh = 1 in mod$sample() options). Is this something that occurs regularly with other Stan users? I’ve had a browse on the forums but can’t seem to see any instances of my particular case: most seem to at least print the chain’s progress, even if it doesn’t move past 0% of warmup sampling.

Any help very much appreciated, as always.

Cheers
Greg

Oh sorry I didn’t get back to you on the other thing. I read it and then blanked on responding.

Any variable defined in the transformed data, transformed parameters, model, or generated quantities block will default to NaN unless you specify otherwise.

Does this explain the source of the NaNs? If you zero’d everything before using it would this give you the behavior you expect?
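For example, a variable can be zeroed right at its declaration with rep_vector (a sketch, using ct1 from your loop and assuming it has length agrps; adjust the type to whatever your actual declaration is):

```stan
vector[agrps] ct1 = rep_vector(0, agrps);   // every element starts at 0, not NaN
// or, if it's declared as a real array rather than a vector:
real ct1_arr[agrps] = rep_array(0.0, agrps);
```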

It’s possible. This could happen if your model is very slow to evaluate.

The way to approach this is probably some sort of workflow thing (https://statmodeling.stat.columbia.edu/2020/11/10/bayesian-workflow/).

Like, you want to write this model and evaluate it, but there are going to be problems (for instance, once the syntax is finally worked out and the model can run, it takes forever to run).

The reality is there’s probably no way to just write out a big complicated model and have it run :D. Gotta build up from smaller models.

How to build up from smaller models? Well you could try a simpler scientific model, or you could try to write a model without the ODE just cause they can be hard to program, or you could estimate fewer parameters cause that seems like a simpler statistical problem.

I suspect if you simplified your model to not have an ODE, it would run fast. And then how would you simplify the model and not remove the ODE?

You could hold all the parameters constant that go into the ODE and estimate the others. If that doesn’t work, then probably pull the ODE out and investigate it outside of Stan to make sure craziness isn’t happening, or ask on the forums.

If the model works with fixed parameters, try one non-fixed parameter. Etc. etc. Is that useful?
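As a concrete example of fixing an ODE parameter (a sketch; beta is a made-up name, substitute one of your actual ODE parameters): move its declaration from the parameters block to the data block, and pass the fixed value in from R alongside the rest of your data:

```stan
data {
  real beta;   // was declared in the parameters block; now supplied as fixed data
}
```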


Not quite — this variable is specified in the function block.

I can omit the NaNs from the variable because the first few indices (where the NaNs are) aren’t required for the model calculations, so behaviour here isn’t the issue; rather, I want a more elegant solution.
I tried doing the following to ‘zero’ indices of the variable where there are NaNs, after the relevant calculations:

for(i in 1:num_elements(v)) {
  if(is_nan(v[i])) {
    v[i] = 0;
  }
}

Is there a way to initialise a variable with zeros instead?

Thanks, this was simple but effective advice — I simplified my model and got it to fit, finally. I suspect the more complex model would fit but would just require a lot more time. Cluster, it is…

Still build up the model in pieces. Brute forcing these things is not good – cause they really can take a long long time.

Also, a lot of the time a hard-to-compute model is also one that has statistics problems. So the ticket is to add stuff until things get slow, and then try to figure out if there’s an interpretation, in terms of the application, for what is causing that slowness. I’m not saying that well, but check chapter 11 of https://statmodeling.stat.columbia.edu/2020/11/10/bayesian-workflow/.


Thanks Ben, you’ve been super helpful. I’ll check out the chapter.