Meaning of NaN for se_mean, n_eff and Rhat in Stan output

Hi,

Recently I tried to extract the output of my Stan program using the ‘summary()’ function in R, but I found that for one quantity in ‘Generated Quantities’ (prob_d, the probability of success in a Bernoulli trial), se_mean, n_eff and Rhat are all NaN while the mean is still a number.

Then I checked the matrix of all simulated draws and found that the maximum draw for prob_d is 1.0000000 and the minimum is 1.898636e-17. The program ran without error but gave the following warning messages (4 chains with 5000 iterations each):

Warning messages:
1: In validityMethod(object) :
  The following variables have undefined values:  prob_d. Many subsequent functions will not work correctly.
2: There were 6381 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup 
3: There were 2 chains where the estimated Bayesian Fraction of Missing Information was low. See
http://mc-stan.org/misc/warnings.html#bfmi-low 
4: Examine the pairs() plot to diagnose sampling problems
 
5: The largest R-hat is NA, indicating chains have not mixed.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#r-hat 
6: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#bulk-ess 
7: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#tail-ess 

I am wondering why this happened. Does it relate to arithmetic precision in Stan?

Thx!

se_mean depends on n_eff, which depends on Rhat, so if Rhat is NaN then the rest can blow up too.

You can get a NaN Rhat and a finite mean if the parameter/transformed parameter/generated quantity stays constant during the whole sampling process (zero variance).
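To make the zero-variance case concrete, here is a minimal sketch of the textbook (non-split) Rhat in plain Python. It is only an illustration under stated assumptions: rstan's own implementation differs in detail, the helper names (`basic_rhat`, `ieee_divide`, `se_mean`) are made up for this example, and `se_mean` is taken to be sd / sqrt(n_eff).

```python
import math

def sample_variance(xs):
    """Unbiased sample variance (n - 1 denominator)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def ieee_divide(a, b):
    """Divide the way R or C++ would: 0/0 gives NaN, x/0 gives +/-Inf.
    Plain Python raises ZeroDivisionError instead, so mimic IEEE 754 here."""
    if b == 0.0:
        if a == 0.0 or math.isnan(a):
            return math.nan
        return math.copysign(math.inf, a)
    return a / b

def basic_rhat(chains):
    """Textbook potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    chain_means = [sum(c) / n for c in chains]
    w = sum(sample_variance(c) for c in chains) / len(chains)  # within-chain variance
    b = n * sample_variance(chain_means)                       # between-chain variance
    var_plus = (n - 1) / n * w + b / n
    return math.sqrt(ieee_divide(var_plus, w))

def se_mean(sd, n_eff):
    """Monte Carlo standard error of the mean; NaN n_eff propagates to NaN."""
    return sd / math.sqrt(n_eff)

# Hypothetical draws: every chain sits at exactly the same value.
stuck = [[1.0] * 5 for _ in range(4)]
stuck_mean = sum(sum(c) for c in stuck) / 20   # finite
rhat_stuck = basic_rhat(stuck)                 # 0/0 inside, so NaN
se_stuck = se_mean(0.0, math.nan)              # NaN once n_eff is NaN

# Hypothetical draws that do move around, for contrast.
mixing = [[0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1]]
rhat_mixing = basic_rhat(mixing)               # finite

print(stuck_mean, rhat_stuck, se_stuck, rhat_mixing)
```

With constant draws both the within-chain and between-chain variances are zero, so the ratio inside the square root is 0/0 while the mean itself is perfectly well defined.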

The following variables have undefined values: prob_d.

That warning makes me think that some values of prob_d have not been assigned. In that case Stan assigns NaN by default. So something like this would produce a lot of NaNs:

generated quantities {
  real prob_d;  // declared but never assigned, so every draw is NaN
}

However, in that case I would expect the mean to be NaN as well.
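A quick way to see why: a single NaN draw is enough to turn the mean into NaN. The snippet below is a small Python illustration with made-up draws (not output from the model in question). It counts the NaN draws and compares the mean with and without them.

```python
import math

# Hypothetical draws; the middle one stands in for a value that was never assigned.
draws = [0.25, float("nan"), 0.75]

n_nan = sum(math.isnan(x) for x in draws)          # how many draws are NaN
mean_all = sum(draws) / len(draws)                 # NaN: one NaN poisons the sum
defined = [x for x in draws if not math.isnan(x)]  # keep only the defined draws
mean_defined = sum(defined) / len(defined)

print(n_nan, mean_all, mean_defined)
```

Since the reported mean, minimum and maximum of prob_d are all finite, the saved draws themselves look defined, which is what makes the warning puzzling.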

That is surprising. Maybe it is 1 or zero in each chain? Either way, there’s definitely something not quite right with this model given this behavior (and the very large number of divergences), so I wouldn’t read too much into the NaN Rhats – just make some parameter plots and start debugging.
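One way to check whether a quantity is stuck within each chain is to summarise the chains separately rather than pooling them. Below is a rough Python sketch on made-up draws; the function name `chain_summary` and the data are hypothetical. In rstan the per-chain draws could be pulled out with extract(fit, permuted = FALSE), which returns an iterations x chains x parameters array.

```python
def chain_summary(chain):
    """Min, max and mean of one chain, plus a flag for 'never moved'."""
    lo, hi = min(chain), max(chain)
    return {
        "min": lo,
        "max": hi,
        "mean": sum(chain) / len(chain),
        "constant": lo == hi,  # True when every draw in the chain is identical
    }

# Hypothetical per-chain draws (assumed free of NaN): two stuck chains, one moving.
chains = {
    "chain_1": [1.0, 1.0, 1.0, 1.0],
    "chain_2": [0.0, 0.0, 0.0, 0.0],
    "chain_3": [0.2, 0.9, 0.4, 0.7],
}

summaries = {name: chain_summary(c) for name, c in chains.items()}
stuck_chains = [name for name, s in summaries.items() if s["constant"]]

for name, s in summaries.items():
    print(name, s)
print("stuck:", stuck_chains)
```

Pooled over all chains these draws would show a minimum of 0 and a maximum of 1, which hides the fact that two of the chains never moved.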

Edit: to be clear, I wouldn’t read too much into the NaN Rhats beyond the fact that the chains are not mixing. It is meaningful that the calculation blew up, but you’ll probably need to look at other things to really figure out what the problem is.

I’ll also add that future versions should not warn about NAs, although you may still see them in the diagnostic summaries.