Meaning of NaN for se_mean, n_eff and Rhat in Stan output

stan_beginer · January 8, 2021, 10:06pm

Hi,

Recently I tried to extract the output of my Stan program using the ‘summary()’ function in R, but I found that for one quantity in ‘Generated Quantities’ (prob_d, the probability of success in a Bernoulli trial) the se_mean, n_eff and Rhat are all NaN while the mean is still a number.

Then I checked the matrix of all simulated draws and found that the maximum draw for prob_d is 1.0000000 and the minimum is 1.898636e-17. The program ran without error but gave the following warning messages (4 chains with 5000 iterations each):

Warning messages:
1: In validityMethod(object) :
  The following variables have undefined values:  prob_d. Many subsequent functions will not work correctly.
2: There were 6381 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup 
3: There were 2 chains where the estimated Bayesian Fraction of Missing Information was low. See
http://mc-stan.org/misc/warnings.html#bfmi-low 
4: Examine the pairs() plot to diagnose sampling problems
 
5: The largest R-hat is NA, indicating chains have not mixed.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#r-hat 
6: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#bulk-ess 
7: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#tail-ess

I am wondering why this happened. Does it relate to the arithmetic precision in Stan?

Thx!

bbbales2 · January 9, 2021, 10:42pm

se_mean depends on n_eff, which depends on Rhat, so if Rhat is NaN then the rest can blow up too.

You can get a NaN Rhat and a finite mean if the parameter/transformed parameter/generated quantity stays constant during the whole sampling process (zero variance).
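To make the zero-variance case concrete, here is a minimal Python sketch of the classic Gelman-Rubin Rhat. This is an illustration only, not rstan's code: rstan's own computation differs in detail (for example it splits chains), and the helper names `ieee_div` and `classic_rhat` are made up for this example. Python raises an error on 0/0, so the helper mimics floating-point division, where 0/0 is NaN.

```python
import math

def ieee_div(a, b):
    """Divide like IEEE-754 floating point: 0/0 -> NaN, x/0 -> +/-inf."""
    if b == 0.0:
        if a == 0.0 or math.isnan(a):
            return math.nan
        return math.copysign(math.inf, a)
    return a / b

def classic_rhat(chains):
    """Classic (non-split) Gelman-Rubin Rhat for equal-length chains."""
    m = len(chains)
    n = len(chains[0])
    chain_means = [sum(c) / n for c in chains]
    grand_mean = sum(chain_means) / m
    # B: between-chain variance (scaled by n); W: mean within-chain variance.
    b = n / (m - 1) * sum((mu - grand_mean) ** 2 for mu in chain_means)
    w = sum(
        sum((x - mu) ** 2 for x in c) / (n - 1)
        for c, mu in zip(chains, chain_means)
    ) / m
    var_plus = (n - 1) / n * w + b / n
    return math.sqrt(ieee_div(var_plus, w))

# Four chains whose draws never move: the mean is fine, Rhat is 0/0.
constant_draws = [[1.0] * 100 for _ in range(4)]
pooled_mean = sum(sum(c) for c in constant_draws) / 400
rhat_constant = classic_rhat(constant_draws)
print(pooled_mean, rhat_constant)
```

With every draw equal, both W and B are zero, so the ratio inside the square root is 0/0 even though the mean is a perfectly ordinary number.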

The following variables have undefined values: prob_d.

That warning makes me think that some values of prob_d have not been assigned. In that case Stan assigns NaN by default. So something like this would produce a lot of NaNs:

generated quantities {
  real prob_d;  // declared but never assigned, so every draw is NaN
}

However in that case I would expect the mean to also be NaN.
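A quick way to see why: a single NaN draw is enough to make a plain average NaN. The Python sketch below uses made-up draws (the values are not from the original model), and counts how many draws are undefined, which is a cheap check to run on the extracted draws.

```python
import math

# Hypothetical draws for illustration; one of them is undefined.
draws = [0.2, 0.7, math.nan, 1.0]

# NaN propagates through the sum, so the plain mean is NaN.
mean_with_nan = sum(draws) / len(draws)

# Count the undefined draws, then average only the defined ones.
n_undefined = sum(math.isnan(x) for x in draws)
finite_draws = [x for x in draws if not math.isnan(x)]
mean_without_nan = sum(finite_draws) / len(finite_draws)

print(mean_with_nan, n_undefined, mean_without_nan)
```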

That is surprising. Maybe it is 1 or zero in each chain? Either way, there’s definitely something not quite right with this model given this behavior (and the very large number of divergences), so I wouldn’t read too much into the NaN Rhats – just make some parameter plots and start debugging.
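One way to check the "1 or zero in each chain" idea before plotting is a per-chain summary. The Python sketch below is only an illustration: the draws are made up, `summarize_chains` is a hypothetical helper, and it assumes the draws have already been split by chain and contain no NaNs.

```python
def summarize_chains(chains):
    """Return one dict per chain with min, max, mean and a constant flag."""
    rows = []
    for i, draws in enumerate(chains, start=1):
        lo, hi = min(draws), max(draws)
        rows.append({
            "chain": i,
            "min": lo,
            "max": hi,
            "mean": sum(draws) / len(draws),
            "constant": lo == hi,  # True if the chain never moved
        })
    return rows

# Made-up draws: two chains stuck at the boundaries, one that moves.
example = [
    [1.0] * 5,
    [0.0] * 5,
    [0.1, 0.4, 0.9, 0.3, 0.6],
]
summary = summarize_chains(example)
for row in summary:
    print(row)

# The pooled mean is still finite even though two chains are stuck.
pooled_mean = sum(sum(c) for c in example) / sum(len(c) for c in example)
```

If some chains show up as constant at 0 or 1 while the pooled mean looks reasonable, that points at stuck chains rather than at arithmetic precision.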

Edit: to be clear, I wouldn’t read too much into the NaN Rhats beyond the fact that the chains are not mixing. It is meaningful that the calculation blew up; you’ll just probably need to look at other things to really figure out what the problem is.

avehtari · January 20, 2021, 12:52pm

I’ll also add that future versions should not warn about NAs, although you may still see them in the diagnostic summaries.
