n_eff and Rhat for tau_k[3] look strange. Not really a question: I can get around it and I don't need a
solution, although it may be an indicator that something internal needs checking.
Loading required package: ggplot2
Stackoverflow is a great place to get help: Newest 'ggplot2' Questions - Stack Overflow.
Loading required package: StanHeaders
rstan (Version 2.16.2, packaged: 2017-07-03 09:24:58 UTC, GitRev: 2e1f913d3ca3)
For execution on a local, multicore CPU with excess RAM we recommend calling
rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())
Ubuntu 16.04. gcc 5.4.0. default installation, no tweaks.
Yeah, that's true. yhat[1] and yhat[3] are both constants, but one produces a NaN Rhat and the other produces an Rhat < 1.0. The n_effs are different as well, even though both are constants close to each other.
The output of the first seems reasonable, but the second is weird. I do not know what's going on here.
I never managed to get the calculation for rhat and neff in PyStan to match the one in RStan exactly. It would be very, very nice to have these calculations done by Stan in C++ rather than in the interfaces. It bothers me that the results aren't the same across interfaces.
IIRC some of this code addresses a difference between R and Python: R doesn't complain about dividing a float by zero, whereas Python raises a ZeroDivisionError.
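A minimal sketch of that difference, assuming plain Python floats (no NumPy). The helper name `safe_ratio` is hypothetical and is not taken from PyStan; it only illustrates the kind of guard an interface needs if it wants R-like NaN/Inf results instead of an exception.

```python
import math


def safe_ratio(numerator: float, denominator: float) -> float:
    """Divide two floats, returning NaN/Inf the way R does instead of raising.

    Hypothetical helper for illustration only.
    """
    try:
        return numerator / denominator
    except ZeroDivisionError:
        # Python raises for any float division by zero, including 0.0 / 0.0.
        if numerator == 0.0 or math.isnan(numerator):
            return math.nan  # 0/0 (and NaN/0) are undefined
        # x/0 with x != 0: signed infinity, respecting a negative zero denominator
        sign = math.copysign(1.0, numerator) * math.copysign(1.0, denominator)
        return sign * math.inf


print(safe_ratio(0.0, 0.0))   # nan
print(safe_ratio(1.0, 0.0))   # inf
print(safe_ratio(-1.0, 0.0))  # -inf
```

Without such a guard, a zero within-chain variance would abort the summary in Python, while in R the same arithmetic quietly yields NaN.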
This seems to be the closest related post I could find (also possibly related: Output for quantities of interest that are actually constants). I have a general question about posterior samples of constants. Sometimes they yield NaN neffs and Rhats, while in the same model other constants have a small (but defined) neff and an Rhat of 1. The model converges fine, so this is not a critical issue (I assume). I'm just curious what brings about the inconsistent output.
E.g., in the following output, m_0 and S_0 are user-provided means and covariance matrices of bivariate Gaussians (for two categories each, indicated by the first array dimension), and bias is a user-provided simplex of prior probabilities (for the two categories):
...
m_0[1,1] 14.00 NaN 0.00 14.00 14.00 14.00 14.00 14.00 NaN NaN
m_0[1,2] 138.35 0.00 0.00 138.35 138.35 138.35 138.35 138.35 2 1
m_0[2,1] 61.00 NaN 0.00 61.00 61.00 61.00 61.00 61.00 NaN NaN
m_0[2,2] 157.91 0.00 0.00 157.91 157.91 157.91 157.91 157.91 2 1
S_0[1,1,1] 81.00 NaN 0.00 81.00 81.00 81.00 81.00 81.00 NaN NaN
S_0[1,1,2] 106.89 0.00 0.00 106.89 106.89 106.89 106.89 106.89 2 1
S_0[1,2,1] 106.89 0.00 0.00 106.89 106.89 106.89 106.89 106.89 2 1
S_0[1,2,2] 11658.50 0.00 0.00 11658.50 11658.50 11658.50 11658.50 11658.50 2 1
S_0[2,1,1] 6561.00 NaN 0.00 6561.00 6561.00 6561.00 6561.00 6561.00 NaN NaN
S_0[2,1,2] -291.27 0.00 0.00 -291.27 -291.27 -291.27 -291.27 -291.27 2 1
S_0[2,2,1] -291.27 0.00 0.00 -291.27 -291.27 -291.27 -291.27 -291.27 2 1
S_0[2,2,2] 399.11 0.00 0.00 399.11 399.11 399.11 399.11 399.11 2 1
bias[1] 0.50 NaN 0.00 0.50 0.50 0.50 0.50 0.50 NaN NaN
bias[2] 0.50 NaN 0.00 0.50 0.50 0.50 0.50 0.50 NaN NaN
...
I'm using rstan 2.21.2 (but this output inconsistency has been the case as long as I can remember). I wonder whether it is something that rstan might be able to catch? Thank you and sorry for reviving this old topic.
platform x86_64-apple-darwin17.0
arch x86_64
os darwin17.0
system x86_64, darwin17.0
status
major 4
minor 0.2
year 2020
month 06
day 22
svn rev 78730
language R
version.string R version 4.0.2 (2020-06-22)
nickname Taking Off Again
Can you provide an RData file of the values which produce 1? Also test that you get the same result after loading the values from the file you are going to send. I suspect a floating-point accuracy problem of the kind that Eigen's acov estimate was found to produce.
> library(tidyverse)
> x %>% filter(cat == 1, cue1 == 1) %>% summarise(m_0 = var(m_0))
`summarise()` regrouping output by 'cat', 'cue1' (override with `.groups` argument)
# A tibble: 2 x 4
# Groups: cat, cue1 [1]
cat cue1 cue2 m_0
<int> <int> <fct> <dbl>
1 1 1 VOT 0
2 1 1 f0 0
The R-hats are all slightly below 1 (0.9996666) before and after loading. Apologies if this doesn't address your suggestion (and thank you for your reply).
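One plausible mechanism for an Rhat slightly below 1, sketched with the simplified (non-split) Gelman-Rubin formula. This is an assumption-laden illustration, not RStan's actual code, which works on split chains and uses an autocovariance-based ESS. The function name `basic_rhat` is hypothetical. With within-chain variance W and between-chain variance B, Rhat = sqrt(((n-1)/n * W + B/n) / W). If W is exactly zero the ratio is 0/0, hence NaN; if rounding leaves W tiny but nonzero while all chain means agree (B = 0), the formula collapses to sqrt((n-1)/n), which is just under 1.

```python
import math


def basic_rhat(chains):
    """Simplified, non-split Rhat for equal-length chains (illustration only)."""
    m = len(chains)
    n = len(chains[0])
    chain_means = [sum(c) / n for c in chains]
    grand_mean = sum(chain_means) / m
    # W: average of the within-chain sample variances
    w = sum(
        sum((x - mu) ** 2 for x in c) / (n - 1)
        for c, mu in zip(chains, chain_means)
    ) / m
    # B: n times the sample variance of the chain means
    b = n * sum((mu - grand_mean) ** 2 for mu in chain_means) / (m - 1)
    if w == 0.0:
        # Division by a zero W is not meaningful; this sketch reports NaN
        return math.nan
    var_plus = (n - 1) / n * w + b / n
    return math.sqrt(var_plus / w)


# Exactly representable constant: W == 0, so the result is NaN
print(basic_rhat([[14.0] * 100 for _ in range(4)]))

# Identical chains with nonzero W: B == 0, so Rhat == sqrt((n - 1) / n)
print(basic_rhat([[1.0, 2.0, 3.0, 4.0] for _ in range(3)]))
```

For n = 1500 draws per chain, sqrt((n-1)/n) is about 0.99967, the same order as the value reported above, although the number of draws used there is not stated in this thread.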
The minimal set so that I can reproduce what you observe.
I haven't used tidybayes. How do you compute Rhat and ESS for this object? Is the output in your earlier post from tidybayes? Do you know which code it uses for the Rhat and ESS computation: its own, code from RStan, or code from the posterior package?
If you load the model object (from the second RData file I linked) and summarize it, it'll give you rhat, Neff, etc. You can also use rhat(model) once you have loaded it.
I only brought up tidybayes as one way to extract the posterior samples, and to confirm that they have zero variance (as they should, since these are variables that are constants).
Correct, @hhau (and thank you). This is a class that adds a few slots to the stanfit class, but these slots are independent of the issue I raised.
@avehtari, I didn't think of this issue (since I have the library that creates the new class, it didn't show on my side). But the issue with the Neff and rhats should still show up the way that hhau suggested. Does it for you? If not, I can recreate an example with just rstan.
Rhat = 1 and ESS equal to the sample size would be fine if we knew these were constants. However, what Stan returns doesn't contain information about which variables are truly constant and which are not. It's possible that for a truly non-constant variable we observe, just by chance, that all posterior draws are equal. In that case Rhat and ESS are not well defined. As we can't distinguish between these cases, it was decided that Rhat and ESS should be NA whenever all draws are equal. The new posterior package does this:
posterior::summarise_draws(posterior::as_draws(new_stanfit_obj))
...
6 m_0[1,1] 14 14 0 0 14 14 NA NA NA
7 m_0[2,1] 61 61 0 0 61 61 NA NA NA
8 m_0[1,2] 138. 138. 0 0 138. 138. NA NA NA
9 m_0[2,2] 158. 158. 0 0 158. 158. NA NA NA
...
CmdStanR is already using the posterior package. RStan probably will start using it at some point.
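The rule described above (report NA whenever all draws are equal) can be sketched as a guard around any diagnostic. This is only an illustration of the idea in Python; the names `all_equal` and `guarded` are hypothetical and this is not the posterior package's implementation.

```python
import math


def all_equal(draws):
    """True when every draw is identical to the first one."""
    first = draws[0]
    return all(x == first for x in draws)


def guarded(diagnostic, chains):
    """Return NaN if all draws across all chains are equal, else diagnostic(chains).

    Constant draws are treated as 'not well defined' because the summary
    cannot tell a true constant from a chain that happened not to move.
    """
    flat = [x for chain in chains for x in chain]
    if all_equal(flat):
        return math.nan
    return diagnostic(chains)


# Stand-in diagnostic used only to show the control flow
def dummy_diagnostic(chains):
    return 1.0


print(guarded(dummy_diagnostic, [[138.35] * 10 for _ in range(4)]))  # nan
print(guarded(dummy_diagnostic, [[1.0, 2.0], [1.0, 2.0]]))           # 1.0
```

Checking equality of the raw draws, rather than testing whether a computed variance is zero, avoids depending on how rounding error accumulates in the mean, so values such as 14.00 and 138.35 would be treated alike.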