Tail_ESS questions

I have a few questions about tail_ESS and the appropriate information to take from it. I do want to thank the authors of the rank-normalization paper (https://arxiv.org/pdf/1903.08008.pdf) as it was very helpful.

I am working through diagnostics on a Bernoulli model with a number of group-level and population-level effects, using weakly- / reasonably-informative priors, and wanted to ask a few general questions to help me work through diagnostics (and figure out which diagnostics to work through).

First, I have noticed that with population-level (i.e., “fixed”) effects for my Bernoulli models, the Bulk_ESS virtually always outnumbers the Tail_ESS, sometimes substantially. On the other hand, for the varying intercept group-level (i.e., “random”) effects, their reported Tail_ESS virtually always outnumbers the Bulk_ESS, often substantially. Is this generally what one would expect for a positive-constrained parameter, or is that inherently concerning in some way?
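For reference, here is a rough sketch of what I understand the two numbers to measure, following the paper: bulk-ESS is ESS computed on rank-normalized draws, and tail-ESS is the minimum of the ESS for the 5% and 95% quantile indicators. This is a simplified single-chain version in plain Python, not the Stan or `posterior` implementation; the helper names, the crude truncation rule, and the simulated AR(1) chain are all assumptions made for illustration.

```python
import random
import statistics


def ess_single_chain(x):
    """Crude single-chain ESS: n / (1 + 2 * sum of autocorrelations).

    The sum is truncated as soon as two adjacent autocorrelations sum to
    a negative value (a simplification, not the estimator used by Stan).
    """
    n = len(x)
    mean = statistics.fmean(x)
    centered = [v - mean for v in x]
    var = sum(c * c for c in centered) / n

    def rho(lag):
        cov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        return cov / var

    total, lag = 0.0, 1
    while lag + 1 < n:
        pair = rho(lag) + rho(lag + 1)
        if pair < 0:
            break
        total += pair
        lag += 2
    return n / (1 + 2 * total)


def rank_normalize(x):
    """Replace draws by normal scores of their ranks (ties ignored)."""
    n = len(x)
    normal = statistics.NormalDist()
    order = sorted(range(n), key=lambda i: x[i])
    z = [0.0] * n
    for rank, i in enumerate(order, start=1):
        z[i] = normal.inv_cdf((rank - 0.375) / (n + 0.25))
    return z


def bulk_ess(x):
    return ess_single_chain(rank_normalize(x))


def tail_ess(x):
    s = sorted(x)
    q05 = s[int(0.05 * len(x))]
    q95 = s[int(0.95 * len(x))]
    lower = [1.0 if v <= q05 else 0.0 for v in x]
    upper = [1.0 if v <= q95 else 0.0 for v in x]
    return min(ess_single_chain(lower), ess_single_chain(upper))


# Simulated AR(1) draws standing in for one chain of one parameter.
rng = random.Random(2024)
n_draws = 2000
chain = [0.0]
for _ in range(n_draws - 1):
    chain.append(0.7 * chain[-1] + rng.gauss(0.0, 1.0))

bulk = bulk_ess(chain)
tail = tail_ess(chain)
print(f"bulk ESS ~ {bulk:.0f}, tail ESS ~ {tail:.0f}")
```

Since the two quantities summarize different functions of the draws, nothing in this sketch forces one to be larger than the other.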

Second, the paper cited above recommends 4+ chains to assess mixing reliably. CmdStan, however, now allows something like 2 chains with 2+ threads each. I often find that running 2 chains with multiple threads is the faster option, including for hierarchical models. Does the availability of within-chain multi-threading alter the standard recommendation of using at least 4 chains? In other words, does the fact that fewer chains can be broken into multiple shards provide any additional robustness that might make using fewer than 4 chains acceptable for diagnostic purposes?

Third, and finally, Aki has some cool plotting functions in the online appendix to this article (https://avehtari.github.io/rhat_ess/rhat_ess.html). They appear on GitHub, but I haven’t seen them elsewhere. Have those functions or equivalents been incorporated into any of the various MCMC plotting packages? I know an effort was made to adopt some of the rank-normalization plotting displays in bayesplot and elsewhere. I am particularly interested in diagnostics that would show information similar to the plot_quantile_ess, plot_change_ess, and especially the mcmc_hist_r_scale functions.

Many thanks.

Threading doesn’t help with the multiple-chain diagnostics, so 4 chains give better diagnostics than 2 chains. The number of threads used doesn’t matter.
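To make that concrete, here is a minimal sketch of a between-chain diagnostic in plain Python. It is the classical split-R-hat, not the rank-normalized version from the paper, and `split_rhat`, `fake_chain`, and the simulated draws are made up for illustration rather than taken from Stan. The only input is the list of chains, so within-chain threads never enter the calculation; they only change how fast each chain produces its draws.

```python
import random
import statistics


def split_rhat(chains):
    """Classical split-R-hat for equal-length chains (lists of floats)."""
    # Split each chain in half so that within-chain drift is also detected.
    halves = []
    for draws in chains:
        half = len(draws) // 2
        halves.append(draws[:half])
        halves.append(draws[half:2 * half])

    n = len(halves[0])
    means = [statistics.fmean(h) for h in halves]
    within = statistics.fmean([statistics.variance(h) for h in halves])
    between = n * statistics.variance(means)
    var_plus = (n - 1) / n * within + between / n
    return (var_plus / within) ** 0.5


rng = random.Random(1)


def fake_chain(mean, n=1000):
    # Independent normal draws standing in for one chain's output.
    return [rng.gauss(mean, 1.0) for _ in range(n)]


# Four chains that agree, and four chains where one sits somewhere else.
agree = [fake_chain(0.0) for _ in range(4)]
disagree = [fake_chain(0.0), fake_chain(0.0), fake_chain(0.0), fake_chain(3.0)]

rhat_agree = split_rhat(agree)
rhat_disagree = split_rhat(disagree)
print(f"agreeing chains:    R-hat ~ {rhat_agree:.3f}")
print(f"one chain off:      R-hat ~ {rhat_disagree:.3f}")
```

With only 2 chains there are fewer independent starting points to compare, so a chain stuck in the wrong place is less likely to be caught, however many threads each chain uses.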

I’ll leave the rest to @avehtari !


@bbbales2 I was pretty sure that was going to be the answer re number of chains, but thought I would give it a shot. Thx.

