Simple intercept only hierarchical model with two groups: deadly slow, poor convergence

betanalpha · October 1, 2021, 6:45pm

Perhaps the most important consequence of true metastability is that the parameter space will decompose into non-overlapping “basins” of attraction (typically the neighborhoods surrounding each mode). Markov chains initialized in a basin will be confined to that basin, exploring the truncated posterior within, for many, many iterations. Often so many that you never see a Markov chain transition from one basin to another.
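To make the confinement concrete, here is a minimal sketch on a toy target. Everything in it is an assumption for illustration: the target is a two-component normal mixture with modes at ±6, the sampler is a plain random-walk Metropolis rather than Stan's Hamiltonian Monte Carlo, and the initial values are hand-picked. Each chain settles into the basin nearest its initialization and, in a run of this length, should not be expected to cross the low-density region between the modes.

```python
import numpy as np

def log_density(x, mode=6.0):
    # Hypothetical toy target: equal-weight mixture of N(-mode, 1) and
    # N(+mode, 1), up to an additive constant.
    return np.logaddexp(-0.5 * (x + mode) ** 2, -0.5 * (x - mode) ** 2)

def random_walk_metropolis(x0, n_iter, step, rng):
    # Plain random-walk Metropolis, used here only as a simple stand-in sampler.
    x = x0
    lp = log_density(x)
    draws = np.empty(n_iter)
    for i in range(n_iter):
        proposal = x + step * rng.normal()
        lp_prop = log_density(proposal)
        if np.log(rng.uniform()) < lp_prop - lp:
            x, lp = proposal, lp_prop
        draws[i] = x
    return draws

rng = np.random.default_rng(1)
inits = [-8.0, -4.0, 4.0, 8.0]  # hand-picked, dispersed initial values
chains = np.array([random_walk_metropolis(x0, 2000, 1.0, rng) for x0 in inits])

# The basins are separated at x = 0, so a sign change between consecutive
# draws marks a transition from one basin to the other.
crossings = [int((np.diff(np.sign(c)) != 0).sum()) for c in chains]
print("basin transitions per chain:", crossings)
print("per-chain means:", chains.mean(axis=1).round(2))
```

Counting sign changes works only because this toy target puts the boundary between basins at zero; in a real posterior the basins are generally unknown.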

In this case each Markov chain should look like it’s exploring reasonably well, with fuzzy trace plots and reasonably small empirical autocorrelations. Markov chains initialized within the same basin should converge to the same exploration, and Rhat, or equivalent diagnostics, run just on those chains shouldn’t indicate any problems. Markov chains initialized in different basins, however, should converge to different explorations that trigger Rhat problems.
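A rough numerical illustration of that behavior, under loud assumptions: the function below is the classic (non-split, non-rank-normalized) potential scale reduction factor, which is simpler than the split Rhat that Stan reports, and the "chains" are independent normal draws standing in for well-mixing chains confined to two hypothetical basins centered at -3 and +3.

```python
import numpy as np

def rhat(chains):
    # Classic potential scale reduction factor (a simplification of the
    # split, rank-normalized Rhat). chains has shape (n_chains, n_draws).
    chains = np.asarray(chains, dtype=float)
    n = chains.shape[1]
    within = chains.var(axis=1, ddof=1).mean()      # W: mean within-chain variance
    between = n * chains.mean(axis=1).var(ddof=1)   # B: between-chain variance
    var_plus = (n - 1) / n * within + between / n
    return float(np.sqrt(var_plus / within))

rng = np.random.default_rng(0)
n_draws = 1000
# Stand-ins for chains stuck in two hypothetical basins at -3 and +3.
left = rng.normal(-3.0, 1.0, size=(2, n_draws))
right = rng.normal(3.0, 1.0, size=(2, n_draws))

rhat_same = rhat(left)                        # chains from one basin only
rhat_mixed = rhat(np.vstack([left, right]))   # chains from both basins
print("Rhat, same basin:", round(rhat_same, 3))
print("Rhat, different basins:", round(rhat_mixed, 3))
```

The within-basin value should sit near 1 while the mixed value should be far above any usual threshold, because the between-chain variance is dominated by the gap between the basins.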

Because we usually don’t know where the basins of attraction actually are, we can’t control these circumstances. All we can do in practice is

1. Run multiple Markov chains initialized as diffusely as possible. The breadth of the prior usually, but not always, is a useful target.
2. Verify that each Markov chain has adapted to a reasonable sampler configuration and that the trace plots show reasonable exploration from iteration to iteration.
3. Then check whether the Markov chains are exploring different areas, for example visually with trace plots or with Rhat for any functional output of interest.
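For the second check (per-chain exploration), one numerical complement to looking at trace plots is the empirical autocorrelation at a few lags. The sketch below is only an illustration: the two "chains" are synthetic AR(1) series with made-up coefficients (0.3 for a chain that moves freely, 0.99 for one that is nearly stuck), not output from any real fit, and `autocorr` and `ar1` are hypothetical helper names.

```python
import numpy as np

def autocorr(x, lag):
    # Empirical autocorrelation of a single chain at a positive lag.
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))

def ar1(phi, n, rng):
    # Synthetic AR(1) series used as a stand-in for a Markov chain's draws.
    x = np.empty(n)
    x[0] = rng.normal()
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal()
    return x

rng = np.random.default_rng(2)
healthy = ar1(0.3, 2000, rng)   # assumed well-mixing chain
sticky = ar1(0.99, 2000, rng)   # assumed poorly exploring chain

for name, chain in [("healthy", healthy), ("sticky", sticky)]:
    print(name, [round(autocorr(chain, k), 2) for k in (1, 5, 20)])
```

Autocorrelations that decay toward zero within a handful of lags are consistent with the fuzzy trace plots described above; values that stay close to one suggest the chain is barely moving, in which case comparing chains with Rhat is premature.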

In this case (2) was suspicious so we couldn’t jump to (3).

Separating out the variance of study and species with normal population models and a normal observational model will always yield a unimodal posterior distribution, I believe. The common issue here is not a discrete degeneracy (multiple modes which cause metastable Markov chains) but rather a continuous degeneracy, which is diagnosed and moderated in different ways. See for example the discussion in Identity Crisis.
