Perhaps the most important consequence of true metastability is that the parameter space decomposes into non-overlapping “basins” of attraction (typically the neighborhoods surrounding each mode). Markov chains initialized in a basin will be confined to that basin, exploring the truncated posterior within, for many, many iterations, often so many that you never see a Markov chain transition from one basin to another.
In this case each Markov chain should look like it’s exploring reasonably well, with fuzzy trace plots and reasonably small empirical autocorrelations. Markov chains initialized within the same basin should converge to the same exploration, and Rhat, or equivalent diagnostics, run just on those chains shouldn’t indicate any problems. Markov chains initialized in different basins, however, should converge to different explorations that trigger Rhat problems.
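To make that behavior concrete, here is a minimal toy sketch, not anything from the model under discussion. It assumes a hypothetical one-dimensional target with two well-separated normal modes, a hand-rolled random walk Metropolis sampler (not Stan's sampler), and the basic non-split Rhat formula rather than the rank-normalized split version that Stan reports. All function and variable names are made up for illustration.

```python
import numpy as np

def log_target(x):
    # Assumed toy target: equal mixture of N(-10, 1) and N(+10, 1),
    # so the two basins of attraction are far apart.
    return np.logaddexp(-0.5 * (x + 10.0) ** 2, -0.5 * (x - 10.0) ** 2)

def random_walk_metropolis(x0, n_iter, step, rng):
    # Plain random walk Metropolis; returns the full chain of states.
    draws = np.empty(n_iter)
    x, lp = x0, log_target(x0)
    for i in range(n_iter):
        prop = x + step * rng.normal()
        lp_prop = log_target(prop)
        if np.log(rng.uniform()) < lp_prop - lp:
            x, lp = prop, lp_prop
        draws[i] = x
    return draws

def rhat(chains):
    # Basic (non-split) potential scale reduction factor
    # for an array of shape (n_chains, n_iter).
    chains = np.asarray(chains)
    n = chains.shape[1]
    within = chains.var(axis=1, ddof=1).mean()
    between = n * chains.mean(axis=1).var(ddof=1)
    var_plus = (n - 1) / n * within + between / n
    return np.sqrt(var_plus / within)

rng = np.random.default_rng(1)
inits = [-11.0, -9.0, 9.0, 11.0]  # two chains start in each basin
chains = np.array(
    [random_walk_metropolis(x0, 2000, 1.0, rng)[500:] for x0 in inits]
)

rhat_left = rhat(chains[:2])   # both chains in the left basin
rhat_right = rhat(chains[2:])  # both chains in the right basin
rhat_all = rhat(chains)        # chains from both basins together

print(f"Rhat, left basin only:  {rhat_left:.3f}")
print(f"Rhat, right basin only: {rhat_right:.3f}")
print(f"Rhat, all four chains:  {rhat_all:.3f}")
```

Under these assumptions each chain looks healthy on its own and the within-basin Rhat values should sit near 1, while the value computed across all four chains should be far larger, because no chain ever crosses between the modes.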
Because we usually don’t know where the basins of attraction actually are, we can’t control these circumstances. All we can do in practice is:
1. Run multiple Markov chains initialized as diffusely as possible. The breadth of the prior usually, but not always, is a useful target.
2. Verify that each Markov chain has adapted to a reasonable sampler configuration and that the trace plots show reasonable exploration from iteration to iteration.
3. Then check whether the Markov chains are exploring different areas, for example visually with trace plots or with Rhat for any functional output of interest.
In this case (2) was suspicious so we couldn’t jump to (3).
Separating out the variance of study and species with normal population models and a normal observational model will always yield a unimodal posterior distribution, I believe. The common issue here is not a discrete degeneracy (multiple modes which cause metastable Markov chains) but rather a continuous degeneracy, which is diagnosed and moderated in different ways. See for example the discussion in Identity Crisis.