Multi-modal posteriors, step size and tree depth

saudiwin · February 11, 2022, 6:49am

Hi all -

As you may know, I maintain an R package, idealstan (GitHub - saudiwin/idealstan), that fits time-varying latent variable models with Stan: item-response theory (IRT) ideal-point estimation for binary, ordinal, count and continuous responses, with time-varying and missing-data inference. So I've seen a lot of identifiability problems. However, I'm now picking up something I hadn't noticed before: chains converging to modes with different log-posterior values. For example, I just ran a model with two chains that ended up with the step sizes/treedepth values in the attached files.

As can be seen, one chain converged to a more likely mode with a lower treedepth and higher stepsize. The other chain found a less likely mode that required longer transitions.

Here's the question: when dealing with this kind of multimodality (i.e., the model is invariant to rotations), does it make sense to always select the chain with the higher log-likelihood, especially if its treedepth is lower? It would seem that in this case one chain is clearly superior to the other.

betanalpha · February 14, 2022, 3:15am

No. The value of the posterior density function at a mode does not determine the importance of that mode. What matters is the total posterior probability that concentrates around/within the mode.

Unfortunately that typically cannot be accurately estimated using posterior samples unless the Markov chains are able to transition between the modes sufficiently often. When each Markov chain is restricted to one mode there is no information to determine the relative contributions of the two modes.
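To make the height-versus-mass distinction concrete, here is a minimal sketch in Python. The target is a hypothetical one-dimensional mixture of two Gaussians (the weights, locations and scales are made up for illustration and have nothing to do with the idealstan model). It compares the density at each mode with the probability mass on each side of x = 0.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# Hypothetical two-mode target: a narrow, tall mode with little weight
# and a wide, low mode with most of the weight.
W_NARROW, MU_NARROW, SD_NARROW = 0.2, -3.0, 0.1
W_WIDE, MU_WIDE, SD_WIDE = 0.8, 3.0, 2.0

def density(x):
    """Mixture density at x."""
    return (W_NARROW * normal_pdf(x, MU_NARROW, SD_NARROW)
            + W_WIDE * normal_pdf(x, MU_WIDE, SD_WIDE))

def mass_below(x):
    """Mixture probability of the region (-inf, x]."""
    return (W_NARROW * normal_cdf(x, MU_NARROW, SD_NARROW)
            + W_WIDE * normal_cdf(x, MU_WIDE, SD_WIDE))

# Density height at each mode.
height_narrow = density(MU_NARROW)
height_wide = density(MU_WIDE)

# Probability mass on each side of x = 0, used here as a rough split
# between the neighbourhoods of the two modes.
mass_narrow = mass_below(0.0)
mass_wide = 1.0 - mass_narrow

print(f"height at narrow mode: {height_narrow:.3f}, mass below 0: {mass_narrow:.3f}")
print(f"height at wide mode:   {height_wide:.3f}, mass above 0: {mass_wide:.3f}")
```

In this toy target the narrow mode has the higher density value, yet most of the probability sits around the wide mode. A chain stuck in the narrow mode would report the better log density while describing the less important part of the distribution.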

Note also that the number of Markov chains that converge to each mode isn't informative either, as that depends too much on the initialization (I briefly discuss this in Simple intercept only hierarchical model with two groups: deadly slow, poor convergence - #13 by betanalpha).
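The dependence on initialization can be illustrated with a small simulation. This is only a sketch under stated assumptions: the target is a hypothetical two-component Gaussian mixture whose true masses are 0.3 (left) and 0.7 (right), and a plain random-walk Metropolis sampler stands in for Stan's NUTS sampler. All names and settings below are invented for the example.

```python
import math
import random

# Hypothetical well-separated target: 30% of the mass on the left, 70% on the right.
W_LEFT, W_RIGHT = 0.3, 0.7
MU_LEFT, MU_RIGHT, SD = -4.0, 4.0, 0.5

def log_density(x):
    """Unnormalized log density of the mixture, via log-sum-exp for stability."""
    a = math.log(W_LEFT) - 0.5 * ((x - MU_LEFT) / SD) ** 2
    b = math.log(W_RIGHT) - 0.5 * ((x - MU_RIGHT) / SD) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def run_chain(x0, rng, n_steps=500, step=0.2):
    """Random-walk Metropolis (a stand-in for NUTS); returns the final state."""
    x, lp = x0, log_density(x0)
    for _ in range(n_steps):
        prop = x + rng.gauss(0.0, step)
        lp_prop = log_density(prop)
        # Accept with probability min(1, p(prop) / p(x)).
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            x, lp = prop, lp_prop
    return x

def fraction_left(init_low, init_high, n_chains=8, seed=1):
    """Fraction of chains that end in the left mode (x < 0)."""
    rng = random.Random(seed)
    finals = [run_chain(rng.uniform(init_low, init_high), rng)
              for _ in range(n_chains)]
    return sum(x < 0.0 for x in finals) / n_chains

# Same target, two different initialization ranges.
frac_left_inits = fraction_left(-6.0, -1.0)
frac_right_inits = fraction_left(1.0, 6.0)

print("fraction of chains in left mode, inits in [-6, -1]:", frac_left_inits)
print("fraction of chains in left mode, inits in [1, 6]:  ", frac_right_inits)
```

Because the chains essentially never cross the low-density region between the modes, the share of chains in each mode is set by where the chains started, not by the 0.3 / 0.7 split of posterior mass.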

saudiwin · February 14, 2022, 5:40am

Thanks for the feedback Michael!
