Convergence within chains, but not across chains

scijens · March 18, 2021, 4:40pm

Hello everyone,

I am running a Hidden Markov Model in Stan where some covariates drive the transitions between the latent states. Running the model on larger samples, I am still having issues with poor convergence in most of my parameters (as indicated by Rhat; in another post, it was recommended to apply the non-centered parameterization, which unfortunately didn’t solve the problem).

Looking at the summary output of the 2 chains I was running (each with adapt_delta=0.90, max_depth=12, num_samples=2000), convergence appears to be quite poor:

[image: summary output for both chains combined]

Looking at the summary of each chain separately, it looks very different:

Chain 1:
[image: summary output for chain 1]

Chain 2:
[image: summary output for chain 2]

Now the traceplots of the 2 chains show what’s going on. Most of the parameters converge within a chain, but the two chains settle on “slightly” different values. The difference is real, but it does not change the substantive conclusions I want to draw from this model.
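To make the symptom concrete, here is a minimal sketch of the classic (non-split) Gelman-Rubin Rhat computed on simulated draws. The chain locations, spreads, and the helper name `rhat` are assumptions chosen purely for illustration, not values from the model above, and Stan's reported Rhat uses a split-chain variant of this idea.

```python
import random
import statistics


def rhat(chains):
    """Classic (non-split) Gelman-Rubin potential scale reduction factor."""
    n = len(chains[0])  # draws per chain, assumed equal across chains
    chain_means = [statistics.fmean(c) for c in chains]
    # W: average of the within-chain variances
    w = statistics.fmean([statistics.variance(c) for c in chains])
    # B: between-chain variance, scaled by the number of draws
    b = n * statistics.variance(chain_means)
    # Pooled variance estimate that mixes within- and between-chain parts
    var_plus = (n - 1) / n * w + b / n
    return (var_plus / w) ** 0.5


rng = random.Random(1)
# Two chains that each look stationary but sit at different locations
chain_1 = [rng.gauss(0.00, 0.05) for _ in range(2000)]
chain_2 = [rng.gauss(0.30, 0.05) for _ in range(2000)]
# A third chain drawn from the same distribution as chain_1
chain_3 = [rng.gauss(0.00, 0.05) for _ in range(2000)]

rhat_stuck = rhat([chain_1, chain_2])  # chains disagree on location
rhat_mixed = rhat([chain_1, chain_3])  # chains agree

print(f"stuck chains: Rhat = {rhat_stuck:.2f}")
print(f"mixed chains: Rhat = {rhat_mixed:.3f}")
```

Even a small offset between chain means inflates Rhat when it is large relative to the within-chain spread, which is why each chain can look fine on its own while the combined summary does not.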

My questions are:

Am I correct in assuming that running this model again with adjustments to the computational parameters (iterations, adapt_delta, etc.) would do no good at all?
Is the convergence problem really as big as it seems if it doesn’t change the insights I want to generate with the model?
Any other recommendations on what I should try?

mike-lawrence · March 18, 2021, 5:38pm

You have encountered a common pathology whereby the posterior is multimodal and chains get “stuck” exploring only one mode. Often this occurs when two or more parameters in the model are “non-identified”, meaning an increase in one can be offset by a decrease in the other to yield the same likelihood as if neither had changed. Take a look at the pairs plots of the posterior samples; non-identified parameters will show a strong correlation.
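A toy sketch of non-identification, assuming a hypothetical model in which the data mean is `a + b` with known noise scale and independent zero-mean Normal priors on `a` and `b` (none of this comes from the HMM in question). The likelihood is unchanged when `a` goes up and `b` goes down by the same amount, and the resulting posterior correlation between the two is close to -1.

```python
import math


def log_lik(a, b, ys, sigma=1.0):
    """Gaussian log-likelihood in which only the sum a + b enters the mean."""
    mu = a + b
    return sum(
        -0.5 * math.log(2 * math.pi * sigma**2) - 0.5 * ((y - mu) / sigma) ** 2
        for y in ys
    )


def posterior_corr(n_obs, prior_sd, sigma=1.0):
    """Posterior correlation of (a, b) under independent N(0, prior_sd^2) priors.

    The posterior precision is p*I + q*[[1, 1], [1, 1]] with p = 1/prior_sd^2
    and q = n_obs/sigma^2; inverting that 2x2 matrix gives corr = -q / (p + q).
    """
    p = 1.0 / prior_sd**2
    q = n_obs / sigma**2
    return -q / (p + q)


ys = [1.8, 2.3, 2.1, 1.9, 2.4]  # made-up observations
ll_base = log_lik(1.0, 1.0, ys)
ll_shifted = log_lik(4.0, -2.0, ys)  # a up by 3, b down by 3

print(ll_base == ll_shifted)
print(round(posterior_corr(len(ys), prior_sd=10.0), 3))
```

In this toy case only the prior separates `a` from `b`, so the pairs plot would show a narrow diagonal ridge rather than a round cloud.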

scijens · March 18, 2021, 8:03pm

Thank you. I have a follow-up question: Let’s take these two parameters mu and nu (state-dependent intercepts of two equations in the 2-state model) as an example:

Is it necessarily problematic that mu[1] is correlated with mu[2], and nu[1] with nu[2]? In my specific example it would only mean that if the intercept of state 1 is higher in a given equation, the intercept of state 2 is higher, too.

Or is the correlation between two parameters an issue per se?

Funko_Unko · March 20, 2021, 12:02am

I’ve been wondering the same thing, and I don’t know whether I have heard a good answer yet.

I guess the simplest example would be just a two-parameter problem where prior and posterior are multivariate Gaussians, but the prior looks like a circle, while the posterior looks like an extreme ellipse.

But even then, the posterior may either have shrunk considerably in all directions, which I guess would make the correlation unproblematic, or it may have only contracted in one direction, which would tell you that you have no information whatsoever about the other direction?
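The two cases can be told apart by comparing the posterior variance with the prior variance along each principal axis of the posterior. The sketch below assumes an isotropic prior with variance 1 (the "circle") and two made-up posterior covariance matrices; the function names are hypothetical.

```python
import math


def eigenvalues_2x2(cov):
    """Eigenvalues (small, large) of a symmetric 2x2 covariance matrix."""
    (a, b), (_, d) = cov
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return mean - radius, mean + radius


def contraction(prior_var, post_cov):
    """1 - posterior/prior variance along each principal axis of the posterior.

    Values near 1 mean the data were informative in that direction;
    values near 0 mean the posterior there is essentially the prior.
    Only meaningful as written for an isotropic prior.
    """
    small, large = eigenvalues_2x2(post_cov)
    return 1 - small / prior_var, 1 - large / prior_var


def correlation(cov):
    (a, b), (_, d) = cov
    return b / math.sqrt(a * d)


# Case A: strongly correlated, but tight in every direction
post_a = [[0.010, 0.009], [0.009, 0.010]]
# Case B: strongly correlated, and one direction is as wide as the prior
post_b = [[0.50, -0.49], [-0.49, 0.50]]

for name, cov in (("A", post_a), ("B", post_b)):
    best, worst = contraction(1.0, cov)
    print(name, round(correlation(cov), 2), round(best, 3), round(worst, 3))
```

Both posteriors are elongated ellipses, yet only case B leaves a direction where the data added almost nothing beyond the prior, so the correlation coefficient alone does not distinguish them.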
