Non-convergent iterations between convergent iterations

My model produces the following trace plot.
When sampling behaves like this, we could obtain a convergent MCMC chain by taking a subsequence of it.
Does this model make sense, or is it meaningless?

[Image: mcmc trace plot]

Hello,

This pattern can emerge if problems occur in other parameters. Perhaps a simple example can show this:

[Image: eight_school_example]

In the eight schools example, the mu parameter on the right seems to get stuck in the middle at around iteration 200. But the problem actually occurs in the tau parameter, which gets stuck as it approaches zero. To find which parameter is causing the problem, you could look at something like the effective sample size divided by the total number of iterations for each parameter. The parameters with the lowest values would be good to look at first.
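As a rough illustration of ranking parameters by effective sample size over total iterations, here is a minimal sketch in plain Python. It uses a crude single-chain estimate (one over one plus twice the sum of the leading positive autocorrelations), which is not the estimator Stan reports, and the draws are synthetic stand-ins: the names `mu` and `tau` are only borrowed from the example above, and `ess_ratio` is a hypothetical helper.

```python
import random


def autocorr(x, lag):
    """Sample autocorrelation of the sequence x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    if var == 0:
        return 1.0  # a completely stuck chain is treated as fully correlated
    cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag)) / n
    return cov / var


def ess_ratio(x, max_lag=200):
    """Crude ESS / N: 1 / (1 + 2 * sum of leading positive autocorrelations)."""
    total = 0.0
    for lag in range(1, min(max_lag, len(x) - 1)):
        rho = autocorr(x, lag)
        if rho <= 0:
            break  # stop at the first non-positive autocorrelation
        total += rho
    return 1.0 / (1.0 + 2.0 * total)


# Synthetic draws (assumption): one well-mixed parameter, one sticky one.
rng = random.Random(1)
well_mixed = [rng.gauss(0, 1) for _ in range(1000)]
sticky = [0.0]
for _ in range(999):
    sticky.append(0.98 * sticky[-1] + rng.gauss(0, 0.2))

draws = {"mu": well_mixed, "tau": sticky}
ratios = {name: ess_ratio(x) for name, x in draws.items()}

# Lowest ratio first: these are the parameters to inspect first.
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
    print(f"{name}: ESS/N ~ {ratio:.3f}")
```

With real output you would take the ESS values from your Stan summary instead of recomputing them; the point is only the ranking step.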

Hope this helps.

Best, Duco

Thank you for your reply.

I understand that in your example, the sampler does not mix well when tau is close to zero.
I need priors, but so far I cannot find a suitable one.

Convergence is a property of the chain as a whole, not of individual iterations. That chain clearly has not converged, and it will not. Better priors aren’t going to fix that (unless you have been using really bad ones). You need to reparameterize.
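To give a feel for what a reparameterization can look like, here is a minimal sketch of the non-centered form often used for hierarchical normal models such as eight schools. This is an assumption about a different, simpler model than the one in this thread, and the values of `mu` and `tau` are hypothetical. It only checks that both forms describe the same distribution; the practical difference is that the sampler works with `eta`, whose scale does not depend on `tau`.

```python
import random
import statistics

rng = random.Random(42)
mu, tau = 1.0, 0.1  # hypothetical location and scale, for illustration only
n_draws = 20000

# Centered form: theta is drawn directly, so its scale depends on tau.
centered = [rng.gauss(mu, tau) for _ in range(n_draws)]

# Non-centered form: draw a standardized eta, then shift and scale it.
eta = [rng.gauss(0.0, 1.0) for _ in range(n_draws)]
non_centered = [mu + tau * e for e in eta]

for label, draws in (("centered", centered), ("non-centered", non_centered)):
    print(label, round(statistics.mean(draws), 3), round(statistics.stdev(draws), 3))
```

Whether a transformation of this kind helps depends entirely on the form of the model.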

Thank you.

My model uses two distributions for two random variables H and F:

H \sim \text{Binomial}(p(\theta), N)

F \sim \text{Poisson}(q(\theta))

where p(\theta) and q(\theta) are positive functions of the model parameter \theta.
In this case, what would the reparameterization be?


If I use only the first part or only the last part of the MCMC samples in the trace plot above, the chain converges according to the R-hat criterion. Does using the samples this way make sense, or is it taboo?

Is theta the only parameter in the model? Convergence is a global phenomenon; any problematic parameter can spoil all inferences. Coming up with a good reparameterization is difficult and depends on the form of the functions. I can’t promise anything, but if you post the full code I’ll take a look.

What you’re seeing is called “pseudoconvergence”. “Convergence in R-hat criterion” cannot guarantee that the samples are drawn from the intended distribution. We just optimistically assume it does.
In this case I interpret the trace plot to mean that the target posterior has a wide basin and a sharp ridge and the transition between these is extremely difficult. When the sampler explores the basin it moves efficiently and looks like it has converged. But these samples do not represent the full posterior! When the sampler eventually finds the ridge, it gets stuck.
Maybe the ridge is just a pathological artifact of a bad prior and can be ignored. Or maybe the basin is the artifact and the ridge is the real region of interest. Or maybe the model is just that complicated and the desired posterior distribution contains both parts. Hard to say.
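A small sketch of how a piece of a chain can pass an R-hat check while the whole chain fails. It uses the basic split R-hat on a single chain (not the rank-normalized version that current Stan tooling reports), and the chain is synthetic: a hypothetical wide "basin" phase followed by a narrow "ridge" phase, loosely mimicking the trace plot. `split_rhat` is a hypothetical helper.

```python
import math
import random


def split_rhat(chain):
    """Basic split R-hat: split one chain into two halves and compare them."""
    n = len(chain) // 2
    parts = [chain[:n], chain[n:2 * n]]
    means = [sum(p) / n for p in parts]
    variances = [
        sum((v - m) ** 2 for v in p) / (n - 1) for p, m in zip(parts, means)
    ]
    within = sum(variances) / len(parts)
    grand_mean = sum(means) / len(parts)
    between = n * sum((m - grand_mean) ** 2 for m in means) / (len(parts) - 1)
    var_plus = (n - 1) / n * within + between / n
    return math.sqrt(var_plus / within)


# Synthetic chain (assumption): 500 draws in a basin, then 500 stuck on a ridge.
rng = random.Random(0)
basin = [rng.gauss(0.0, 1.0) for _ in range(500)]
ridge = [rng.gauss(3.0, 0.05) for _ in range(500)]
chain = basin + ridge

rhat_first = split_rhat(basin)  # first part only
rhat_last = split_rhat(ridge)   # last part only
rhat_full = split_rhat(chain)   # whole chain

print(f"first part: {rhat_first:.3f}")
print(f"last part:  {rhat_last:.3f}")
print(f"full chain: {rhat_full:.3f}")
```

Each part looks fine on its own, yet neither part alone represents the distribution the full chain is wandering through, which is why a good R-hat on a hand-picked subsequence is not evidence of convergence.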

Hi @Jean_Billie did this get resolved? Or do you still need help? Also I noticed you are probably not a native English speaker - feel free to say if our writing is hard to understand for you.

Thank you for your reply.

I cannot get rid of the non-convergence issues.
I can understand English, but writing it is difficult :’-D
I am Japanese and no longer belong to any university,
so this site is very helpful.
But my Stan code and model are very complicated, and the model is new.
Just today I submitted a manuscript in which I introduce my models, with the non-convergence issues described above :’-D.

Using a coordinate transformation \theta = \theta(\theta'), I am now
trying the reparameterization f(y|\theta) = f(y|\theta(\theta')),
but the result is not what I hoped for.

Unfortunately, the divergent iterations show that your results are likely not reliable. It is hard to provide specific guidance without seeing the model, and we might not have enough energy to investigate a complex model here, but I have made a list of things to try in such situations at https://www.martinmodrak.cz/2018/02/19/taming-divergences-in-stan-models/

Best of luck with your project!
