Mixture of Gaussian distributions for the random effects

Hi,

When fitting linear or generalized linear mixed models, we usually assume that the random effects (in my case, only random intercepts) follow a normal distribution.

In my data set this assumption is violated, so I want to extend the model by assuming a mixture of normal distributions.

However, in the Stan manual I only find an example where a mixture model is fit to the response. In my case, the mixture serves as a prior for the random intercepts.

I have searched but have not found any example that uses a mixture of normal distributions for the random effects. Do you know of an example I can learn from?

Thank you,

Tran.

Just replace the data variable in whatever mixture model you’re looking at with your parameters. If you only have two components in your mixture, just use log_mix: http://andrewgelman.com/2017/08/21/mixture-models-stan-can-use-log_mix/ (section 13.5 of the 2.16 manual). The sampling statements don’t really care what’s on the left-hand side; it all goes towards incrementing the target log probability (see section 5.3 of the 2.16 manual).
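
Something like this rough sketch is what I mean (untested, and all the variable names here are mine, not from your model): a Gaussian-outcome mixed model where each random intercept gets a two-component normal mixture prior via log_mix.

```stan
data {
  int<lower=1> N;                       // number of observations
  int<lower=1> J;                       // number of groups
  int<lower=1, upper=J> group[N];       // group index for each observation
  vector[N] y;                          // outcome
}
parameters {
  real alpha;                           // fixed intercept
  vector[J] b;                          // random intercepts
  real<lower=0, upper=1> lambda;        // mixing proportion
  ordered[2] mu_b;                      // component means (ordered to help identifiability)
  vector<lower=0>[2] sigma_b;           // component scales
  real<lower=0> sigma_y;                // residual scale
}
model {
  // two-component normal mixture as the prior on each random intercept
  for (j in 1:J)
    target += log_mix(lambda,
                      normal_lpdf(b[j] | mu_b[1], sigma_b[1]),
                      normal_lpdf(b[j] | mu_b[2], sigma_b[2]));
  mu_b ~ normal(0, 5);
  sigma_b ~ normal(0, 2);               // half-normal, given the lower bound
  sigma_y ~ normal(0, 2);
  y ~ normal(alpha + b[group], sigma_y);
}
```

One caveat: alpha and the component means mu_b can trade off against each other (shift a constant from one to the other and nothing changes), so you may want to fix one component mean at zero or drop alpha. For a logistic version you’d swap the last sampling statement for something like y ~ bernoulli_logit(alpha + b[group]);.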

Is there any way to get around the mixtures? Are there any variables in your data that could explain the differences you expect?
Multimodal posteriors are a real pain to get to sample well, and a multimodal prior is surely going to push you in that direction (unless you have enough data to override the multimodal prior, in which case you probably could have gotten away with a lighter prior to begin with, instead of futzing with mixtures).

Hi,

My problem with the data is called “quasi-complete separation in random effects,” as discussed in this paper.

Below is the histogram of the estimated means of the random effects when I assume a normal distribution, even though I used the covariates.

[Figure: Histogram_RI_2a — histogram of the estimated random-intercept means]

From this figure and the suggestions in the paper I mentioned above, using a mixture of normal distributions is one possibility.

There is another possibility that uses transition probabilities, but the problem with my data is that it contains missing binary values, which I cannot put on the right-hand side of any sampling statement (Stan has no discrete parameters, so the missing binary values cannot be treated as unknowns directly).

Tran.

I’m not too familiar with the statistics here. Hopefully someone else can jump in if they recognize what is going on.

Is that a plot of a posterior from a model fit in Stan? From the paper, it sounds like these are distributions of estimates you might get when working with logistic regressions where your covariates perfectly explain some of your data. I don’t see where priors come into it. Are you wanting to use these estimates in something else?

From the end of section 2.3, it sounds like the authors are saying you can use priors to avoid problematic estimates that arise in these situations?

Maybe I’m reading it wrong though.

Hi,

When we assume b ~ normal(0, sigma_b), the mean estimates of the random effects, b, should form a histogram that looks approximately normal.

The plot just shows that the assumption of a single normal distribution is violated. It might be better to extend it to a mixture.

That paper is not Bayesian, so you do not see priors in it. Here, the prior is the distributional assumption we make on the random effects.

Yesterday I fitted a simple linear mixed model with a mixture prior on the random effects; at least that model runs. I will fit the logistic regression model later today.

Thank you for your help!

Tran.

“When we assume b ~ normal(0, sigma_b), the mean estimates of the random effects, b, should form a histogram that looks approximately normal.”

No no, b ~ normal(0, sigma_b) doesn’t mean that b has to come out distributed like a normal or anything. All the sampling statements do (sampling statements are the ones with the ~s) is increment the target log probability, which HMC uses to explore the parameter space. It’s not drawing samples from a normal, or anything like that.
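
Concretely, inside the model block these two lines increment the target by the same amount (the ~ form just drops constants that don’t matter for sampling):

```stan
// with b and sigma_b declared as parameters:
b ~ normal(0, sigma_b);                   // sampling-statement form
// target += normal_lpdf(b | 0, sigma_b); // equivalent explicit increment
```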

You put a prior on b, but that doesn’t determine what the posterior of b will look like. Especially if you have lots of data, you’d expect the prior to take a back seat :D.

This isn’t just a Stan issue. When you have a prior and then you condition on data, the posterior doesn’t have to look like the prior.
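
As a toy illustration (my own sketch, nothing from your model): the prior on mu below is normal(0, 1), but if you feed this model a few hundred observations centered around 5, the posterior for mu will sit near 5 and look nothing like the prior.

```stan
data {
  int<lower=1> N;
  vector[N] y;         // e.g. lots of data centered around 5
}
parameters {
  real mu;
}
model {
  mu ~ normal(0, 1);   // prior pulls mu toward 0
  y ~ normal(mu, 1);   // with large N, the likelihood dominates
}
```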