Identifiability with Mixture Models

Jennifer · November 30, 2017, 9:52am

Hi,

I am trying to get some information about identifiability with Bayesian inference and especially with mixture models. I have found a paper on this topic (Markov Chain Monte Carlo Methods and the Label Switching Problem in Bayesian Mixture Modeling) but I feel like I am not any smarter than before.

Jasra, Holmes and Stephens write:

“One of the main challenges of a Bayesian analysis using mixtures is the nonidentifiability of the components. That is, if exchangeable priors are placed upon the parameters of a mixture model, then the resulting posterior distribution will be invariant to permutations in the labelling of the parameters. As a result, the marginal posterior distributions for the parameters will be identical for each mixture component. Therefore, during MCMC simulation, the sampler encounters the symmetries of the posterior distribution and the interpretation of the labels switches. It is then meaningless to draw inference directly from MCMC output using ergodic averaging. Label switching significantly increases the effort required to produce a satisfactory Bayesian analysis of the data, but is a prerequisite of convergence of an MCMC sampler and therefore must be addressed.”

Is there a simpler way to explain this? Any help would be appreciated!

betanalpha · November 30, 2017, 2:48pm

https://betanalpha.github.io/assets/case_studies/identifying_mixture_models.html

Note that even once the label switching has been removed, exchangeable mixture models still exhibit more subtle non-identifiabilities that make them very hard to fit.

Bob_Carpenter · December 21, 2017, 10:35pm

Suppose you have a mixture of two components:

p(y | mu, sigma, lambda)
  = lambda * normal(y | mu[1], sigma[1]) 
    + (1 - lambda) * normal(y | mu[2], sigma[2])

The model isn’t identifiable as written because the parameter values

theta1 = (mu[1], sigma[1], mu[2], sigma[2], lambda)
theta2 = (mu[2], sigma[2], mu[1], sigma[1], 1 - lambda)

produce exactly the same likelihood value. If the prior for (mu[1], sigma[1]) is the same as that for (mu[2], sigma[2]), the prior does not break the tie either, so the posterior is still non-identifiable. What we normally recommend is an asymmetric prior that orders mu[1] < mu[2] to identify the model. With that constraint, only one of theta1 and theta2 above is possible. The case study by Michael Betancourt (aka @betanalpha) linked above has much more detail.
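To make the symmetry concrete, here is a minimal Python sketch that evaluates the two-component likelihood at theta1 and at its label-swapped counterpart theta2. The data values, parameter values, and helper names (`normal_pdf`, `mixture_log_lik`, `swap_labels`, `order_by_mu`) are made up for illustration and are not from the case study. The `order_by_mu` helper is only a post-hoc stand-in for the mu[1] < mu[2] ordering, not the way Stan enforces the constraint during sampling.

```python
import math


def normal_pdf(y, mu, sigma):
    """Density of normal(y | mu, sigma)."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_log_lik(ys, mu, sigma, lam):
    """Log likelihood of the two-component mixture written above."""
    total = 0.0
    for y in ys:
        total += math.log(
            lam * normal_pdf(y, mu[0], sigma[0])
            + (1.0 - lam) * normal_pdf(y, mu[1], sigma[1])
        )
    return total


def swap_labels(mu, sigma, lam):
    """Map theta1 to theta2: exchange the components and flip lambda."""
    return (mu[1], mu[0]), (sigma[1], sigma[0]), 1.0 - lam


def order_by_mu(mu, sigma, lam):
    """Return the labelling that satisfies mu[0] <= mu[1]."""
    if mu[0] <= mu[1]:
        return mu, sigma, lam
    return swap_labels(mu, sigma, lam)


# Hypothetical data and parameter values, chosen only for illustration.
ys = [-1.3, 0.2, 0.9, 2.5, 3.1, 4.0]
theta1 = ((0.0, 3.0), (1.0, 0.5), 0.25)
theta2 = swap_labels(*theta1)

ll1 = mixture_log_lik(ys, *theta1)
ll2 = mixture_log_lik(ys, *theta2)

# Both labellings give the same log likelihood.
print(ll1, ll2)

# Under the ordering, both labellings map to the same parameter vector.
print(order_by_mu(*theta1) == order_by_mu(*theta2))
```

The two log likelihoods agree because swapping the labels only reorders the two terms inside the sum, which is why the data alone cannot tell theta1 from theta2. Once the ordering is imposed, each pair of mirrored points collapses to a single labelling.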
