Gibbs post-processing to find unknown K in mixture model

Hey, thanks for the reply. Averaging over the discrete parameters seems to be the canonical advice. In the example above, K \approx \sum_{h=1}^H \mathbb{1}\{n_h > 0\} (i.e. the number of components with at least one allocated data point), and the allocations themselves are distributed Categorical(\lambda), so one should be able to marginalize over the discrete allocations and then just use \lambda to work out a distribution for K.
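
As a minimal sketch of that post-processing step (my own illustration, not your model): assuming you have posterior draws of the simplex \lambda as a draws-by-H array and N data points, you can simulate the allocations per draw and count occupied components. The function name and array shapes below are assumptions for the example.

```python
import numpy as np

def posterior_K(lambda_draws, N, rng=None):
    """Monte Carlo distribution of K (occupied components) given
    posterior draws of the mixture weights lambda (shape: draws x H)."""
    rng = np.random.default_rng(rng)
    S, H = lambda_draws.shape
    K = np.empty(S, dtype=int)
    for s in range(S):
        # allocate N points to components according to lambda^(s)
        counts = rng.multinomial(N, lambda_draws[s])
        # K is the number of components with at least one point
        K[s] = np.sum(counts > 0)
    return np.bincount(K, minlength=H + 1) / S  # P(K = 0), ..., P(K = H)

# example usage with fake posterior draws
# lam = np.random.dirichlet(np.ones(10), size=4000)
# print(posterior_K(lam, N=200))
```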

Two problems are identifiability and priors. Betancourt’s excellent piece on identifiability was very useful. He says that one can use ordering constraints to achieve identifiability (which is partially contradicted by BDA3, which says label-switching issues can persist nonetheless). Let’s say, for argument’s sake, that this is solvable; one crude post-hoc version is sketched below.
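
Since the thread is about Gibbs post-processing, here is a sketch of the post-hoc counterpart of that ordering constraint: relabel each draw by sorting on the component means and permuting the weights to match. The array names mu_draws and lambda_draws are my own assumptions, sorting by means is only one possible identification, and BDA3’s caveat about residual label switching still applies.

```python
import numpy as np

def relabel_by_mean(mu_draws, lambda_draws):
    """Post-hoc relabelling of Gibbs draws: impose mu_1 < ... < mu_H
    within each draw by sorting, and permute the weights to match."""
    order = np.argsort(mu_draws, axis=1)                      # per-draw permutation
    mu_sorted = np.take_along_axis(mu_draws, order, axis=1)
    lam_sorted = np.take_along_axis(lambda_draws, order, axis=1)
    return mu_sorted, lam_sorted
```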

The other problem is priors. In your model above, the prior on the simplex p is all-important and will decide how the H components are weighted. BDA3 recommends a symmetric Dirichlet(\frac{1}{H}) prior to force concentration (allocation to fewer clusters). However, like any prior, as the amount of data increases the data overwhelm the prior, and we end up with allocations spread across all H components, because more components will lead to a higher log posterior probability. Meanwhile, if the amount of data is small, then “where does your prior come from?” seems like an entirely valid question, because the prior becomes instrumental in deciding how many clusters there will be, and a principled choice is important.
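
To make the prior’s role concrete, here is a small prior-predictive sketch (again my own illustration, not part of your model): draw p from a symmetric Dirichlet, allocate N points, and look at the implied distribution of K for a few values of the concentration and of N. The helper name and the particular values of alpha, H and N are made up for the example.

```python
import numpy as np

def prior_K(alpha, H, N, draws=4000, rng=None):
    """Prior-predictive distribution of K (occupied components) under a
    symmetric Dirichlet(alpha) prior on the weights, for N data points."""
    rng = np.random.default_rng(rng)
    p = rng.dirichlet(np.full(H, alpha), size=draws)          # draws x H
    counts = np.array([rng.multinomial(N, pi) for pi in p])   # draws x H
    K = (counts > 0).sum(axis=1)
    return np.bincount(K, minlength=H + 1) / draws

# how the implied K shifts with the concentration and with N
# for alpha in (1 / 10, 1.0):
#     for N in (50, 5000):
#         print(alpha, N, prior_K(alpha, H=10, N=N).round(2))
```

With small N the concentration parameter largely dictates where K lands, which is exactly why the “where does your prior come from?” question bites.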

I’ve not yet encountered a principled, generic way to approach “mixture models with unknown K”. It seems to me that I’ll just have to work harder and add a lot more structure.
