I’d like to explore refitting a toy model in Stan that I estimated about 10 years ago with the now defunct/exiled/archived R package DPpackage. It is a binomial mixture model with a mixture of Dirichlet process prior. The data are from Beckett and Diaconis (1994), and consist of 320 realizations of the following binomial experiment: common thumbtacks are "flicked" 9 times and the number of times the tack lands point-up is recorded. There were various kinds of tacks, surfaces and flickers, so one shouldn’t expect only one binomial probability. My earlier fitting used an MDP prior of the form D(a G_0) with G_0 beta(1,1) and a = 1. The data are available in the R package REBayes; ?tacks there provides some further details. This all seems rather simplistic, but I’m concerned about comments in the case study of Betancourt (2017) about generic difficulties with Bayesian mixture models. Any pointers/advice would be welcome, including “abandon hope all ye who enter here.”
Stan has fixed parameter sizes, so we can’t technically represent DP priors. On the other hand, a finite approximation is almost always reasonable, so that shouldn’t be a blocker.
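One common finite approximation is truncated stick-breaking. The sketch below is only an illustration in Python, not Stan code and not what DPpackage did: it draws the weights and atoms of a K-component approximation to D(a G_0) with G_0 = Beta(1,1). The truncation level `K = 20`, the seed, and the function name `stick_breaking_weights` are assumptions made for the example.

```python
import random

def stick_breaking_weights(alpha, K, rng):
    """Draw K mixture weights from a truncated stick-breaking construction."""
    weights = []
    remaining = 1.0  # length of the stick not yet assigned
    for _ in range(K - 1):
        v = rng.betavariate(1.0, alpha)  # v_k ~ Beta(1, alpha)
        weights.append(remaining * v)    # w_k = v_k * prod_{j<k} (1 - v_j)
        remaining *= 1.0 - v
    weights.append(remaining)            # last component absorbs the leftover
    return weights

rng = random.Random(0)   # fixed seed, chosen arbitrarily
K = 20                   # assumed truncation level
weights = stick_breaking_weights(1.0, K, rng)            # a = 1
atoms = [rng.betavariate(1.0, 1.0) for _ in range(K)]    # p_k ~ G_0 = Beta(1, 1)
```

With a = 1 the expected leftover mass after k breaks is 2^(-k), so a modest K already puts almost all of the prior mass on the retained components.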
Can you write down the likelihood and prior for the model you want to fit in math? Then it should be easy to code in Stan and you can see what happens.
This isn’t Bayes-specific. Mixture models are classically non-identified because of label switching (e.g., permute the identities of the mixture components and the likelihood doesn’t change). In Bayesian inference, this is problematic when a sampler jumps across modes. The downstream inference can still be OK, but trying to diagnose MCMC convergence is challenging. I put a short discussion of this in the user’s guide mixture modeling chapter.
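To make the label-switching point concrete, here is a small Python check that swapping the component labels leaves a binomial mixture log-likelihood unchanged. The counts in `y` are made up for illustration (they are not the tack data), and `mixture_loglik` is a hypothetical helper name.

```python
from math import comb, log

def mixture_loglik(y, n_trials, weights, probs):
    """Log-likelihood of counts y under a finite mixture of binomials."""
    total = 0.0
    for y_i in y:
        # Marginalize over the component label for observation i.
        lik_i = sum(
            w * comb(n_trials, y_i) * p**y_i * (1.0 - p) ** (n_trials - y_i)
            for w, p in zip(weights, probs)
        )
        total += log(lik_i)
    return total

y = [2, 7, 5, 9, 4, 6]  # made-up counts out of 9 flicks
ll_a = mixture_loglik(y, 9, [0.3, 0.7], [0.4, 0.8])
ll_b = mixture_loglik(y, 9, [0.7, 0.3], [0.8, 0.4])  # same model, labels swapped
```

The two values agree up to floating-point rounding, which is why a K-component posterior has K! symmetric copies of every mode unless something such as an ordering constraint breaks the symmetry.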
Thanks for the response. The likelihood in this case is just a mixture of Binomials,
$$L(G) = \prod_{i=1}^n \int_0^1 \binom{n_i}{y_i} p^{y_i} (1-p)^{n_i - y_i} \, dG(p)$$ and the prior is a mixture of DP, usually expressed as D(a G_0) with G_0 Beta(1,1) and a = 1. There is a brief discussion of the data and model in Section 7 of (https://www.tandfonline.com/doi/abs/10.1080/01621459.2013.869224)
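For a Stan implementation, one way this might be written under a finite truncation is as follows. This is a sketch under assumptions: the truncation level $K$ and the symbols $w_k$, $p_k$, $v_k$ are introduced here and are not in the original formulation. If $G$ is restricted to $K$ atoms, $G = \sum_{k=1}^K w_k \delta_{p_k}$, the integral becomes a finite sum:

$$L(w, p) = \prod_{i=1}^n \sum_{k=1}^K w_k \binom{n_i}{y_i} p_k^{y_i} (1-p_k)^{n_i - y_i}$$

with the stick-breaking form of the prior

$$v_k \sim \mathrm{Beta}(1, a), \quad v_K = 1, \quad w_k = v_k \prod_{j<k} (1 - v_j), \quad p_k \sim G_0 = \mathrm{Beta}(1,1), \quad k = 1, \dots, K.$$

The discrete component labels are already summed out in this form, which is what Stan needs since it cannot sample discrete parameters.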
The curious aspect of this is that there is a unique nonparametric MLE.