I need model selection of Gaussian mixture model

y~sum_{k=1:C} w_k * N(mu_k, sigma_k^2)

that is, to select the number of components C. To do so, the textbook <Bayesian Data Analysis, Edition 3> Chapter 22.4 added discrete variable Z_t where Z_t=k means the t-th data point belongs to the k-th Gaussian component, used Gibbs sampler to update all parameter blocks, and see how many components have been occupied by Z_t.

However, our problem has constraints

Ey=0=sum_k w_k * mu_k

var(y)=1=sum_k w_k * (mu_k^2+sigma_k^2)

So Gibbs for w, mu, sigma is not possible. I consider calling “sampling” in “rstan” package in the following for loop

Use “stan_model” to set up the model with parameters w, mu, sigma, and the data block includes Z because it is discrete.

for (t in 1:T){ #T iterations in total

(1) Update w, mu, sigma given Z, using 1 iteration of HMC or NUTS with “sampling”, with data input Z.(2) Gibbs update of Z given w, mu, sigma, using my own computed probability.

}

However, I’ve 2 concerns:

(1) Due to Z change, the tuning parameter from warm-up perhaps cannot persist along all iterations t. Maybe only hand tuning is possible?

(2) I wanna it not too slow due to repeated calling of “sampling”.

Thank you.