# Second Level Mixture for clustering

Hi, I’m trying to set up a two-level Poisson regression in which the second-level coefficients have a normal mixture prior, so that I can create clusters based on the second-level grouping.
More specifically, I have usage frequencies for different websites, recorded over several days for several people. In the first level I’m adding some day-level variables (for example the week and whether they had an exam that same day), and I have individual-level information in the second level. I want to cluster the individuals, so I want to put a mixture on the coefficients of that level. I’m new to brms, and I can’t seem to find how to include a mixture of normals as a prior rather than as a family. For now I thought about setting the family to be a Poisson mixture and adding normal priors at the second level. Something like this:

```
# Level 1: day-level predictors; b1..b5 are meant to vary by student
ml_formula <- bf(
  y ~ 1 + b1 + b2 * week + b3 * exer + b4 * teach + b5 * exam +
    b6 * p1_w + b7 * p2_w + b8 * p3_w + b9 * p4_w,
  # Level 2: individual-level predictors for each coefficient
  mvbind(b1, b2, b3, b4, b5) ~ (1 + p1 + p2 + p3 + p4 + demografic1 | w | mm(student, website))
)

# Normal priors for each coefficient in both mixture components
ml_prior <- c(
  prior(normal(0, 2), resp = "b1", coef = "b1", dpar = "mu1"),
  prior(normal(0, 2), resp = "b1", coef = "b1", dpar = "mu2"),
  prior(normal(0, 2), resp = "b2", coef = "b1", dpar = "mu1"),
  prior(normal(0, 2), resp = "b2", coef = "b1", dpar = "mu2"),
  prior(normal(0, 2), resp = "b3", coef = "b1", dpar = "mu1"),
  prior(normal(0, 2), resp = "b3", coef = "b1", dpar = "mu2"),
  prior(normal(0, 2), resp = "b4", coef = "b1", dpar = "mu1"),
  prior(normal(0, 2), resp = "b4", coef = "b1", dpar = "mu2"),
  prior(normal(0, 2), resp = "b5", coef = "b1", dpar = "mu1"),
  prior(normal(0, 2), resp = "b5", coef = "b1", dpar = "mu2")
)

# Two-component Poisson mixture as the response family
mix <- mixture(poisson, poisson)
fit1 <- brm(formula = ml_formula, data = data, family = mix, prior = ml_prior)
```

Am I doing the right thing? Thanks in advance for the help!

Sorry for taking so long to answer.

I am not sure you are approaching this from a good starting point. Mixture priors feel a bit weird to me (and would likely cause computational problems). Why would you expect your domain expertise to be well captured by a mixture? More generally, you seem to be building a very large and complex model that would need tons of data to inform all the parameters. Have you tried applying a simpler model first? We usually recommend starting with a simple model and adding structure only if that model has problems fitting the data (as witnessed by posterior predictive checks or similar diagnostics).
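To make the "simpler model first" suggestion concrete, here is a minimal sketch of what such a starting point might look like. It assumes a long-format data frame with one row per student-day; the column names (`y`, `week`, `exam`, `student`), the simulated data, and the choice of a single varying intercept per student are all illustrative assumptions, not taken from your actual data.

```r
library(brms)

# Simulated stand-in data (hypothetical): 20 students observed on 15 days each
set.seed(1)
n_students <- 20
n_days <- 15
d <- data.frame(
  student = factor(rep(seq_len(n_students), each = n_days)),
  week    = rep(rep(1:3, each = 5), times = n_students),
  exam    = rbinom(n_students * n_days, 1, 0.1)
)
# Each student gets their own baseline usage level
student_effect <- rnorm(n_students, 0, 0.5)
log_rate <- 0.5 + 0.1 * d$week + 0.3 * d$exam +
  student_effect[as.integer(d$student)]
d$y <- rpois(nrow(d), exp(log_rate))

# Plain Poisson regression with a varying intercept per student
fit_simple <- brm(
  y ~ week + exam + (1 | student),
  data = d,
  family = poisson(),
  prior = prior(normal(0, 2), class = "b"),
  chains = 2, iter = 1000, seed = 1
)

# Posterior predictive check: compare replicated counts with the observed ones
pp_check(fit_simple)
```

If the replicated counts from `pp_check()` look systematically different from the observed ones, that would be the point at which adding more structure (extra varying slopes, or eventually a mixture) becomes worth considering.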

Does that make sense?

Also note that you can use triple backticks (```) to format code in your posts (I took the liberty to edit your post to add this formatting already).

Thanks a lot for the reply. I have been exploring other options, like a two-stage analysis: first extracting features from the behaviour on each website using a regression, and then clustering with a latent class model. My issue here is that I want to include the variables in order to use them in the analysis, but I think this new strategy can deal with that. Thanks for the advice :)