# Predicting group membership in a mixture model using brms

I was working through the brms mixture model examples provided here and had difficulty interpreting the parameters that predict group membership. Taken straight from the example, here are the toy data and model.

``````
library(brms)

## simulate some data
set.seed(1234)
dat <- data.frame(
  y = c(rnorm(200), rnorm(100, 6)),
  x = rnorm(300),
  z = sample(0:1, 300, TRUE)
)

## predict the mixing proportions
fit4 <- brm(bf(y ~ x + z, theta2 ~ x),
            dat, family = mix, prior = prior,
            inits = 0, chains = 2)
summary(fit4)
``````

When I look at `summary(fit4)` this is what I see:

``````
Population-Level Effects:
                 Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
mu1_Intercept        0.01      0.11    -0.21     0.24       3348 1.00
mu2_Intercept        5.94      0.13     5.68     6.20       3457 1.00
theta2_Intercept    -0.72      0.18    -1.07    -0.39       4568 1.00
mu1_x                0.06      0.07    -0.08     0.20       3829 1.00
mu1_z               -0.11      0.14    -0.41     0.16       3533 1.00
mu2_x               -0.05      0.10    -0.26     0.15       3952 1.00
mu2_z                0.46      0.18     0.11     0.81       3883 1.00
theta2_z             0.04      0.25    -0.44     0.52       3855 1.00

Family Specific Parameters:
       Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sigma1     1.04      0.05     0.94     1.15       3861 1.00
sigma2     0.90      0.07     0.78     1.05       3407 1.00
``````

So, now I’m trying to figure out the interpretation of `theta2_Intercept` and `theta2_z`. My best guess is that the model predicts a group membership for Group 2 (i.e., `theta2`) and that if I regress this predicted group membership on `z` I’ll be able to recover the parameters from the model. So, I tried that:

``````
## compute the membership probabilities
ppm <- pp_mixture(fit4)

## extract point estimates for each observation

# Get theta2 membership and put it in data frame
dat$theta2 <- ppm[, 1, 2]

# regress theta2 on z
fit5 <- brm(theta2 ~ z, dat, chains = 2)
summary(fit5)
``````
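
As a rough sketch of what that index picks out, assuming `pp_mixture()` returns an observations × summary statistics × components array as its documentation describes: the array below is a hand-made stand-in with that shape. The name `mock_ppm`, its dimension labels, and its values are made up for illustration and are not model output, so the snippet runs without fitting anything.

```r
# Hypothetical stand-in for the array returned by pp_mixture():
# 3 observations x 4 summary statistics x 2 mixture components.
# The dimension labels are assumptions that mimic the documented layout.
mock_ppm <- array(
  NA_real_,
  dim = c(3, 4, 2),
  dimnames = list(
    NULL,
    c("Estimate", "Est.Error", "Q2.5", "Q97.5"),
    c("P(K = 1 | Y)", "P(K = 2 | Y)")
  )
)

# Fill in made-up membership probabilities for component 2,
# and their complements for component 1
mock_ppm[, "Estimate", 2] <- c(0.1, 0.8, 0.5)
mock_ppm[, "Estimate", 1] <- 1 - mock_ppm[, "Estimate", 2]

# Same indexing as in the post: all rows, first statistic, second component
theta2_hat <- mock_ppm[, 1, 2]
theta2_hat
```

Here `[, 1, 2]` reads as: every observation, the first summary column (the point estimate), and the second mixture component.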

These are the results for `summary(fit5)`:

``````
Population-Level Effects:
          Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept     0.33      0.04     0.25     0.41       2174 1.00
z             0.01      0.05    -0.10     0.11       2539 1.00

Family Specific Parameters:
      Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sigma     0.47      0.02     0.44     0.51       2131 1.00
``````

As you can see, the results from `summary(fit5)` don’t come close to recovering the parameters predicting `theta2` in `fit4`.

So my question is where did I go wrong and how should I interpret the parameters predicting `theta2` from `fit4`?

This is not a reproducible example. You failed to specify the prior for your model in the example:

``````
mix <- mixture(gaussian, gaussian)
prior <- c(
  prior(normal(0, 7), Intercept, dpar = mu1),
  prior(normal(5, 7), Intercept, dpar = mu2)
)
``````

You’re right. Thanks for the addition.

I’m still looking for help with my problem. If anyone can assist, I’d appreciate it.
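
A possible starting point, offered as a sketch rather than a definitive answer. It assumes that brms models predicted mixing proportions with a softmax link and `theta1` as the reference component, which is worth checking against `?mixture`. With two components that link reduces to the inverse logit, so `theta2_Intercept` and `theta2_z` would be on the log-odds scale, while `fit5` is a Gaussian regression on the probability scale. The snippet below uses only base R and the posterior means copied from `summary(fit4)`; the variable names are made up for illustration.

```r
# Posterior means copied from summary(fit4) in the question
theta2_intercept <- -0.72
theta2_z <- 0.04

# Assumption: two components, theta1 as reference, so the softmax link
# reduces to the inverse logit (plogis in base R)
p_class2_z0 <- plogis(theta2_intercept)             # z = 0
p_class2_z1 <- plogis(theta2_intercept + theta2_z)  # z = 1

round(c(z0 = p_class2_z0, z1 = p_class2_z1), 2)
# z0 = 0.33, z1 = 0.34

# Implied change on the probability scale when z goes from 0 to 1
round(p_class2_z1 - p_class2_z0, 2)
# 0.01
```

If the assumption holds, these values line up with the `fit5` intercept (0.33) and slope (0.01), and with the simulated share of 100 out of 300 observations in the second component. Plugging posterior means into `plogis()` only approximates the posterior mean of the probability, so small discrepancies would be expected.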