Mixture model for person-level effects

shira · September 30, 2019, 8:47pm

Our data are longitudinal, with multiple observations per person i over time t.

First, we fit a standard multilevel logistic model:
E[y_{it} | x_{it}, \eta_i] = \text{logit}^{-1}(\beta_0 + \beta_1 x_{1,it} + \beta_2 x_{2,it} + \beta_3 x_{3,it} + \eta_i)
\eta_i \sim N(0,\sigma_\eta)

This model appeared to converge (no obvious fitting issues). But our posterior predictive check failed. We got advice to try a mixture model for \eta_i:

z_i \sim \text{Categorical}(\theta_1,\theta_2)
\eta_i | z_i \sim N(\mu_{z_i},\sigma_{z_i})

where we enforce:
E[\eta_i] = \theta_1\mu_1 + \theta_2\mu_2 = 0

However, the model (as coded below) does not converge (we used 4 chains, 2000 iterations). We cannot share the data, but will try simulated data next. Any advice just based on the code? Errors?

Thanks so much!

data {
  int<lower=1> num_obs;
  int<lower=1> num_people;
  int y[num_obs];
  int<lower=1,upper=num_people> ii[num_obs]; // person
  real x1[num_obs];
  real x2[num_obs]; 
  real x3[num_obs];
}
parameters {
  real person_effect[num_people];
  simplex[2] theta;          // mixing proportions
  real mu_subset;      // first location of mixture components
  vector<lower=0>[2] sigma;  // scales of mixture components
  real intercept; 
  real beta_1; 
  real beta_2; 
  real beta_3; 
}
model {
  vector[2] log_theta = log(theta);  // cache log calculation
  vector[2] mu;      // locations of mixture components
  real y_mean_logit[num_obs]; 
  mu[1] = mu_subset;
  mu[2] = (- theta[1] * mu[1])/theta[2];
  intercept~normal(0,4);
  beta_1~normal(0,4);
  beta_2~normal(0,4);
  beta_3~normal(0,4);
  mu~normal(0,4);
  sigma~normal(0,4);
  for (i in 1:num_people) {
    vector[2] lps = log_theta;
    for (k in 1:2)
      lps[k] += normal_lpdf(person_effect[i] | mu[k], sigma[k]);
    target += log_sum_exp(lps);
  }
  for(n in 1:num_obs)
    y_mean_logit[n] = intercept+beta_1*x1[n]+beta_2*x2[n]+beta_3*x3[n]+person_effect[ii[n]];
  y ~ bernoulli_logit(y_mean_logit);
}

bbbales2 · October 1, 2019, 9:39pm

Having the non-centered parameterization on person_effect is probably pretty important. I’m assuming that was non-centered in the previous code? If not then I guess I’m wrong haha.

Anyway I don’t know how to do something like that with a mixture. I’m assuming that’s where the problem is coming from.

shira · October 2, 2019, 1:15am

Yup, you are correct, we used non-centered parameterization with the previous (non-mixture, single component) model. But I also don’t know what parameterization makes sense here !

stijn · October 2, 2019, 1:33am

What is the exact reason to introduce the mixture model? I wonder whether there are no better ways to introduce whatever the mixture model is trying to capture. They are hard to estimate/make identifiable. Maybe you can get away with fixing the standard deviations in the components or having a mixture of “logit precision parameters”.

shira · October 8, 2019, 4:29pm

great question !

The person effects \eta_i in the standard multilevel logistic model are not even close to Normally distributed. There appear to be at least two components, so it is reasonable to consider a mixture model. However, as you point out, they are hard to estimate.

what do you mean by “a mixture of “logit precision parameters”"?

thanks!!

stijn · October 9, 2019, 12:54am

If you see empirically that there are two subsamples than maybe your initial approach with fixed standard deviations might be the best next step.
In behavioral economic type models, multiplying the whole linear part with a parameter is quite popular. If you let the parameter vary by individual, it can capture the idea that people some people are more decisive in their decisions. That’s why I asked for the reason to expand the model.

Sorry, I am in a hurry. If things are not clear, just respond here.

Stephen_Martin · October 15, 2019, 10:02pm

You probably need to impose a constraint to identify the mixture. E.g., require that mu_subset is negative: real<upper=0> mu_subset.
Otherwise, it can oscillate between or within-chain between mu[1] >0, mu[2] < 0; and mu[1] < 0, mu[2] > 0. There’s no difference whatsoever to the likelihood, or to the posterior, so you get a multimodal, aliased posterior. You have to break that apart using priors (which never worked for me), hard constraints (usually works for me, if data can separate it enough), or ordered simplex constraints (which never works for me, and is excruciatingly slow).

Topic		Replies	Views
Can anybody help with multi-level mixture models Modeling	2	131	September 2, 2024
Growth Mixture Modeling (Latent Class Linear Mixed Model) Modeling	2	2024	February 14, 2020
Multilevel Gaussian mixture regression model Modeling	1	756	March 22, 2019
Strategies for Improving (Growth) Mixture Model Convergence Modeling	1	905	November 11, 2018
Individual level estimates for mixture models (zero & one inflated beta) brms mixture , cognitive-science	1	639	November 9, 2020

Mixture model for person-level effects

Related topics