Growth Mixture Modeling (Latent Class Linear Mixed Model)

kfeder · August 10, 2018, 9:52pm

I’m trying to fit what I would call a “growth mixture model” (which I think is sometimes called a “latent class linear mixed model”). A binary outcome is measured at multiple timepoints for multiple participants. Each participant contributes a different number of observations, and the observations are not evenly spaced. The goal is to model trends in the probability of experiencing the binary outcome over time. We believe that there are a finite number of common trajectory classes, but we don’t know which participants fall in which class. So we need to combine mixture modeling (to handle the latent classes) with multilevel modeling (to handle repeated observations on the same participants).

I will try to describe the model likelihood formally: Let i index the participant and j index the observation. Let y_{ij} be the binary outcome and X the covariate matrix (here, the only covariates are polynomials of time). Let k be a categorical variable indicating membership in a particular latent class, and p be a simplex vector of latent class proportions. Let \beta_{ik} be the vector of regression coefficients for participant i if s/he is in class k. Finally, let \gamma_k and \sigma_k be the mean vector and covariance matrix of the multivariate normal from which participants’ \beta_{ik} are drawn if they are in class k. Then:

y_{ij}|k \sim Bernoulli(\mu_{ijk})
logit(\mu_{ijk}) = X_{ij} \beta_{ik}
\beta_{ik} \sim Gaussian(\gamma_{k},\sigma^2_{k})
k \sim Categorical(p)

Now here is some Stan code attempting to fit the model.

data {
  int<lower=0> I; // number of observations
  int<lower=0> J; // number of participants
  int<lower=1> K; // number of mixture components
  int<lower = 1,upper = J> id[I]; // id vector that maps observations to participants
  int<lower=0,upper=1> use[I]; // outcome
  matrix[I,4] model_matrix; // time covariates
}
parameters {
  simplex[K] lambda; // weight for each component of mixture
  matrix[J,4] beta[K]; // each participants’ coefficients, conditional on mixture
  vector[4] beta_mean[K]; // grand mean of coefficients, conditional on mixture
  cov_matrix[4]<lower=0> beta_variance[K]; // variance of coefficients, conditional on mixture
}
model {
  real lps[K];
  for(i in 1:I) { // loop over participants
    for(k in 1:K) { // loop over mixture components
      las[k] = log(lambda[k]);
      lps[k] += bernoulli_logit_lpmf(use[i]|model_matrix[i]*beta[k][id[i]]) +
        normal_lpdf(beta[k][id[i]]|beta_mean,beta_variance);
      // add log probability of mixture
    }
    target += log_sum_exp(lps); // update log probability of model by combining both mixture components
  }
}

My questions:

Does this actually fit the model I described? I can’t quite figure out how to loop over the participants, or correctly marginalize out the latent classes.
Is there a way to make this code more efficient?
Are there smart, non-informative prior choices? (Right now, I think the code gives everything a flat prior by default?)

This is my first Stan project, so thanks in advance for your guidance and your patience.

(P.S. If anyone knows any Stan users at Johns Hopkins, please let me know.)

Bob_Carpenter · August 22, 2018, 10:19am

I was having trouble connecting the text description to the generative model. I’m pretty sure each participant i gets a class k_i and each participant only gets one \beta_i (rather than one for each k). So I think the generative story should start with priors on the parameters:

p \sim \ ?

\gamma_k \sim \ ?

\sigma_k \sim \ ?

Then move onto generating the latent classes and coefficients

k_i \sim \mathsf{Categorical}(p)

\beta_i \sim \mathsf{Normal}(\gamma_{k[i]}, \sigma_{k[i]})

y_{i, j} \sim \mathsf{BernoulliLogit}(X_{i, j} \cdot \beta_{i})

I’m using Stan’s normal with scale (standard deviation) parameterization and the built-in Bernoulli on the log-odds scale.

Was that your intention?

If so, the normal for beta needs to be pulled out and then indexed with beta[id[i]].

Duncan · February 14, 2020, 4:06pm

was this resolved? I am interested these types of models but can’t follow that code.

Topic		Replies	Views
Mixture modelling with a reference class (i.e. K-1 latent binary logistic regressions) Modeling	8	1710	August 30, 2017
Strategies for Improving (Growth) Mixture Model Convergence Modeling	1	906	November 11, 2018
Identification issues with Latent Class Analysis/Mixture model Modeling fitting-issues , specification , mixture	25	3664	October 24, 2022
Help writing latent class multinomial logit model (for conjoint data) Modeling fitting-issues , specification , performance	21	3331	October 16, 2020
Mixture Bernoulli Modeling	5	1531	May 9, 2020

Growth Mixture Modeling (Latent Class Linear Mixed Model)

Related topics