Mixed membership model - label switching?

I’m attempting to fit a mixed membership model a la Erosheva, Feinberg, and Lafferty (2004). In the model, I persons answer J items and each item has Lj response levels (i.e., not all items must have the same number of response levels). Each person belongs probabilistically to one of K response profiles and each profile has as distribution of possible responses to each item.

I borrowed quite a bit from the Stan manual to set up marginalization and avoid data augmentation. I also hacked a ragged array into a vector of values and one of indices, (cs and Slj), but that technique has worked for me in the past.

Code:

data {
  int<lower=1> K; // profiles
  int<lower=1> I; // persons
  int<lower=1> J; // items
  int<lower=2> L[J]; // response levels per item
  int<lower=2> Slj; // sum of L, J; total number of response levels across aSlj items
  int<lower=1> cs[J + 1]; // cumulative sum of levels per item, with '1' added at index 1.
  int<lower=0> y[I,J]; // responses
}
transformed data {
  real<lower=0> pg;

  pg = 1.0 / K;
}
parameters {
  positive_ordered[K] eta1;
  vector<lower=0>[K] eta[I - 1];
  vector<lower=0>[Slj] gamma[K];
}
transformed parameters {
  vector[K] log_q_z[I];
  vector<lower=0>[Slj] beta[K];
  vector<lower=0>[K] theta[I];

  theta[1] = eta1 / sum(eta1);
  for (i in 2:I) {
    theta[i] = eta[i - 1] / sum(eta[i - 1]);
  }
  for (k in 1:K) {
    for (j in 1:J) {
      beta[k, cs[j]:cs[j + 1]] = gamma[k, cs[j]:cs[j + 1]] / sum(gamma[k, cs[j]:cs[j + 1]]);
    }
  }

  for (i in 1:I) {
    log_q_z[i] = log(theta[i]);
    for (j in 1:J) {
      for (k in 1:K) {
        log_q_z[i,k] = log_q_z[i,k] + log(beta[k,cs[j] + y[i,j]]);
      }
    }
  }
}
model {
  for (i in 1:I) {
    if (i == 1) {
      target += exponential_lpdf(eta1 | pg);
      } else {
      target += exponential_lpdf(eta[i - 1] | pg);
    }
    target += log_sum_exp(log_q_z[i]);
  }
  
  for (k in 1:K) {
    target += exponential_lpdf(gamma[k] | pg);
  }
}

I believe the problem I’m having is label-switching – I consistently get reasnoably N_eff and Rhat for all parameters in up to K-2 of the possible profiles, but then terrible mixing for at least two of the remaining profiles.

I attempted to constrain the estimates a bit. First, I made the probabilities for K at i = 1 ordered, hoping that the ordering the profiles relative to the probability of i = 1’s membership in them would help. I also attempted initializing profile membership by first running K-means on the responses and setting the first value on each subject i’s membership in a category based on that. I also tried refactoring to only cover cases where all items have the same number of responses, which made a lot of the model simpler, but had the same problem.

Have I misspecified the model? Are there other tricks I’m missing?

Code:
mm.stan (1.3 KB)

Simulated Data and R script:

Thanks.

Not that I know of. These are just some fundamental problems in these kinds of models. You are missing some vectorization possibilities, like that this

for (k in 1:K) {
    target += exponential_lpdf(gamma[k] | pg);
  }

can just be

target += expondnetial_lpdf(gamma | pg);

or more efficiently to avoid constants,

gamma ~ exponential(pg);

The best thign I can recommend is Michael’s case study on mixtures:

http://mc-stan.org/users/documentation/case-studies/identifying_mixture_models.html

There’s also some discussion in the problematic posteriors chapter of the manual and in the mixture sections.

Thanks, Bob! I suspected this was a deeper problem.

+1 for the case study on mixtures, though, that’s a great resource.