Hello!
I am trying to write (and use) a model to estimate a latent class multinomial logit model for data obtained through conjoint experiments.
No existing models work quite the way that I want them to. At the end of the day I would like:
- A set of utilities for each latent class
- Membership probabilities for each individual for each class
- [Optionally] Coefficients on each individual-level variable that impact membership probabilities
I have attempted to write this model which is written below. It is certainly does not represent best practices — it doesn’t include proper priors on most of the parameters, for example.
The primary issue with this model is that NUTS sampling (calling rstan::sampling
) takes prohibitively long. I believe I waited about an hour and not even 200 samples were obtained with my dataset. I am able to get the model to “work” using optimizing
.
What I am looking for is help with the following:
- Helping to edit the model specification so that it can sample efficiently
- Helping to place priors on the parameters so that it can (a) sample efficiently and (b) give more reasonable results. If it is possible to promote sparsity on the class membership that would be a bonus.
Below the code, I’ve included a very simple R script to generate some dummy data to feed into the model.
Citation: I adapted most of the code below from Elea Feit’s hierarchical multinomial logit model — repository here
// latent class mnl model conjoint experiments
data {
int<lower=2> C; // number of alternatives (choices) in each scenario
int<lower=1> A; // number of attributes for each alternative
int<lower=1> R; // number of respondents
int<lower=2> K; // number of latent classes
int<lower=1> S; // number of choice tasks per respondent
int<lower=0> G; // number of respondent covariates
int<lower=1,upper=C> Y[R, S]; // observed choices
matrix[C, A] X[R, S]; // matrix of attributes for each choice task (design matrix)
matrix[G, R] Z; // vector of covariates for each respondent
}
parameters {
matrix[A, K] Beta; // class level utilities
matrix[K-1, R] Eta; // class membership values
matrix[K-1, G] Delta; // class prediction covariates
real<lower=0> sigma_class;
}
model {
matrix[K, R] memb_probs; // record the probability of each class for each respondent
// for each person in the database, we
for(r in 1:R) {
// compute their class membership probability
// to make this make sense, we have this intermediate value called "eta"
// which can be anything, and then we use the softmax formula to actually
// get to the class membership
// the reason that its 1 to K-1 is that the numbers have to sum to 1
// so we let the values for 2+ be whatever but set the first one to 1
for(k in 1:(K-1)) {
Eta[k, r] ~ normal(dot_product(Z[, r], Delta[k, ]), sigma_class);
}
// the first class is a special case since its eta = 1
memb_probs[1, r] = 1/(1 + sum(exp(Eta[, r])));
// the rest of the classes have etas that are whatever
// and we compute the result of the softmax to give us their membership probs
for(k in 1:(K-1)) {
memb_probs[k+1, r] = exp(Eta[k, r]) / (1 + sum(exp(Eta[, r])));
}
// for each class
for(k in 1:K) {
// for each potential outcome
// we compute the likelihood of making the choice that they made
// weighted by the probability of being in that class
for(s in 1:S){
target += memb_probs[k, r] * categorical_logit_lpmf(Y[r,s] | X[r,s] * Beta[,k]);
}
}
}
}
And some R script to generate some dummy data and feed into the Stan
model. NB: on random data optimizing
fails very early, presumably because there is nothing to optimize. With real (which I, unfortunately, cannot share) data it actually does optimize but not very well.
library(MASS)
library(tidyverse)
library(rstan)
generate_lc_mnl_data <- function(
R = 100, # number of respondens
S = 12, # number of tasks/respondent
C = 3, # number of concepts/task
A = 30, # number of choice covariates (nattrs * nlevels)
K = 4, # number of classes,
G = 2 # number of respondent covariates
) {
Y <- array(sample(1:C, R*S, replace = T), dim=c(R, S))
X <- array(rnorm(R*S*C*A), dim=c(R, S, C, A))
Z <- array(rnorm(G*R), dim=c(G, R))
return(list(C=C, A=A, R=R, K=K, S=S, G=G, Y=Y, X=X, Z=Z))
}
standata <- generate_lc_mnl_data()
stanmodel <- stan_model("inst/models/lc-mnl.stan")
modelfit <- sampling(
stanmodel,
data = standata,
verbose = T
)