Mixed Logit Model

TOM · March 8, 2018, 4:22pm

Hi
Before setting up my own mixed logit model, I wanted to ask whether such a model already exists. By mixed logit I mean a multi-logit regression (see Stan manual, page 133) that includes both alternative-specific variables and individual-specific variables.
Thanks for any hints,
Tom

martinmodrak · March 10, 2018, 12:37pm

I am not sure I understand what kind of model you mean (I don’t know what you mean by “alternative specific” and “individual” variables). I would, however, definitely check whether your use case is covered by brms. At the very least, you could build a model as close to your aims as possible with brms and then use the code generated by brms as a starting point.

TOM · March 13, 2018, 4:36pm

There seems to be a lot of confusion about terminology, at least for me. By mixed logit I refer to what others call a general model that combines features of a multinomial logit and a conditional logit. A formulation that may help to clarify what I mean is given here (see section 6.3.4).

@martinmodrak I certainly will look into brms.

James_Savage · March 14, 2018, 3:01am

Hi Tom,

Yep – very straightforward. Here’s a gist that implements the model with only choice-level variables, that is, attributes that vary across choices:

https://gist.github.com/khakieconomics/0333a54dff4fabf6204080ca5bf92cb6

hierarchical_bayes_full_covmat.stan

data {
  int N; // number of rows
  int T; // number of individual-choice sets/task combinations
  int I; // number of Individuals
  int P; // number of covariates
  
  vector<lower = 0, upper = 1>[N] choice; // binary indicator for choice
  matrix[N, P] X; // product attributes
  
  int task[T]; // index for tasks

This file has been truncated.

If you want to include variables that vary by individual, you simply need to add, on line 37, those variables multiplied by their coefficient matrix, as in multinomial logit.
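One way to write down the utility this describes, as a sketch rather than the gist’s own notation: the symbols z_i (individual-level variables) and γ_j (their alternative-specific coefficients) are assumptions introduced here for illustration, and β is written without an individual subscript for brevity.

```latex
u_{ij} = x_{ij}^\top \beta + z_i^\top \gamma_j, \qquad \gamma_J = 0, \qquad
\Pr(y_i = j) = \frac{\exp(u_{ij})}{\sum_{k=1}^{J} \exp(u_{ik})}
```

Fixing one γ_j at zero (here the last) is the usual multinomial-logit normalisation.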

Here’s a slide-deck that might help.

http://rpubs.com/jimsavage/sawtoothcon

TOM · March 14, 2018, 12:19pm

@James_Savage Thanks for the hint. Is there toy data to play around with the model? I’m asking because it is difficult to follow how you subset your data in line 37.

James_Savage · March 14, 2018, 4:33pm

Hi Tom,

Here are a couple of worked examples for you.

http://rpubs.com/jimsavage/using_decisionmaker_variables_in_mixed_logit

If you don’t want to include individual-level parameters on the choice-level attributes you can basically delete the second half of the definition of beta_individual (make sure to remove the tau and L_Omega parameters in that case).

Note that I strongly recommend the second version of the model, as it’s more flexible, both in the types of choice patterns it can fit and in that each task/market/comparison set can have different possible choices characterised only by their attributes, while still allowing demographics to influence decision probabilities.

TOM · March 15, 2018, 1:19pm

Hi James

Thanks!

I had a hard time getting my head around your indexing part. So I tried it with a different style of indexing. If I’m not mistaken, it includes individual data (demographics) as well as attributes of the alternatives. My model looks like this:

https://github.com/ThomasWilli/mixed-logit/blob/master/stan/mnl_cl.stan

data {
	int<lower=2> C;                   // # of alternatives
	int<lower=1> N;	                  // # of individuals
	int<lower=1> K;                   // # of covariates of individuals
	int<lower=1> KC;                  // # of covariates of alternatives
	vector<lower=0,upper=1>[C] Z[N];  // # index if alternative was considered 
	matrix[C, K] X[N];                // matrix of attributes for each individual
	matrix[C, KC] X2[N];              // matrix of attributes for each alternatives
	vector<lower = 0, upper = 1>[C] choice[N]; // binary indicator for final choice
}

parameters {
	vector[K] beta;       // individual attributes
	vector[KC] zeta;      // party attributes
}
model {
  matrix[C,N] xb;
  vector[C] utilities[N];
  vector[C] log_prob[N];

This file has been truncated.

It seems to me that this model is much sparser. In other words, what am I missing here?

James_Savage · March 15, 2018, 5:06pm

Hi Tom -

First, this model is not identified. As you have it, individual attributes affect the utility of all choices to the same degree (i.e., they simply shift the utility of every choice by the same amount, which does not change their rank ordering). Your beta needs to be a matrix (probably with the P-1 parameterization, which can be achieved by setting the last row to 0).
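The identification point can be checked with one line of algebra. A sketch, assuming utilities u_ij from the alternative attributes and a single coefficient vector β on the individual attributes z_i (these symbols are illustrative, not taken from the model file):

```latex
\frac{\exp(u_{ij} + z_i^\top \beta)}{\sum_{k=1}^{C} \exp(u_{ik} + z_i^\top \beta)}
= \frac{\exp(z_i^\top \beta)\,\exp(u_{ij})}{\exp(z_i^\top \beta)\sum_{k=1}^{C} \exp(u_{ik})}
= \frac{\exp(u_{ij})}{\sum_{k=1}^{C} \exp(u_{ik})}
```

The term in β cancels, so the likelihood carries no information about it. With alternative-specific coefficients β_j and one of them fixed at zero, the term no longer cancels.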

Second, it assumes that all individuals have the same marginal utilities, so you’ll get the red bus/blue bus problem. That mightn’t be a concern for you.
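For readers who have not met the red bus/blue bus problem, here is a minimal numerical sketch in Python (used only to check the arithmetic; it is not Stan code). The utilities are made-up numbers and `choice_probs` is a hypothetical helper. It shows that, with coefficients common to everyone, adding a duplicate alternative leaves the odds between the existing alternatives unchanged.

```python
import math

def choice_probs(utilities):
    """Multinomial logit probabilities for a dict {alternative: utility}."""
    m = max(utilities.values())  # subtract the max for numerical stability
    expu = {k: math.exp(v - m) for k, v in utilities.items()}
    total = sum(expu.values())
    return {k: v / total for k, v in expu.items()}

# Hypothetical utilities: car and red bus are equally attractive.
before = choice_probs({"car": 1.0, "red_bus": 1.0})

# Add a blue bus that is identical to the red bus in every attribute.
after = choice_probs({"car": 1.0, "red_bus": 1.0, "blue_bus": 1.0})

# The car/red-bus odds are the same before and after (IIA).
odds_before = before["car"] / before["red_bus"]
odds_after = after["car"] / after["red_bus"]

print(before["car"], after["car"], odds_before, odds_after)
```

The car share drops from one half to one third, although intuition says the two buses should simply split the bus share. Letting the coefficients vary across individuals is one way to relax this at the population level.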

Third, with logits, normal(0, 5) is an enormously wide prior.
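To get a feel for how wide that prior is, here is a small Python check, assuming two alternatives that differ by one unit in a single attribute (the one-unit scale is an assumption; rescale for your own data). The probability of choosing the better alternative is then the inverse logit of the coefficient.

```python
import math

def inv_logit(x):
    """Logistic function: probability implied by a log-odds value."""
    return 1.0 / (1.0 + math.exp(-x))

# Compare a normal(0, 1) and a normal(0, 5) prior on the coefficient by looking
# at the choice probability implied by a coefficient 1 or 2 prior sds from zero.
for prior_sd in (1.0, 5.0):
    p_one_sd = inv_logit(1 * prior_sd)
    p_two_sd = inv_logit(2 * prior_sd)
    print(f"prior sd {prior_sd}: P(choose) at +1 sd = {p_one_sd:.4f}, at +2 sd = {p_two_sd:.6f}")
```

Under normal(0, 5), a coefficient only one prior standard deviation from zero already implies a choice probability above 0.99 for a one-unit difference, so much of the prior mass sits on near-deterministic choices.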

Hope this helps!

jeremy.koster · March 15, 2018, 9:16pm

Hi James,

This is great stuff and potentially very useful in a variety of contexts. One thing that social scientists often encounter with their data is repeated observations of the same individuals. To borrow from your beer example (in the slides), imagine that you have data on the age and salary of the individuals consuming the beer. And then imagine that you see their respective beer choices on approximately five separate occasions, perhaps each characterized by time-varying covariates (amount of stress reported by beer drinker i before making the choice).

What are the chances of expanding the example to account for these variables (presumably with random effects for the repeated observations of individuals)?

James_Savage · March 15, 2018, 9:24pm

Hey Jeremy –

That’s precisely the sort of model I fit every day! See the second example here, which says that individual characteristics -> preferences over choice attributes, which when combined with choice attributes give us choice probabilities.

http://rpubs.com/jimsavage/using_decisionmaker_variables_in_mixed_logit

jeremy.koster · March 15, 2018, 9:28pm

Oh cool. One slight hurdle, though. I started working through your simulated data script and hit an error in the creation of the indexes data frame:

Error in data_frame(individual = rep(1:I, each = K * 10), task = rep(1:T, :
could not find function “%>%”

Am I overlooking a needed package? Or is there something else about the syntax that I’m overlooking?

James_Savage · March 15, 2018, 9:32pm

Ah – you’ll need to install.packages("dplyr") and library(dplyr) before you run the script.

jeremy.koster · March 15, 2018, 9:35pm

Got it. Thanks.

jeremy.koster · March 15, 2018, 9:54pm

If I’m following along correctly, the attributes of the products vary across tasks (that is, there are time-varying covariates for the products). But the X2 variables for the 50 individuals are assumed to be time-invariant, correct?

How straightforward is it to add time-varying covariates for the individuals – perhaps measures of emotional states or variation in monetary endowments at time t or something like that?

James_Savage · March 15, 2018, 10:14pm

Hey Jeremy,

Yep – difficult, but in theory at least possible. I guess you could model each individual-time pair, and presumably have an individual-level hyperprior.

Something like

beta_it ~ normal(beta_i + Gamma * Z_it', sigma)
beta_i ~ normal(beta, sigma_2)

I’m pretty sure it would work, but might take some time!
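Written out as a model, the two lines above might look like the sketch below. The last line, linking the coefficients to choice probabilities through attributes x_itj, is an assumption added for completeness. Since the second argument of Stan's normal is a standard deviation, sigma and sigma_2 appear squared in the covariance.

```latex
\beta_{it} \mid \beta_i \sim \mathcal{N}\!\left(\beta_i + \Gamma z_{it},\ \sigma^2 I\right), \qquad
\beta_i \sim \mathcal{N}\!\left(\beta,\ \sigma_2^2 I\right), \qquad
\Pr(y_{it} = j) = \frac{\exp(x_{itj}^\top \beta_{it})}{\sum_{k} \exp(x_{itk}^\top \beta_{it})}
```

Here i indexes individuals, t indexes choice occasions, and z_it holds the time-varying covariates of individual i.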

jeremy.koster · March 15, 2018, 10:33pm

In their 1989 book, McCullagh and Nelder note the equivalence of conditional multinomial models and Poisson models. Prior to seeing your post, I was looking into the possibility of using a Poisson parameterization. The downside seems to be the need for binary dummy variables, one for each person making a choice, which becomes unwieldy with large sample sizes.
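The equivalence, and the reason for the dummy variables, can be sketched as follows. The notation is assumed here: i indexes a choice occasion, j an alternative, u_ij the utility, and α_i the coefficient on the dummy for occasion i. If the counts y_ij are independent Poisson with

```latex
y_{ij} \sim \mathrm{Poisson}(\lambda_{ij}), \qquad \lambda_{ij} = \exp(\alpha_i + u_{ij}),
```

then, conditional on the total n_i = \sum_j y_{ij} (equal to 1 for a single choice), the counts are multinomial with

```latex
\pi_{ij} = \frac{\lambda_{ij}}{\sum_{k} \lambda_{ik}}
= \frac{\exp(\alpha_i)\exp(u_{ij})}{\exp(\alpha_i)\sum_{k} \exp(u_{ik})}
= \frac{\exp(u_{ij})}{\sum_{k} \exp(u_{ik})}.
```

The α_i cancel from the choice probabilities, but the Poisson formulation still has to estimate one per choice occasion, which is where the large number of dummies comes from.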

Your comment about the challenges of adding time-varying individual-level predictors makes me wonder if the Poisson might have some advantages . . . even though my hunch is that your approach remains preferable. I was also looking into multilevel conditional logit models using gllamm in Stata.

Meanwhile, my initial read on the individual-time effect is that it has the advantage of conservatism, almost like a random coefficient (or random slopes) model. But different disciplines might vary in terms of its necessity.

James_Savage · March 16, 2018, 6:38pm

Hi Jeremy,

It really depends what you are modeling. In Stan at least, I’ve found the Poisson implementation to be a fair bit slower than the multinomial likelihood, but that might just be my coding of it. But that’s not the model–the model is one of individual choice. And you want to choose the model that makes best sense of your data. In most discrete choice problems this will be conditional logit with parameters varying at the level of the individual and the possibility of unobserved demand shocks that correlate with your choice attributes. If you only have aggregate counts data, you can still fit the model as illustrated here:

Cheers,
Jim

jeremy.koster · March 16, 2018, 8:08pm

Cool. And the translation of terminology across disciplines is helpful.

Thanks.

daniel_guhl · March 18, 2018, 2:07pm

Hi Tom,
Once the X matrix is specified correctly, estimating models with alternative- and individual-specific variables is easy. One simple solution that is often applied in the econometrics literature is to use one alternative as a reference (e.g., j = 1). That is, for each individual-specific variable you build J - 1 alternative-specific versions (e.g., income_2, …, income_J), where income_j equals the individual’s income in the row for alternative j and is zero in the rows for all other alternatives.
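The construction described above can be sketched in a few lines. This is a minimal Python illustration, not the R code from the gist: the function name `design_rows`, the column layout and the numbers are hypothetical, and alternative-specific intercepts are left out to keep it short.

```python
def design_rows(alt_attrs, income):
    """Build the J rows of X for one individual.

    alt_attrs: list of J alternative-specific values (e.g. price of each alternative).
    income:    a single individual-specific value.
    Columns:   [price, income_2, ..., income_J]; alternative 1 is the reference.
    """
    J = len(alt_attrs)
    rows = []
    for j in range(J):  # j = 0 is the reference alternative
        # income_k is non-zero only in the row that belongs to alternative k
        income_cols = [income if j == k else 0.0 for k in range(1, J)]
        rows.append([alt_attrs[j]] + income_cols)
    return rows

X_i = design_rows(alt_attrs=[10.0, 12.5, 8.0], income=3.2)
for row in X_i:
    print(row)
```

Stacking these blocks over individuals gives an X with N*J rows, and a single coefficient vector then covers both kinds of variables.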

I’ve added an example using R and the TravelMode data from the AER package. As a comparison, I used mlogit. mFormula() makes it very easy to specify a formula with both kinds of variables, and you can then use model.matrix() to create X. I hope using R for this example is OK for you. The Stan program is included in the R file and works for multinomial, conditional, and “mixed” logit models, as long as you specify X correctly. Please also have a look at the mlogit vignette on setting up X.

https://gist.github.com/dhguhl/46bd5dabeb5bb1931e81e1b0354cfe3f

PS: I do not like the term “mixed logit” for this type of model because it is also (maybe more often) used for multinomial logit models with individual-specific parameters. However, my econometrics professor also always used it ambiguously… very confusing.

PPS: I’m not saying my prior choice is the best, the Stan program is perfect, or assuming homogeneous preferences is a good idea… just tried to keep the example as simple as possible!

Best,
Daniel

TOM · March 19, 2018, 10:05am

Hi Daniel

Thanks for sharing this. If I understood you correctly, it is indeed very easy to estimate such a model. I guess your approach has the same effect as what @James_Savage does by setting “outside options” to zero (I’m still working on implementing his approach, though). I adapted your example to my data and used R as well (using mlogit.data, mFormula and model.matrix). Additionally, I’ve added a vector that sets alternatives that are not viable for individual i to zero.

https://github.com/ThomasWilli/mixed-logit/blob/master/stan/mnl_cl_v2.stan

data {
  int<lower=1> N;                   // # of individuals
  int<lower=1> K;                   // # of predictors
  int<lower=2> J;                   // number of available alternatives
  int<lower=1, upper=J> y[N];       // outcome of chosen alternative
  matrix[N*J, K] x;                 // predictor matrix
  vector<lower=0,upper=1>[J] Z[N];  // # index if alternative was considered 
}
parameters {
  vector[K] beta;
}
model {
  vector[J] utilities[N];
  beta ~ normal(0, 5);

  for (n in 1:N) {
    utilities[n] = x[((n-1)*J+1):(n*J),] * beta;
    utilities[n] = utilities[n] .* Z[n];
    y[n] ~ categorical(softmax(utilities[n]));
  }

This file has been truncated.

PS: The terminology is indeed confusing.
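One thing worth checking in the masking step, as far as the truncated code shows: multiplying a utility by a 0/1 indicator sets it to 0, and exp(0) = 1, so under softmax the alternative still receives positive probability. A minimal Python sketch with made-up numbers (used only to check the arithmetic) compares that with setting the utility to negative infinity:

```python
import math

def softmax(u):
    """Choice probabilities from a list of utilities; -inf entries get probability 0."""
    m = max(u)
    e = [math.exp(v - m) for v in u]  # math.exp(-inf) evaluates to 0.0
    s = sum(e)
    return [v / s for v in e]

utilities = [1.0, -0.5, 2.0]
available = [1, 0, 1]  # the second alternative is not in this person's choice set

# Multiplying by the indicator: the utility becomes 0, which is still finite.
p_multiplied = softmax([u * z for u, z in zip(utilities, available)])

# Masking with -inf: the alternative drops out of the choice set.
p_masked = softmax([u if z == 1 else -math.inf for u, z in zip(utilities, available)])

print(p_multiplied)
print(p_masked)
```

In Stan, a comparable effect could probably be had by adding log(Z[n]) to the utilities instead of multiplying, since log(0) is negative infinity. Treat that as a suggestion to test, not as part of the original program.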
