Vectorizing Multinomial Logit Choice Model for GPU


I am fitting a multinomial logit model for 6,000 individuals with 30 choices each (6,000 x 30 = 180,000 rows), and it is taking far too long: days rather than hours.

My code follows a structure similar to the “Estimation of a Mixed Logit Model with RStan” part of Discrete choice models with RStan.

I suspect there are two bottlenecks:

  1. multi_normal_cholesky(), which loops over observations (I am running a regression within the model to account for endogeneity)
  2. computing the choice probabilities, which loops over buyers

Ideally, I’d like to write the model in Stan so that it can take advantage of a GPU. Is there any way I could do this? Any help would be appreciated!

Are you using reduce_sum as a first step? That should speed things up a bit.

Thank you @tomas.rossetti @adamhi

I will try this soon. In your experience, how much did this trick speed up the computation?

I haven’t measured the improvement, but it does speed things up a lot. It also obviously depends on the number of cores you have.

Having said that, the mixed logit model is pretty slow in general, so don’t expect it to run in seconds or even minutes. I hope the development team can identify sources of improvement, since there are a number of things that could be vectorized. Here’s one idea I posted a while back.

Yes, I was trying to use their “generalized” categorical logit, but I ran into the same problem described in your post. I will try reduce_sum first, and if that does not help, I may have to resort to hand-coding Gibbs + MH in MATLAB… Thank you!

I have done this before for work, so I cannot share the exact code. But I can say that you can vectorize the MNL probabilities by writing out the components of the log-likelihood (LL).

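Specifically, your input data should be in the LONG format (one row per individual and alternative), and then you can code out each individual’s LL contribution as: yA*UA + yB*UB + yC*UC - log(exp(UA)+exp(UB)+exp(UC)). UA, UB, UC are the utilities for choices A, B, and C, and yA, yB, yC are 0/1 indicators that equal 1 only for the alternative actually chosen, so the first part is simply the utility of the chosen alternative. Furthermore, you can use a log-sum-exp type trick for the denominator so that the exponentials do not overflow. Good luck.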
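
For anyone who wants to check the arithmetic, here is a minimal sketch of that long-format log-likelihood in plain Python (not Stan, and not the code from my work). Each individual contributes the utility of the chosen alternative minus the log-sum-exp of all of their utilities. The function names, the column names (person_id, utility, chosen) and the toy data are all made up for illustration. In Stan, the built-in log_sum_exp would take the place of the helper defined here.

```python
import math


def log_sum_exp(values):
    """Stable log(sum(exp(v))): subtract the max before exponentiating."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def mnl_log_likelihood(person_id, utility, chosen):
    """Long-format MNL log-likelihood.

    The three arguments are equal-length lists with one entry per
    (individual, alternative) row; chosen is 1 for the alternative picked
    by that individual and 0 otherwise.
    """
    utilities_by_person = {}
    chosen_utility = {}
    for pid, u, y in zip(person_id, utility, chosen):
        utilities_by_person.setdefault(pid, []).append(u)
        if y == 1:
            chosen_utility[pid] = u
    # Per individual: U_chosen - log(sum_j exp(U_j)), then sum over individuals.
    return sum(
        chosen_utility[pid] - log_sum_exp(us)
        for pid, us in utilities_by_person.items()
    )


# Hypothetical toy data: 2 individuals, 3 alternatives (A, B, C) each.
person_id = [1, 1, 1, 2, 2, 2]
utility = [1.0, 2.0, 0.5, 0.0, 0.0, 0.0]
chosen = [0, 1, 0, 1, 0, 0]

ll = mnl_log_likelihood(person_id, utility, chosen)
print(ll)
```

Subtracting the maximum before exponentiating is what keeps the denominator finite when utilities are large. The grouping loop here is only for readability; in a vectorized implementation the chosen-utility term is a single dot product of the indicator and utility columns.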