Feature request: new type signature for categorical_logit_glm

I specialize in discrete choice models (multinomial regressions, i.e., logit), and I’ve been unable to really take advantage of categorical_logit_glm when the logit of each outcome depends on different variables.

Take the following model:

N \text{ observations} \\ M \text{ possible outcomes} \\ K \text{ predictors}\\ y_i \in {1,...,M} \quad \forall i=1,...,N \\ X_i \in \Re^{M \times K} \\ \alpha \in \Re^M \\ \beta \in \Re^K \\ \Pr(y_i) = \text{softmax}(\alpha + X_i \beta)

This is very standard in economics and social science. If we assume (1) the utility derived from choosing m is equal to u_{im} = x_i'\beta + \varepsilon_{im}, (2) people maximize their utility, and (3) \varepsilon_{im} follow an Extreme Value Type I distribution, then the probability that i chooses m is equal to the m-th element of the softmax of all utilities.

Unless there’s something I’m not understanding from categorical_logit_glm’s function reference, it’s impossible to estimate this model it. Would it be possible to vectorize this setup?

Here’s how a script would look like:

data {
  int<lower=1> N; // n of observations
  int<lower=2> M; // n of alternatives 
  int<lower=1> K; // n of choice covariates
  
  array[N] int<lower=1, upper=M> choices; // choice situation for each observation 
  
  array[N] matrix[M, K] X; // choice covariates for each choice situation 
}
parameters {
  vector[M - 1] alpha_raw; // M-1 free alphas, one alpha is always = 0 
  vector[K] beta; // choice parameters
}
transformed parameters {
  vector[M] alpha = append_row(0, alpha_raw); // required for identifiability
}
model {
  alpha_raw ~ normal(0, 2);
  beta ~ normal(0, 2);
  
  // Now
  // for (i in 1:N) {
  //   target += categorical_logit_lpmf(choices[i] | alpha + X[i] * beta);
  // }
  
  // Potentially much faster!
  target += categorical_logit_glm_lpmf(choices | X, alpha, beta);
}

Thanks!

(Edit: forgot the _lpmf in the final line of code.)

I am not sure that this is the problem. It might just be a typo in your post. Still, there is a typo in the potentially faster code.

It should be either

target += categorial_logit_glm_lupmf(choices | X, alpha, beta)

or

choices ~ categorical_logit_glm(X, alpha, beta)

You’re totally right–although the signature issue persists. Thanks!

One thing that was confusing me is that you use N = # observations and M = # alternatives and the Stan function reference does it the other way around. I think your are right that what you want does not work because X needs to be matrix not an array (of matrices).