Is there a way to do softmax(matrix), not just softmax(vector)?

I am trying to compute the logit probabilities for a hierarchical multinomial regression from an NxK matrix. Ideally, I would like to vectorize without looping through N. But when I get to the softmax step, it does not accept a matrix. Is there a way to handle this?


The softmax function is not defined for matrices, or for row_vectors for that matter. You would have to write a loop, perhaps over the columns of the transpose of the NxK matrix.
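
A minimal sketch of such a loop, wrapped in a user-defined function. The name `softmax_rows` is hypothetical (it is not a Stan built-in); the sketch transposes each row into a vector, applies the built-in `softmax`, and transposes the result back, which has the same effect as looping over the columns of the transposed matrix.

```stan
functions {
  // Hypothetical helper, not part of Stan: applies softmax to each row of x,
  // so every row of the result is a simplex (stored as a row_vector).
  matrix softmax_rows(matrix x) {
    matrix[rows(x), cols(x)] p;
    for (n in 1:rows(x)) {
      // x[n] is a row_vector; softmax needs a vector, so transpose in and out.
      p[n] = softmax(x[n]')';
    }
    return p;
  }
}
```

The loop is still there, it is only hidden inside the function, so this tidies the model block rather than making it faster.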

Hmm… Is there a way to use the multinomial distribution without the softmax function? I compute the utilities for each of the alternatives and then just apply y[n]~multinomial(softmax(utilities)). Any suggestions would be appreciated.

There is categorical_logit but no multinomial_logit. If you look at the code for the former, it would be easy to adapt to the multinomial case.
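
One way such an adaptation might look, as a sketch written from the definition of the multinomial rather than copied from Stan's source: increment `target` with the counts times `log_softmax` of the utilities. The data layout (`x`, `P`, `beta`), the prior, and the array declaration syntax of recent Stan releases (older releases write `int y[N, K]`) are all assumptions made for illustration.

```stan
data {
  int<lower=1> N;               // observations
  int<lower=2> K;               // alternatives
  int<lower=1> P;               // hypothetical number of predictors
  matrix[N, P] x;               // hypothetical design matrix
  array[N, K] int<lower=0> y;   // counts per alternative for each observation
}
parameters {
  matrix[P, K] beta;            // hypothetical coefficients
}
model {
  matrix[N, K] utilities = x * beta;
  to_vector(beta) ~ normal(0, 5);   // placeholder prior, an assumption
  for (n in 1:N) {
    // Multinomial log-pmf on the logit scale, up to the multinomial
    // coefficient, which does not depend on the parameters:
    // sum over k of y[n, k] * log_softmax(utilities[n])[k]
    target += dot_product(to_vector(y[n]), log_softmax(utilities[n]'));
  }
}
```

Working with `log_softmax` avoids exponentiating and then taking the log again, which is the same reason to prefer `categorical_logit` over `categorical(softmax(...))`.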

Can categorical_logit give vectorized samples in the output? It is not clear from the manual.

Also, how would I adapt it to make multinomial_logit? Can you provide the source code, if I want to create it myself?


I am not sure what you mean, but I assume you are asking whether categorical_logit_lpmf can input an integer array of outcomes over the observations, in which case the answer is yes.

Sorry. This is what I meant to ask:

y~normal(mu, sigma);
// y can be a real, a vector, or a row_vector; mu and sigma are of the same type with the same dimensions.

Is the following allowed?

// y is a vector with length N; x is a NxK matrix, where K is the number of alternatives

y can be an integer array (not a vector) of size N but x must be a simplex. There is a PR to do something closer to what you are looking for:

I think he means something like:

int[,] y = softmax(matrix);

The bottleneck is (I think) the switching between the max calculation and the arithmetic in a loop of log_sum_exp calls. That’s why I was hoping for GPU support later on; then softmax could be sped up as well.
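
For reference, the pattern of a max calculation followed by arithmetic is the usual numerically stable way to evaluate log_sum_exp. It is written out here as an assumption about what is meant, not as a transcription of Stan's implementation, with $x$ a vector of $K$ utilities:

$$
\operatorname{log\_sum\_exp}(x) = m + \log \sum_{k=1}^{K} \exp(x_k - m), \qquad m = \max_{k} x_k
$$

The softmax then follows as $\operatorname{softmax}(x)_k = \exp\bigl(x_k - \operatorname{log\_sum\_exp}(x)\bigr)$, so each row of an NxK matrix needs one max pass and one exp/sum pass.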
