From what I understand, Stan’s softmax is just

```
softmax <- function(x) {
  exp(x) / sum(exp(x))
}
```

But Koster and McElreath in

Koster, J., & McElreath, R. (2017). Multinomial analysis of behavior: statistical methods. *Behavioral Ecology and Sociobiology*, *71*(9), 1-14.

use an alternative (one that shifts x so that its maximum is 0) in a custom link function

```
softmax2 <- function(x) {
  x <- max(x) - x  # distance from the maximum; the largest element becomes 0
  exp(-x) / sum(exp(-x))
}
```

despite drawing samples with Stan’s

```
categorical_logit( x )
```
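As a quick sanity check, here is a minimal sketch comparing the two functions in R. The input vectors are made-up values for illustration, not taken from the paper. Algebraically, `exp(x - max(x)) / sum(exp(x - max(x)))` equals `exp(x) / sum(exp(x))`, so I would expect the two to agree whenever `exp(x)` does not overflow:

```r
# Naive softmax, same as the first definition above
softmax <- function(x) {
  exp(x) / sum(exp(x))
}

# Max-shifted version, same as the second definition above
softmax2 <- function(x) {
  x <- max(x) - x
  exp(-x) / sum(exp(-x))
}

x_small <- c(0.5, -1.2, 2.0)    # made-up, moderate values
x_large <- c(1000, 1001, 1002)  # made-up, large enough that exp() overflows

# Moderate inputs: the two results agree up to floating-point error
print(all.equal(softmax(x_small), softmax2(x_small)))

# Large inputs: exp(1000) is Inf in double precision, so Inf / Inf gives NaN
print(softmax(x_large))

# The shifted version only ever exponentiates values <= 0, so it stays finite
print(softmax2(x_large))
```

On the large inputs the naive version returns `NaN` for every element, while the shifted version returns finite probabilities that sum to 1.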

I understand why the normalization, which avoids overflow when exponentiating large values of x, might be favourable, but they obtain different counterfactual predictions from the posterior samples when using the alternative function. Is there something I am missing?