Oh, I see! The problem is this: softmax basically removes one degree of freedom, i.e. a multinomial with N classes is fully described by N-1 parameters (because the probabilities need to sum to 1). Put differently, adding the same constant to every softmax input leaves the output unchanged, so N free inputs are not identified. This is actually what the simplex type does under the hood: simplex[3] is represented by two parameters on the unconstrained scale. So to make the model work with softmax, you can either fix one of the softmax inputs to 0 (this is the typical way to do multinomial regression) or constrain the parameters to sum to zero (this is a bit more tricky, some discussion at "Test: Soft vs Hard sum-to-zero constrain + choosing the right prior for soft constrain").
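To make the lost degree of freedom concrete, here is a small plain-Python sketch (not Stan code, just an illustration). It checks that shifting all softmax inputs by a constant gives the same probabilities, and then builds 3 inputs from only 2 free parameters in the two ways mentioned above. The helper names pin_last_to_zero and sum_to_zero are made up for this example, and picking the last element as the pinned one is an arbitrary choice.

```python
import math

def softmax(x):
    # Subtract the max before exponentiating for numerical stability.
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

def pin_last_to_zero(free):
    # Hypothetical helper: N-1 free values, last input fixed at 0.
    return list(free) + [0.0]

def sum_to_zero(free):
    # Hypothetical helper: N-1 free values, last input is minus their sum.
    return list(free) + [-sum(free)]

# Shifting every input by the same constant does not change the output,
# which is why N unconstrained inputs are not identified.
raw = [0.5, -1.2, 2.0]
shifted = [v + 10.0 for v in raw]
p_raw = softmax(raw)
p_shifted = softmax(shifted)

# Two free parameters are enough to describe a 3-class probability vector.
free = [0.5, -1.2]
p_pinned = softmax(pin_last_to_zero(free))
p_centered = softmax(sum_to_zero(free))

print(p_raw)
print(p_shifted)
print(p_pinned)
print(p_centered)
```

Both constructions produce a valid probability vector from the same two numbers, but they are different parameterizations, so the same free values map to different probabilities and a prior on them means something different in each case.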
Does that make sense?