Yes, it's the stick-breaking process in the model, but I implemented it in a centered way so that (0, 0, ..., 0) on the unconstrained scale leads to a symmetric simplex (i.e., representing a uniform distribution).
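Roughly what that centering looks like, sketched in Python (this mirrors the idea of offsetting each break point by -log(K - k) so zeros give the uniform simplex; it is a sketch of the technique, not the actual library implementation):

```python
import numpy as np

def inv_logit(x):
    return 1.0 / (1.0 + np.exp(-x))

def simplex_transform(y):
    """Map an unconstrained vector y in R^(K-1) to a point on the
    K-simplex via stick breaking.  The -log(K - k) offset centers the
    transform so y = (0, ..., 0) maps to (1/K, ..., 1/K)."""
    K = len(y) + 1
    x = np.empty(K)
    remaining = 1.0  # stick length left to break
    for k in range(K - 1):
        # break proportion; at y[k] = 0 this equals 1 / (K - k)
        z = inv_logit(y[k] - np.log(K - k - 1))
        x[k] = remaining * z
        remaining -= x[k]
    x[K - 1] = remaining
    return x
```

With `y = np.zeros(K - 1)` this returns the symmetric simplex, and any real-valued `y` yields a valid simplex (positive entries summing to 1).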
All of our transforms map down to the natural constrained scales. It would be possible to provide a log_simplex data structure, either as a built-in or as a manual addition.
Yes, the ordering can matter when K is in the thousands. There was a thread about this on Discourse, but I can't find it.
You can work with completely unconstrained values (log odds) directly through something like our
categorical_logit distribution. Why would you want to go to just the log scale?
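For concreteness, categorical_logit is just a categorical distribution parameterized by unnormalized log odds; here is a stable Python sketch of its log pmf (my own illustration, not library code):

```python
import numpy as np

def categorical_logit_lpmf(k, logits):
    """Log probability of category k (0-based) given unconstrained
    log odds: log softmax(logits)[k], computed stably by subtracting
    the max before exponentiating."""
    z = logits - np.max(logits)
    return z[k] - np.log(np.sum(np.exp(z)))
```

The `logits` vector is completely unconstrained, which is why no simplex transform (or its Jacobian) is needed on this path.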
Splitting in a binary rather than linear fashion sounds like it could be promising if the Jacobians are more stable. The splits could still be arranged so that everything's lower triangular and the determinant remains tractable. Usually underflow isn't an issue, though, if that's what you're worried about.
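A minimal sketch of what binary splitting could look like, assuming a balanced tree where each of the K - 1 internal nodes consumes one unconstrained parameter (hypothetical illustration of the idea, not an existing transform):

```python
import numpy as np

def inv_logit(x):
    return 1.0 / (1.0 + np.exp(-x))

def binary_simplex(y, K):
    """Recursively split unit mass over K outcomes with a balanced
    binary tree.  Each internal node uses one parameter from y, and
    the log(left/right) offset centers the split by subtree size, so
    y = (0, ..., 0) gives the uniform simplex."""
    y = list(y)  # consumed in preorder; K - 1 parameters total
    def split(mass, n):
        if n == 1:
            return [mass]
        left_n = n // 2
        # at y = 0, z = left_n / n: mass proportional to subtree size
        z = inv_logit(y.pop(0) + np.log(left_n / (n - left_n)))
        return split(mass * z, left_n) + split(mass * (1 - z), n - left_n)
    return np.array(split(1.0, K))
```

Each leaf's mass is a product of only about log2(K) break proportions rather than up to K - 1 of them, which is the intuition for why the stability might improve.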
It would be easy enough to try. The code for the transforms is all very modular in the implementations (look for