First, you should take @stijn’s advice and use `multi_normal_cholesky`. You should also vectorize the call to `multi_normal_cholesky`—that way your covariance matrix is only factored once.
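The vectorized form looks something like this (a sketch, assuming `y` is declared as an array of `N` vectors sharing one Cholesky factor `L`):

```stan
// assuming: array[N] vector[K] y;  vector[K] mu;  cholesky_factor_cov[K] L;
// one vectorized statement instead of a loop over n, so the
// log determinant of L is evaluated once rather than N times
y ~ multi_normal_cholesky(mu, L);
```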

Also, as @stijn points out, if the model is misspecified for the data (not a coding error, just not a good model for the data), it can be hard for Stan to sample.

You only want to compute `X[r, s] * Beta[ , r]` once—store it and reuse the result. I don’t think we’ve vectorized categorical logit, but there’s not much shared computation here. Just like @stijn, I’d be worried about those last two linked regressions.
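For instance, you can cache the products in a transformed parameter (a sketch with hypothetical dimensions `R` and `S`, assuming `X[r, s]` is a row vector and `Beta[ , r]` a matching column vector):

```stan
transformed parameters {
  matrix[R, S] eta;
  for (r in 1:R)
    for (s in 1:S)
      eta[r, s] = X[r, s] * Beta[, r];  // computed once, reused in the model block
}
```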

You want to turn Theta * Z into a matrix multiplication, then break the result into an array to feed into `multi_normal_cholesky`.
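Something along these lines (a sketch assuming `Theta` is `K x J` and `Z` is `J x N`):

```stan
matrix[K, N] Mu = Theta * Z;       // one matrix multiply for all N means
array[N] vector[K] mu;
for (n in 1:N)
  mu[n] = Mu[, n];                 // split columns into an array of vectors
y ~ multi_normal_cholesky(mu, L);
```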

Pinning one of the categorical logit parameters to zero by pinning the relevant coefficient to zero is often done to induce identifiability—the K-value parameterization is only identified through the prior.
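A minimal sketch of the pinned parameterization, assuming a `K`-category outcome and a hypothetical scalar predictor `x[n]`:

```stan
parameters {
  vector[K - 1] beta_free;                      // only K - 1 free coefficients
}
transformed parameters {
  vector[K] beta = append_row(0, beta_free);    // first category pinned to zero
}
model {
  for (n in 1:N)
    y[n] ~ categorical_logit(beta * x[n]);      // identified without relying on the prior
}
```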