First, you should take @stijn’s advice and use a
multi_normal_cholesky. You should also vectorize the call to
multi_normal_cholesky—that way your covariance matrix is only factored once.
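As a minimal sketch of what that looks like (dimensions and variable names are assumptions, not from your model): factor the covariance once in the model block, then make a single vectorized call over all observations.

```stan
data {
  int<lower=1> N;
  int<lower=1> K;
  array[N] vector[K] y;
}
parameters {
  vector[K] mu;
  cholesky_factor_corr[K] L_Omega;
  vector<lower=0>[K] sigma;
}
model {
  // factor the covariance exactly once, outside any loop
  matrix[K, K] L_Sigma = diag_pre_multiply(sigma, L_Omega);
  // one vectorized call over all N observations
  y ~ multi_normal_cholesky(mu, L_Sigma);
  mu ~ normal(0, 1);
  sigma ~ normal(0, 1);
  L_Omega ~ lkj_corr_cholesky(2);
}
```

With the vectorized call, the Cholesky factor is shared across all N terms rather than being refactored per observation.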
Also as @stijn points out, if the model’s misspecified for the data (not a coding error, just not a good model for the data), it can be hard for Stan to sample.
You only want to compute
X[r, s] * Beta[ , r] once—store it and reuse the result. I don’t think we’ve vectorized categorical logit, but there’s not much shared computation here. Just like @stijn, I’d be worried about those last two linked regressions.
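Here's a sketch of the store-and-reuse pattern (the shapes are a guess—assuming `X[r, s]` is a design matrix and `Beta[, r]` a coefficient vector so the product is a vector of logits):

```stan
// hypothetical shapes: X[r, s] is (C x P), Beta is (P x R)
for (r in 1:R) {
  for (s in 1:S) {
    vector[C] eta = X[r, s] * Beta[, r];  // compute once, store
    y[r, s] ~ categorical_logit(eta);
    // reuse eta wherever else it appears instead of recomputing it
  }
}
```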
You want to turn Theta * Z into a single matrix multiplication, then break the result into an array to feed into categorical_logit.
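Something along these lines (again, names and sizes are assumptions): one matrix multiply, then slice the columns out into an array.

```stan
// hypothetical shapes: Theta is (K x D), Z is (D x N)
matrix[K, N] TZ = Theta * Z;      // single matrix multiplication
array[N] vector[K] tz;
for (n in 1:N) {
  tz[n] = TZ[, n];                // column n becomes one array element
}
// then, e.g., y[n] ~ categorical_logit(tz[n]);
```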
Pinning one of the categorical logit parameters to zero (by pinning the relevant coefficient to zero) is often done to induce identifiability—the K-value parameterization is only identified through the prior.
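A sketch of that pin-to-zero construction (names are mine): declare K − 1 free coefficients and fix the last one to zero, so only the free ones get a prior and the posterior is identified.

```stan
parameters {
  vector[K - 1] beta_free;
}
transformed parameters {
  // last category's coefficient pinned to zero for identifiability
  vector[K] beta = append_row(beta_free, 0);
}
model {
  beta_free ~ normal(0, 1);
  // ... likelihood using beta, e.g. categorical_logit terms
}
```

The alternative—keeping all K coefficients and relying on a proper prior—works too, but identification then comes entirely from the prior, as noted above.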