Prior with strong structure on square blocks of rectangular matrix

I am training a multi-logit model in which beta ends up structured as a stack of square blocks, with one prior for each block's diagonal and a separate prior for its off-diagonal terms.
e.g. my prior on each element of beta would be

(0, uniform(-inf, inf), uniform(-inf, inf),
 0, N(c_mu, c_sigma),   N(0, 5),
 0, N(0, 5),            N(c_mu, c_sigma),
 0, N(o_mu, o_sigma),   N(0, 4),
 0, N(0, 4),            N(o_mu, o_sigma)).

Let's say these square blocks were K x K instead of 2 x 2 as they are here. Is there a better way to declare my priors in the model (in terms of helping Stan compute the gradient faster) than to loop through beta and use if statements to assign the appropriate prior to each element?

data {
  int N;
  int K; // K categories
  int D; // D predictors
  int y[N];
  matrix[N, D] x;
}
transformed data {
  vector[D] zeros = rep_vector(0, D);
}
parameters {
  real c_mu;
  real<lower=0> c_sigma;
  real o_mu;
  real<lower=0> o_sigma;
  matrix[D, K-1] beta_raw;
}
transformed parameters {
  matrix[D, K] beta = append_col(zeros, beta_raw);
}
model {
  matrix[N, K] x_beta = x * beta;
  // For clarity I wrote out every element here; my current approach is to
  // loop through each element and use branching logic to assign the prior
  // (see the sketch after this listing).
  beta[2, 2] ~ normal(c_mu, c_sigma);
  beta[2, 3] ~ normal(0, 5);
  beta[3, 2] ~ normal(0, 5);
  beta[3, 3] ~ normal(c_mu, c_sigma);
  beta[4, 2] ~ normal(o_mu, o_sigma);
  beta[4, 3] ~ normal(0, 4);
  beta[5, 2] ~ normal(0, 4);
  beta[5, 3] ~ normal(o_mu, o_sigma);
  for (n in 1:N)
    y[n] ~ categorical_logit(x_beta[n]');
}
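For concreteness, the branching version I'm currently using looks roughly like the sketch below (assuming rows 2:D form stacked (K-1) x (K-1) blocks under the flat-prior first row, with the c prior on the first block and the o prior on the rest; the index arithmetic is illustrative, not my exact code):

```stan
// Sketch of the current branching approach, generalized to stacked
// (K-1) x (K-1) blocks; block 1 gets the c prior, later blocks the o prior.
for (d in 2:D) {
  int i = (d - 2) % (K - 1) + 1;            // row position within its block
  int b = (d - 2 - (i - 1)) / (K - 1) + 1;  // which stacked block
  for (k in 2:K) {
    if (i == k - 1) {                       // block-diagonal element
      if (b == 1)
        beta[d, k] ~ normal(c_mu, c_sigma);
      else
        beta[d, k] ~ normal(o_mu, o_sigma);
    } else {                                // off-diagonal element
      if (b == 1)
        beta[d, k] ~ normal(0, 5);
      else
        beta[d, k] ~ normal(0, 4);
    }
  }
}
```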

EDIT:
Relatedly, if I have a square block in my matrix with the same prior for each element, e.g.

(N(0, 5), N(0, 5), N(0, 5),
 N(0, 5), N(0, 5), N(0, 5),
 N(0, 5), N(0, 5), N(0, 5)),

is there a way to vectorize that declaration? Something like beta[10:13][1:3] ~ N(0, 5)?

Bump - I would really appreciate any help with this! I read the efficiency section of the Stan user's guide, which says to loop over columns instead of rows (matrices are stored column-major), but that change made my code 20% slower…

You can do `to_vector(beta[10:13, 1:3]) ~ normal(0, 5);`. On the other hand, prior computation is very unlikely to be the bottleneck in a model. Unless you've run the model multiple times, a 20% increase in time could easily be random fluctuation due to random chain initialization, especially for a problematic model.
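If you want to vectorize the full diagonal/off-diagonal structure from your first post, one option is to precompute index arrays and use multiple indexing. Here is an untested sketch for your 2x2-block example; `diag_idx`, `offd_idx`, `b1`, and `b2` are names I made up, the block positions are hard-coded to match your model, and the declarations would be merged into your existing `transformed data` and `model` blocks:

```stan
transformed data {
  // Flat, column-major indices of the diagonal and off-diagonal elements
  // within one (K-1) x (K-1) block, matching the order of to_vector().
  int diag_idx[K - 1];
  int offd_idx[(K - 1) * (K - 2)];
  {
    int pos = 1;
    for (j in 1:(K - 1)) {
      for (i in 1:(K - 1)) {
        if (i == j) {
          diag_idx[i] = (j - 1) * (K - 1) + i;
        } else {
          offd_idx[pos] = (j - 1) * (K - 1) + i;
          pos = pos + 1;
        }
      }
    }
  }
}
model {
  // Blocks from the example: rows 2:3 and 4:5, free columns 2:3 of beta.
  vector[(K - 1) * (K - 1)] b1 = to_vector(beta[2:3, 2:3]);
  vector[(K - 1) * (K - 1)] b2 = to_vector(beta[4:5, 2:3]);
  b1[diag_idx] ~ normal(c_mu, c_sigma); // block diagonals
  b1[offd_idx] ~ normal(0, 5);          // block off-diagonals
  b2[diag_idx] ~ normal(o_mu, o_sigma);
  b2[offd_idx] ~ normal(0, 4);
}
```

This replaces the per-element branching with four vectorized sampling statements, though as noted above I wouldn't expect a large speedup from the prior computation alone.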

Instead, I would expect there is a conflict between your model and the data, or some other problem with the model, that is causing the slow computation (or maybe your model is just big - I'm not sure what your D and K are).

P.S. Note that you can use backticks (`) and triple backticks for inline code and code blocks, respectively, to make your posts easier to read :-) Also, using dollar signs will let you write math in LaTeX syntax.