Hi,

I am trying to fit a categorical logit model with random slopes and intercepts. The model will be applied to a few different data sets with similar predictors X, numbers of observations N, and numbers of outcome categories K \in \{4, 7, 8\}. The data are nested, with n observations split across individual donors i located within different industries j. The number of observations per individual varies. Ideally I would account for variation across both grouping variables, but I am starting simple by focusing only on industries, since my quantity of interest is the difference in effect size of a single factor variable across industries.

So far, I’ve got a baseline model with just random intercepts, based on @rtrangucci’s model here. The model works fine on simulated data and returns results similar to brms, but is far more efficient and doesn’t degrade as I scale up the complexity. I want to extend it so that a subset s of the individual-level \beta's also has group-level random slopes \beta_J^s.

I am guessing I should use a multivariate parameterization with a Cholesky factorization. Implementing this for a `bernoulli_logit` is clear enough from the user’s guide, but I am not sure how to scale it up to K outcomes. In particular, the choice and structure of the (transformed) parameter objects is stumping me.
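For reference, this is roughly the single-equation pattern I have in mind, adapted from the multivariate-prior example in the user's guide (variable names `x_s`, `z`, `L_Omega`, `tau`, `gamma` are my own; `S` counts the intercept plus the varying slopes):

```
data {
  int<lower=0> N;
  int<lower=2> S;                        // 1 intercept + (S - 1) varying slopes
  int<lower=1> J_ind;
  int<lower=1, upper=J_ind> idx_ind[N];
  matrix[N, S] x_s;                      // first column all 1s, rest the slope subset
  int<lower=0, upper=1> y[N];
}
parameters {
  vector[S] gamma;                       // population-level intercept and slopes
  matrix[S, J_ind] z;                    // standardized industry offsets
  cholesky_factor_corr[S] L_Omega;       // Cholesky factor of offset correlations
  vector<lower=0>[S] tau;                // scales for intercept and slope offsets
}
transformed parameters {
  // non-centered parameterization: column j holds industry j's offsets
  matrix[S, J_ind] b = diag_pre_multiply(tau, L_Omega) * z;
}
model {
  gamma ~ normal(0, 5);
  to_vector(z) ~ normal(0, 1);
  L_Omega ~ lkj_corr_cholesky(2);
  tau ~ normal(0, 2.5);
  for (n in 1:N)
    y[n] ~ bernoulli_logit(x_s[n] * (gamma + col(b, idx_ind[n])));
}
```

The open question is how this one `b` block should be replicated or stacked across the K - 1 free outcome categories.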

Here’s what I have got so far:

```
data {
  int<lower=2> K;                        // number of DV categories
  int<lower=0> N;                        // number of observations
  int<lower=1> D;                        // number of individual predictors
  matrix[N, D] x;                        // individual predictors
  int<lower=1> J_ind;                    // number of industries
  int<lower=1> P;                        // number of group predictors
  int<lower=1, upper=J_ind> idx_ind[N];  // industry for individual
  matrix[J_ind, P] u;                    // group-level predictors
  int<lower=1> G;                        // number of grouping variables
  int<lower=1, upper=K> y[N];            // outcome
}
transformed data {
  row_vector[D] zeros_pop = rep_row_vector(0, D);         // zeros to append to pop-level coefs
  row_vector[P] zeros_grp = rep_row_vector(0, P);         // zeros to append to grp-level coefs
  row_vector[J_ind] zeros_ind = rep_row_vector(0, J_ind); // zeros for grp-level intercepts
  matrix[D, N] t_X = x';                                  // x transpose
  matrix[P, J_ind] t_U = u';                              // u transpose
}
parameters {
  matrix[K - 1, D] beta_raw;             // betas for K - 1 outcomes
  vector[K - 1] alpha_raw;               // intercepts for K - 1 outcomes
  matrix[K - 1, J_ind] eta_ind;          // (K - 1) * J_ind standardized intercept offsets
  vector<lower=0>[K - 1] sigma_ind;      // industry scales
  vector<lower=0>[G] sigma_inter_eqn;    // inter-group (ind) scale
}
transformed parameters {
  matrix[K, D] beta;
  matrix[K, J_ind] alpha_ind;
  vector[K] alpha;
  // fill in constrained parameters with one index/row pinned to zero
  alpha[1] = 0;
  alpha_ind[1] = zeros_ind;
  for (k in 1:(K - 1)) {
    alpha[k + 1] = alpha_raw[k];
    alpha_ind[k + 1] = sigma_inter_eqn[1] * sigma_ind[k] * eta_ind[k];
  }
  beta = append_row(zeros_pop, beta_raw);
}
model {
  // priors on locations
  to_vector(beta_raw) ~ normal(0, 1);
  to_vector(eta_ind) ~ normal(0, 1);     // std normal: scale enters via sigma_* above
  alpha_raw ~ normal(0, 5);
  // priors on scales
  sigma_ind ~ normal(0, 5);
  sigma_inter_eqn ~ normal(0, 1);
  // likelihood
  {
    matrix[K, N] mu_logit = beta * t_X;
    for (n in 1:N)
      y[n] ~ categorical_logit(
        alpha + col(mu_logit, n) + col(alpha_ind, idx_ind[n])
      );
  }
}
```

(I fixed a few issues relative to my first draft: the transpose of `u` was missing its variable name, an unassigned `theta` was declared in transformed parameters, and the standardized offsets `eta_ind` had a normal(0, 5) prior even though they are rescaled by `sigma_*`.)
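The most direct extension I can see, untested and quite possibly not the cleanest, is to give each of the K - 1 non-reference equations its own correlated intercept-plus-slope block: an array of K - 1 offset matrices `b`, each built from its own `z`, `L_Omega`, and `tau` via `diag_pre_multiply`. Here `S`, `x_s`, `z`, `b`, `L_Omega`, and `tau` are names I am introducing for the slope subset and its machinery:

```
data {
  int<lower=2> K;
  int<lower=0> N;
  int<lower=1> D;
  matrix[N, D] x;
  int<lower=1> S;                         // number of predictors with varying slopes
  matrix[N, S] x_s;                       // those S columns of x
  int<lower=1> J_ind;
  int<lower=1, upper=J_ind> idx_ind[N];
  int<lower=1, upper=K> y[N];
}
transformed data {
  row_vector[D] zeros_pop = rep_row_vector(0, D);
  matrix[D, N] t_X = x';
}
parameters {
  matrix[K - 1, D] beta_raw;
  vector[K - 1] alpha_raw;
  matrix[S + 1, J_ind] z[K - 1];          // standardized offsets per equation
  cholesky_factor_corr[S + 1] L_Omega[K - 1];
  vector<lower=0>[S + 1] tau[K - 1];      // scales: intercept + S slopes
}
transformed parameters {
  matrix[K, D] beta = append_row(zeros_pop, beta_raw);
  vector[K] alpha;
  matrix[S + 1, J_ind] b[K - 1];          // column j: industry j's (intercept, slopes)
  alpha[1] = 0;
  for (k in 1:(K - 1)) {
    alpha[k + 1] = alpha_raw[k];
    b[k] = diag_pre_multiply(tau[k], L_Omega[k]) * z[k];
  }
}
model {
  to_vector(beta_raw) ~ normal(0, 1);
  alpha_raw ~ normal(0, 5);
  for (k in 1:(K - 1)) {
    to_vector(z[k]) ~ normal(0, 1);
    L_Omega[k] ~ lkj_corr_cholesky(2);
    tau[k] ~ normal(0, 2.5);
  }
  {
    matrix[K, N] mu_logit = beta * t_X;
    for (n in 1:N) {
      vector[K] mu = alpha + col(mu_logit, n);
      for (k in 1:(K - 1)) {
        vector[S + 1] b_nk = col(b[k], idx_ind[n]);
        mu[k + 1] += b_nk[1] + x_s[n] * b_nk[2:(S + 1)];
      }
      y[n] ~ categorical_logit(mu);
    }
  }
}
```

This treats the equations as independent a priori (correlations only within an equation's intercept/slope block, not across outcome categories); a single (K - 1) * (S + 1)-dimensional Cholesky per industry would capture cross-equation correlation too, at the cost of a much larger correlation matrix. Is the per-equation version a reasonable structure, or is there a more standard object layout for this?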

I am probably over-complicating things. Any help pointing me in the right direction is greatly appreciated.