# Including group-level predictors in multidimensional IRT model

Hi all,

we have programmed a multidimensional graded IRT which we would like to extent by including group-level predictors. The model relies on the QR-based sum-to-zero constraint outline here for the expectations of the multivariate normal prior on the (person-specific) latent variable Theta. What we would like to add is having the expectations of the multivariate normal prior for Theta depend on person-level predictors, i.e. an additional group-level model, where respondents denote groups.

Below is the Stan program without the group-level model. This works fine.

``````functions {
// Sum-to-zero constraint based on QR decomposition
// source: https://discourse.mc-stan.org/t/test-soft-vs-hard-sum-to-zero-constrain-choosing-the-right-prior-for-soft-constrain/3884/31
vector Q_sum_to_zero_QR(int N) {
vector [2*N] Q_r;

for(i in 1:N) {
Q_r[i] = -sqrt((N-i)/(N-i+1.0));
Q_r[i+N] = inv_sqrt((N-i) * (N-i+1));
}
return Q_r;
}

vector sum_to_zero_QR(vector x_raw, vector Q_r) {
int N = num_elements(x_raw) + 1;
vector [N] x;
real x_aux = 0;

for(i in 1:N-1){
x[i] = x_aux + x_raw[i] * Q_r[i];
x_aux = x_aux + x_raw[i] * Q_r[i+N];
}
x[N] = x_aux;
return x;
}
}

data {
int<lower=1> I;                     // Number of items
int<lower=1> J;                     // Number of respondents
int<lower=1> L;                     // Number of items by country
int<lower=1> N;                     // Total responses
int<lower=1> D;                     // Dimensions
int<lower=1> K;                     // Number of respondent-level predictors
array[N] int<lower=1, upper=I> ii;  // Item ID
array[N] int<lower=1, upper=J> jj;  // Respondent ID
array[N] int<lower=1, upper=L> ll;  // Item by country ID
array[N] int<lower=1, upper=D> dd;  // Dimension ID
array[N] int<lower=0> y;            // Responses
}

transformed data {
array[N] int r;               // Modified response; r = 1, 2, ... m_i + 1
array[I] int m;               // Number of item-specific intercepts (difficulty)
array[L] int p;               // Number of item by country-specific intercepts (item bias/nonequivalence)
array[I] int pos_alpha;       // First position in item intercepts
array[I] int pos_delta_alpha; // First position in between step change for item intercepts
array[L] int pos_zeta;        // First position in item by country intercepts
array[L] int pos_delta_zeta;  // First position in between step change for item by country intercepts
vector[2*J] Q_r;
real mu_raw_sigma;

m = rep_array(0, I);
p = rep_array(0, L);

for (n in 1 : N) {
r[n] = y[n] + 1;
if (y[n] > m[ii[n]])
m[ii[n]] = y[n];
}

for (n in 1 : N) {
if (y[n] > p[ll[n]])
p[ll[n]] = y[n];
}

pos_alpha[1] = 1;
pos_delta_alpha[1] = 1;
for (i in 2 : (I)) {
pos_alpha[i] = pos_alpha[i - 1] + m[i - 1];
pos_delta_alpha[i] = pos_delta_alpha[i - 1] + m[i - 1] - 1;
}

pos_zeta[1] = 1;
pos_delta_zeta[1] = 1;
for (i in 2 : (L)) {
pos_zeta[i] = pos_zeta[i - 1] + p[i - 1];
pos_delta_zeta[i] = pos_delta_zeta[i - 1] + p[i - 1] - 1;
}

// QR Decomposition
Q_r = Q_sum_to_zero_QR(J); // For sum-to-zero constraint
mu_raw_sigma = inv_sqrt(1 - inv(J));

}

parameters {
vector[I] eta;                               // First intercept parameter for each item
vector[L] kappa;                             // First intercept parameter for each item by country
vector<lower=0>[sum(m) - I] delta_alpha;     // Between-step changes in item intercepts
vector<lower=0>[sum(p) - L] delta_zeta;      // Between-step changes in item by country intercepts
array[I] real<lower=0> beta;                 // Item slopes (discrimination)
cholesky_factor_corr[D] L_Omega;             // Cholesky factor of correlation matrix (off-diagonal elements)
vector<lower=0, upper=pi()/2>[D] tau_unif;   // Scale factor (diagonal elements)
matrix[D, J] Theta_raw;                      // Multivariate normal
matrix[J-1, D] Mu_raw;                       // Sum-to-zero
real<lower=0> sigma_eta;
real<lower=0> sigma_kappa;
real<lower=0> sigma_delta_alpha;
real<lower=0> sigma_delta_zeta;
real<lower=0> sigma_beta;
}

transformed parameters {
vector[sum(m)] alpha;     // Composite item intercepts
vector[sum(p)] zeta;      // Composite item by country intercepts
vector<lower=0>[D] tau;   // Reparameterized scale factor
matrix[D, D] L_Sigma;     // Cholesky factor of covariance matrix
matrix[D, J] Theta;       // Multidimensional latent variable (ability)
matrix[J, D] Mu;          // Expectations

for (i in 1 : I) {
if (m[i] > 0)
alpha[pos_alpha[i]] = eta[i];
for (k in 2 : m[i])
alpha[pos_alpha[i] + k - 1] = alpha[pos_alpha[i] + k - 2]
+ delta_alpha[pos_delta_alpha[i] + k - 2];
}

for (i in 1 : L) {
if (p[i] > 0)
zeta[pos_zeta[i]] = kappa[i];
for (k in 2 : p[i])
zeta[pos_zeta[i] + k - 1] = zeta[pos_zeta[i] + k - 2]
+ delta_zeta[pos_delta_zeta[i] + k - 2];
}

for (d in 1:D) {
Mu[,d] = sum_to_zero_QR(Mu_raw[,d], Q_r); // Sum-to-zero constraint
}

tau = 2.5 * tan(tau_unif);                 // Prior on scale factor
L_Sigma = diag_pre_multiply(tau, L_Omega); // Decomposed cholesky factor of covariance matrix (scale factor and cholesky factor of correlation matrix)
Theta = Mu' + L_Sigma * Theta_raw;   // Multivariate respondent-level model (non-centered parameterization)
}

model {
vector[D] theta_j;

// Priors
eta ~ normal(0, sigma_eta);
kappa ~ normal(0, sigma_kappa);
delta_alpha ~ normal(0, sigma_delta_alpha);
delta_zeta ~ normal(0, sigma_delta_zeta);
beta ~ lognormal(1, sigma_beta);
L_Omega ~ lkj_corr_cholesky(2);
for (d in 1:D) {
Mu_raw[,d] ~ normal(0, mu_raw_sigma);
}
to_vector(Theta_raw) ~ std_normal();
sigma_eta ~ normal(0, 1);
sigma_kappa ~ normal(0, 1);
sigma_delta_alpha ~ normal(0, 1);
sigma_delta_zeta ~ normal(0, 1);
sigma_beta ~ normal(0, 1);

// Likelihood
for (n in 1:N) {
theta_j = Theta[, jj[n]];

r[n] ~ ordered_logistic(theta_j[dd[n]] * beta[ii[n]],
segment(alpha, pos_alpha[ii[n]], m[ii[n]])
+ segment(zeta, pos_zeta[ll[n]], p[ll[n]]));
}
}
``````

Our initial thought was to include the group-level model by adding the following components to the program:

``````data {
matrix[J, K] Z;                     // Design matrix of respondent-level covariates
// ...
}

parameter {
matrix[K,D] Gamma;                        // Regression coefficients
// ...
}

model {
for (d in 1:D) {
Mu_raw[,d] ~ normal(Z * Gamma[,d], mu_raw_sigma);
}
// ...
}
``````

However, this does not work as Mu_raw has dimensions [J-1,D] and Z * Gamma has dimensions [J, D]. So we tried the following instead.

``````transformed parameters {
for (d in 1:D) {
Mu[,d] = sum_to_zero_QR(Mu_raw[,d], Q_r) + Z * Gamma[,d]; // Sum-to-zero constraint and regression
}
// ...
}
``````

But here the coefficients Gamma all go to zero. We also tried putting the sum-to-zero-constraint directly on Gamma, omitting Mu and have:

``````transformed parameters {
for (d in 1:D) {
Gamma[,d] = sum_to_zero_QR(Gamma_raw[,d], Q_r_gamma);
}
// ...
Theta = (Z * Gamma)' + L_Sigma * Theta_raw;
// ...
}
``````

But then neither Theta nor Gamma converge.

We are grateful for any suggestions on how to add the group-level predictors here.

Learned some things in the meantime:

• restricting tau, the diagonal of the covariance matrix, to 1, makes the model run smoother and faster. Together with the sum-to-zero constraint, this is also recommended for identification here.

• the sum-to-zero constraint is strictly required as the measurement model is not identified otherwise. Whether the QR-based version or a soft constraint is being used makes no difference in this case.

• Except for Mu and Gamma, all parameters look good. All regression coefficients Gamma go to zero (which they shouldn’t in our case).

• For some reason, all elements of Mu are also zero, while only their sum should be zero. So, I suspect that this is where the problem is located. I tried wider priors for Gamma and Mu without success.

Appreciate any suggestions as to why all the expectations Mu of the multivariate normal prior for Theta go to zero, rather than just their sum.