Fitting a multi-group Rasch Model (IRT)

Dear Jill,

For debugging purposes I would suggest reducing the amount of data you simulate and possibly running a single chain. This makes it much easier to track down errors in the model code.

The most obvious issue with your multi-group model specification is the indexing of beta_raw and beta. If you swap the declaration from vector[K] beta[I] to vector[I] beta[K], you get the intended array of length K containing vectors of length I. The group-specific item difficulty vector can then be accessed as beta[k].
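In Stan's array syntax the brackets after the variable name give the array dimensions, while the brackets in the type give the vector length, so the two declarations mean different things:

```stan
vector[K] beta[I];  // array of I vectors, each of length K (one vector per item)
vector[I] beta[K];  // array of K vectors, each of length I (one vector per group; intended)
```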

With this change your model runs, but there are still divergent transitions.

In your model code you also write

mu[1] ~ normal(0, 1);            // mu group 1
mu[2] ~ normal(1, 1);            // mu group 2; to avoid label switching problem

I don't think this non-exchangeable prior on mu is required here. Label switching generally refers to an issue in mixture models, which does not apply to your model because the grouping variable kk is observed. In my opinion, simply specifying mu ~ normal(0, 1) should be fine.

For IRT models, I have had good experiences with the following optimizations:

  • non-centered parameterizations for the theta parameters
  • dropping the sum-to-zero constraint on the beta parameters and instead relying on the distributional assumption beta ~ normal(0, 1). The issue with sum-to-zero constraints is that a prior distribution on the unconstrained item parameters results in a non-exchangeable prior across all item parameters. There is a thread on Discourse that goes into great detail about soft vs. hard constraints, how they relate to sampling efficiency, and how they interact with prior distributions.
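For context, a hard sum-to-zero constraint is typically implemented along these lines (a sketch for illustration only, not part of the model below):

```stan
parameters {
  vector[I - 1] beta_free;
}
transformed parameters {
  // hard sum-to-zero: the last element is determined by the first I - 1
  vector[I] beta = append_row(beta_free, -sum(beta_free));
}
model {
  // a prior on beta_free induces a non-exchangeable prior on beta:
  // the constructed last element has a different marginal distribution
  beta_free ~ normal(0, 1);
  // soft alternative: keep all I parameters free and add
  // sum(beta) ~ normal(0, 0.001);
}
```

The model below avoids both variants and simply places the exchangeable prior beta ~ normal(0, 1) on all item parameters.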

Implementing the changes described above results in this model:

data {
  int<lower=1> I;                // number of items
  int<lower=1> J;                // number of students
  int<lower=1> N;                // total number of observations (student x items)
  int<lower=1> K;                // number of groups
  int<lower=1, upper=I> ii[N];   // variable indexing the items
  int<lower=1, upper=J> jj[N];   // variable indexing the students
  int<lower=0, upper=1> y[N];    // binary variable outcome

  int<lower=1, upper=K> kk[J];   // variable indexing the group membership
}

parameters {
  vector[I] beta[K];               // item difficulties per group
  vector[J] theta_norm;                 // student ability
  vector[K] mu;                    // mean theta of every group
  real<lower=0> sigma;             // hyperprior theta
}
transformed parameters {
  vector[J] theta;
  theta = mu[kk] + theta_norm * sigma;  // non-centered: theta[j] = mu[kk[j]] + sigma * theta_norm[j]
}
model {
  for (k in 1:K) {
    beta[k] ~ normal(0, 1);  // prior on the beta
  }

  mu ~ std_normal();
  theta_norm ~ std_normal();
  sigma ~ exponential(.1);         // hyperprior theta

  for (n in 1:N) {
    y[n] ~ bernoulli_logit(theta[jj[n]] - beta[kk[jj[n]], ii[n]]);
  }
}

Please note that there is one change in the data: kk is now a vector of length J. For the simulated data I just took the appropriate subset of your long data, long[seq_len(2 * n), "group"].

library(pcIRT)
n <- 4000

sim_data <- simDRM(c(-0.4, 0.2, 0.2), persons = n)
data <- sim_data$datmat

sim_data2 <- simDRM(c(-0.8, 0.4, 0.4), persons = n)
data2 <- sim_data2$datmat

data_new <- rbind(data, data2)
data_new <- as.data.frame(data_new)
names(data_new) <- c("y1", "y2", "y3")
data_new$X <- 1:nrow(data_new)

# Placing the data in long format and defining a numeric identifier for items
library(reshape2)
long <- reshape2::melt(data_new, id.vars = "X", variable.name = "variable", value.name = "value")
long$group <- rep(1:2, each = n, times = 3)
key <- 1:length(unique(long$variable))
names(key) <- unique(long$variable)
long$item.id <- key[long$variable]

stan_data <- list(I     = max(long$item.id),
                  J     = max(long$X),
                  N     = nrow(long),
                  K     = 2,
                  ii    = long$item.id,
                  jj    = long$X,
                  kk    = long[seq_len(2 * n), "group"],
                  y     = long$value)

I did not do much validation of the code, but the beta parameters seem to be recovered just fine.

I hope this helps. Best regards,
Philipp
