Syntax question: Constrain parameters by groups

Hello,

I am trying to fit an IRT model where the item parameters vary by group. I am following this post as a guide for my syntax. My goal is to restrict the betas (difficulty parameters) so that they sum to 0 for each group. This is my code at the moment:

data {
  int<lower=1> K;                // number of items
  int<lower=1> I;                // number of students
  int<lower=1> N;                // total number of observations (student x items)
  int<lower=1> J;                // number of groups
  int<lower=1, upper=K> kk[N];   // variable indexing the items
  int<lower=1, upper=I> ii[N];   // variable indexing the students
  int<lower=0, upper=1> y[N];    // binary variable outcome
  int<lower=0, upper=J> jj[I];   // variable indexing the group membership
}

parameters {
  vector[J] beta_raw[K-1];         // difficulty of item
  vector[I] theta_norm;             // student ability
  vector[J] mu;                    // mean theta of every group
  real<lower=0> sigma;             // hyperprior theta
}
transformed parameters {
  vector[I] theta;
  vector[K] beta[J];
  
  theta = mu[jj] + theta_norm * sigma;
  for (j in 1:J) {
    beta[j] = append_row(beta_raw[j], -sum(beta_raw[j]));
  }
}

model {
  for (k in 1:J) {
    beta[k] ~ normal(0, 1);  // prior on the beta
  }
  
  mu ~ std_normal();
  theta_norm ~ std_normal();
  sigma ~ exponential(.1);         // hyperprior theta
  
  for (n in 1:N) {
    y[n] ~ bernoulli_logit(theta[ii[n]] - beta[jj[ii[n]], kk[n]]);
  }
}

} 

The code works well as long as the number of groups (K) is 1 less than the number of items (I). However, under other configurations, for example, K>I, the indices do not quite work out. I would get an error stating that my index is out of range.

Any pointers on how to make the code above more general so that it can work with more configurations of K and I?

1 Like

In the code above, it looks like you may be swapping K and J when you define beta. I’m guessing that’s the source of the error.

Also, it looks like you could vectorize the bernoulli_logit sampling statement, which could help with performance.

2 Likes