Using bernoulli_logit_glm_lpmf for IRT model

So I am trying to make an IRT model where the likelihood component looks something like this:

data {
  //Ideal Point Component
  int<lower=0> I;                       // number of items
  int<lower=0> J;                       // number of individuals
  int<lower=1> D;                       // number of dimensions
  int<lower=0> N;                       // number of observations
  int<lower=1,upper=I> ii[N];           // item for observation n
  int<lower=1,upper=J> jj[N];           // individual (group) for observation n
  int<lower=0,upper=1> y[N];            // position for observation n
}
parameters {
  //Ideal Point Component
  matrix[J,D] theta;          // group parameter matrix (group ideal point)
  vector[I] beta;             // vector of difficulty  (intercept)
  matrix[D,I] alpha;          // discrimination matrix  (slope)
}
model {
  // Priors go here... just not adding them in this example
  for (n in 1:N) {
    target += bernoulli_logit_lpmf(y[n] | theta[jj[n]] * col(alpha, ii[n]) - beta[ii[n]]);
  }
} 

But from my understanding, vectorizing the above model with bernoulli_logit_glm_lpmf would make it more computationally efficient. I am struggling a little to figure out how the notation for the IRT model maps onto the bernoulli_logit_glm_lpmf signature (int[] y | matrix x, vector alpha, vector beta).
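For reference, this is how I understand that signature being used in a plain logistic regression (a minimal sketch with generic names x, alpha, and beta, not the IRT parameters above):

data {
  int<lower=0> N;                  // number of observations
  int<lower=1> K;                  // number of covariates
  matrix[N, K] x;                  // covariate matrix
  int<lower=0, upper=1> y[N];      // binary outcomes
}
parameters {
  real alpha;                      // intercept
  vector[K] beta;                  // slopes
}
model {
  // single vectorized call; the linear predictor is alpha + x * beta
  target += bernoulli_logit_glm_lpmf(y | x, alpha, beta);
}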

I would appreciate any suggestions! Thanks in advance, and please let me know if more information is needed.

Leo

In this GitHub repo IRTStan I have functions and Stan code to run unidimensional and multidimensional IRT models (2PL, 3PL, and GRM).
This code is written to work with wide data, and the multidimensional models work when each item loads onto only one factor at a time.
You can also see in this paper my simulation evaluating these models across sample sizes and numbers of factors.

My code uses the expression

for (i in 1:N) {      // persons (rows of the wide data)
  for (j in 1:n) {    // items (columns)
    x[i, j] ~ bernoulli_logit(a[j] * (theta[i] - b[j]));
  }
}

Which you could translate to use the lpmf, like

for (i in 1:N) {
  for (j in 1:n) {
    target += bernoulli_logit_lpmf(x[i, j] | a[j] * (theta[i] - b[j]));
  }
}

hope this helps


Hi Mauricio,

Thanks, I will definitely take a look at the GitHub repo and the paper!
The model as written (plus priors) already works fine on simulated data, but given that I will be running this on relatively large data, I am trying to get some efficiency gains. That is why I was wondering if it is possible to vectorize it using bernoulli_logit_glm_lpmf instead of bernoulli_logit_lpmf in a loop. Maybe I misunderstood something, but the solution you gave should not be more efficient than the current formulation.

Thank you again!
PS: I can share the full model privately if it will help!

Easy step is to replace

target += bernoulli_logit_lpmf(y[n] | theta[jj[n]] * col(alpha, ii[n]) - beta[ii[n]]);

with

target += bernoulli_logit_glm_lpmf(y[n] | theta[jj[n]], -beta[ii[n]], col(alpha, ii[n]));

Getting rid of the for loop seems more challenging, as the product theta[jj[n]] * col(alpha, ii[n]) is a row vector times a column vector, and both of these vectors depend on n, while the glm function assumes that one of those parts does not depend on n. As there is some overhead in calling _lpmf, it might also be faster to compute theta[jj[n]] * col(alpha, ii[n]) - beta[ii[n]] for all n in a for loop, collect the results into a vector eta, and then call bernoulli_logit_lpmf(y | eta) only once.
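For concreteness, the eta version would look something like this (a sketch reusing the data and parameters blocks from the original model, priors omitted):

model {
  vector[N] eta;
  // build the linear predictor once per observation...
  for (n in 1:N) {
    eta[n] = theta[jj[n]] * col(alpha, ii[n]) - beta[ii[n]];
  }
  // ...then make one vectorized lpmf call instead of N calls
  target += bernoulli_logit_lpmf(y | eta);
}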

It might be possible to switch from the loop over (n in 1:N) to a loop over (i in 1:I) or (j in 1:J).
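For example, a per-item loop could look like this (an untested sketch; it assumes the observations are sorted by item, and that start[i] and end[i] are hypothetical index arrays, precomputed in transformed data or passed as data, delimiting item i's block of observations):

model {
  for (i in 1:I) {
    int n_i = end[i] - start[i] + 1;
    // rows of theta for the individuals who responded to item i
    matrix[n_i, D] theta_i = theta[jj[start[i]:end[i]]];
    // one glm call per item: covariates theta_i, intercept -beta[i],
    // slopes given by item i's discrimination vector
    target += bernoulli_logit_glm_lpmf(y[start[i]:end[i]] |
                                       theta_i, -beta[i], col(alpha, i));
  }
}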


Hi @avehtari ,

Thanks for the recommendations!
Just some updates on your points.
So I tried the eta suggestion, and there seems to be little difference in how long the model runs (if anything, it is a bit slower).
Using target += bernoulli_logit_glm_lpmf(y[n] | theta[jj[n]], -beta[ii[n]], col(alpha, ii[n])); didn't work, as theta[jj[n]] is a row vector and the glm function requires a matrix.
I am still trying to see if I can get the loop to be (i in 1:I) or (j in 1:J), but no luck yet.

Thanks again!

Thanks for reporting this. It helps me update my prior on how much repeated calls to _lpmf can affect performance.

You can try to_matrix(theta[jj[n]]), but it is possible that theta[jj[n]] and col(alpha, ii[n]) are too small to get a big benefit from the glm function, so that the execution time is dominated by the for loop and the indexing.
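That is, something like this (untested; wrapping y[n] in a one-element array so the argument types match the signature):

// to_matrix promotes the row vector theta[jj[n]] to a 1 x D matrix
target += bernoulli_logit_glm_lpmf({ y[n] } | to_matrix(theta[jj[n]]),
                                   -beta[ii[n]], col(alpha, ii[n]));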

You could also try _lupmf instead of _lpmf, but that probably has only a small effect, as the normalization term is quite fast to compute.

@Bob_Carpenter might have ideas for reducing the indexing (or for how much the current indexing hurts performance).