Syntax for IRT model for detecting cheating

Hi,

I am trying to fit a version of the model discussed in the following paper.

Below is a simplified version of what I am trying to achieve, showing where I am stuck. I would appreciate any help.

This is a kind of IRT model. Each person has two ability parameters, and which one applies to a given response is determined by an indicator variable formed from the product of T and F. The T and F vectors are provided by the user.

For instance, if T[i] and F[j] both equal one, then y[i,j] follows bernoulli_logit(p2), which is based on theta_c. For every other combination ({0,1}, {1,0}, {0,0}), y[i,j] follows bernoulli_logit(p1), which is based on theta_t.
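Written out, the rule described above would look roughly as follows. The subscript notation is an assumption made here for readability and is not taken from the paper; a_j and b_j stand for the item parameters a[j] and b[j] used in the code.

$$
y_{ij} \sim
\begin{cases}
\mathrm{Bernoulli}\big(\operatorname{logit}^{-1}(a_j(\theta_{c,i} - b_j))\big) & \text{if } T_i F_j = 1,\\
\mathrm{Bernoulli}\big(\operatorname{logit}^{-1}(a_j(\theta_{t,i} - b_j))\big) & \text{otherwise.}
\end{cases}
$$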

I believe Stan does not accept a sum of two distributions on the right-hand side of the sampling statement for y[i,j]. The parser says it is expecting “;” instead of “+”.

I am fairly new to the Stan language. Any suggestions on the best way to specify this model?

Thank you.

model {
  real p1;
  real p2;
  real ind;
  vector[I] T;
  vector[J] F;
 
  T[1] = 0;
  T[2] = 1;
  T[3] = 0;
   ...
   ...
   ...
   T[I]  = 1;

  F[1] = 0;
  F[2] = 1;
  F[3] = 0;
   ...
   ...
   ...
   F[J]  = 1;

  for(i in 1:I) {
   for(j in 1:J) {
       p1 = a[j]*(theta_t[i]-b[j]);
       p2 = a[j]*(theta_c[i]-b[j]);
       ind = T[i]*F[j];
       y[i,j] ~  bernoulli_logit(p1)*(1-ind)  +  bernoulli_logit(p2)*ind;   // Stan rejects this line: expecting ";" instead of "+"
   }
  }
}
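The parser error arises because the right-hand side of `~` has to be a single distribution call; distributions cannot be scaled and added like numbers. One possible way to express the switch is to branch on the indicator and issue one sampling statement per case. The following is only a sketch under assumptions that are not in the post: T and F are passed in as integer data arrays instead of being filled in inside the model block, and the priors are placeholders.

```stan
data {
  int<lower=0> I;                  // number of persons
  int<lower=0> J;                  // number of items
  // Older array syntax, matching this thread; recent Stan releases
  // write this as: array[I, J] int<lower=0, upper=1> y;
  int<lower=0, upper=1> y[I, J];   // item responses
  int<lower=0, upper=1> T[I];      // assumed: person-level indicator
  int<lower=0, upper=1> F[J];      // assumed: item-level indicator
}
parameters {
  vector[J] b;
  vector<lower=0>[J] a;
  vector[I] theta_t;
  vector[I] theta_c;
}
model {
  // Placeholder priors (assumptions, not taken from the post)
  b ~ normal(0, 10);
  a ~ lognormal(0, 0.5);
  theta_t ~ normal(0, 1);
  theta_c ~ normal(0, 1);

  for (i in 1:I) {
    for (j in 1:J) {
      if (T[i] * F[j] == 1)
        y[i, j] ~ bernoulli_logit(a[j] * (theta_c[i] - b[j]));
      else
        y[i, j] ~ bernoulli_logit(a[j] * (theta_t[i] - b[j]));
    }
  }
}
```

Because T and F are data, the branch is resolved for each response before any sampling happens, so exactly one of the two likelihood terms is added per y[i, j].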

OK, maybe I just needed to post it here for the light to come on. I will reply to myself, though I don’t know yet whether this will work. Let me know what you think, or whether there is a more efficient way of doing this.

model {
  vector[I] T;  
  vector[J] F;
 
  T[1] = 0;
  T[2] = 1;
  T[3] = 0;
   ...
   ...
   ...
   T[I]  = 1;

  F[1] = 0;
  F[2] = 1;
  F[3] = 0;
   ...
   ...
   ...
   F[J]  = 1;

  for(i in 1:I) {
   for(j in 1:J) {
      real p;
      real ind;
      real theta;

      ind = T[i]*F[j];
      theta = theta_t[i]*(1-ind)+theta_c[i]*ind;
      p = a[j]*(theta-b[j]);
      y[i,j] ~  bernoulli_logit(p);
   }
  }
}

That is the right idea, but you can and should declare and define on the same line when possible:

real ind = T[i]*F[j];
real theta = theta_t[i]*(1-ind)+theta_c[i]*ind;
real p = a[j]*(theta-b[j]);

Thank you.

Here is a more compact version that worked for me this morning (I am now also passing the indicator matrix in as data). This is not the model originally proposed, because the paper treats the elements of the T matrix as unknown (we don’t know who the cheaters are or which items are compromised). I wrote this for a specific dataset I am working on, where I actually know the elements of the T matrix. The next step would be to write the model so that the elements of T are estimated in Stan as well. I will post it here for reference if I can make that one work.

dgirt <- '

  data {
    int<lower=0> I;                  // number of individuals
    int<lower=0> J;                  // number of items
    int<lower=0,upper=1> Y[I,J];     // matrix of item responses
    int<lower=0,upper=1> T[I,J];     // matrix of indicator variables,
                                     // equals 1 if a cheater responds to a compromised item,
                                     // 0 otherwise
  }

  parameters {
    vector[J] b;                     // item difficulty parameter
    vector<lower=0>[J] a;            // item discrimination parameter
    vector[I] theta_t;               // true latent trait
    vector[I] theta_c;               // cheater latent trait
  }

  model {
    b       ~ normal(0,10);
    a       ~ lognormal(0,.5);
    theta_t ~ normal(0,1);
    theta_c ~ normal(0,1);

    for (i in 1:I) {
      for (j in 1:J) {
        real ind = T[i,j];
        real theta = theta_t[i]*(1-ind)+theta_c[i]*ind;
        real p = a[j]*(theta-b[j]);
        Y[i,j] ~ bernoulli_logit(p);
      }
    }
  }
'
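On the question of efficiency raised earlier in the thread, one option is to vectorize over items so that each person contributes a single sampling statement instead of J of them. The following is a sketch, not a tested replacement: it assumes T is declared as a real-valued matrix (a change from the integer array above) so that a row of it can enter vector arithmetic directly, and otherwise mirrors the program above.

```stan
data {
  int<lower=0> I;                       // number of individuals
  int<lower=0> J;                       // number of items
  // Older array syntax, matching this thread; recent Stan releases
  // write this as: array[I, J] int<lower=0, upper=1> Y;
  int<lower=0, upper=1> Y[I, J];        // matrix of item responses
  matrix<lower=0, upper=1>[I, J] T;     // assumed: 0/1 indicators stored as reals
}
parameters {
  vector[J] b;                          // item difficulty
  vector<lower=0>[J] a;                 // item discrimination
  vector[I] theta_t;                    // true latent trait
  vector[I] theta_c;                    // cheater latent trait
}
model {
  b ~ normal(0, 10);
  a ~ lognormal(0, 0.5);
  theta_t ~ normal(0, 1);
  theta_c ~ normal(0, 1);

  for (i in 1:I) {
    // Ability person i brings to each item: theta_t[i] where T[i, j] is 0,
    // theta_c[i] where T[i, j] is 1
    vector[J] theta = theta_t[i] + (theta_c[i] - theta_t[i]) * T[i]';
    Y[i] ~ bernoulli_logit(a .* (theta - b));
  }
}
```

The expression for theta is the same blend as theta_t[i]*(1-ind)+theta_c[i]*ind, rearranged so it applies to a whole row of T at once.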