Hi all,
I would like some help diagnosing a two-parameter item response theory (IRT) model that does not converge. The model should be identical to the example 2PL model in the Stan reference manual (section 8.11, pp. 140-141), except that my observations are binomially distributed: each one is the number of people in country c and year t who answered item j “correctly”.
Eventually, I would like to estimate random walks over time for each country. For diagnostic purposes, I pared things down to estimating a single global latent score for each year. I have about 50 items and about 20 years, and I observe 2000 country/year/item cells. All in all, my data set contains responses from 2.2 million people. There is much more data in some years than in others. Here is the Stan file:
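data {
  int<lower=0> TT; //number of years
  int<lower=0> J; //number of items
  int<lower=0> CTJ; //number of observed country-year-questions
  int<lower=1, upper=TT> years[CTJ]; //year index of each observation (used below; declaration assumed)
  int<lower=1, upper=J> questions[CTJ]; //item index of each observation (used below; declaration assumed)
  int<lower=0> y[CTJ]; //positive responses per country-year-question
  int<lower=0> n[CTJ]; //responses per country-year-question
}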
parameters {
  real mu_beta;
  real alpha[TT]; //abilities
  real beta[J]; //difficulty
  real<lower=0> gamma[J]; //discrimination; should be positive
}
transformed parameters {
  real<lower=0, upper=1> p[CTJ];
  for (ctj in 1:CTJ) {
    p[ctj] = inv_logit(gamma[questions[ctj]] * (alpha[years[ctj]] - beta[questions[ctj]]));
  }
}
model {
  //years are independent
  for (t in 1:TT) {
    alpha[t] ~ normal(0, 1);
  }
  //weakly informative non-hierarchical priors
  //for question difficulty and discrimination
  beta ~ normal(mu_beta, 5);
  gamma ~ lognormal(0, 2);
  mu_beta ~ cauchy(0, 5);
  //likelihood
  y ~ binomial(n, p);
}
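Written out as equations, the model above should amount to the following. The index i over the CTJ observations, and the shorthand t[i] and j[i] for years[ctj] and questions[ctj], are my own notation, not the manual's.

```latex
\begin{align*}
y_i &\sim \mathrm{Binomial}(n_i,\; p_i), \qquad i = 1, \dots, CTJ \\
p_i &= \mathrm{logit}^{-1}\!\bigl(\gamma_{j[i]}\,(\alpha_{t[i]} - \beta_{j[i]})\bigr) \\
\alpha_t &\sim \mathrm{Normal}(0,\; 1), \qquad t = 1, \dots, TT \\
\beta_j &\sim \mathrm{Normal}(\mu_\beta,\; 5), \qquad j = 1, \dots, J \\
\gamma_j &\sim \mathrm{LogNormal}(0,\; 2) \\
\mu_\beta &\sim \mathrm{Cauchy}(0,\; 5)
\end{align*}
```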
This model has a hard time converging. With 4 chains of 1000 iterations each, of which 100 are warm-up, I get plenty of divergent transitions, and the median effective sample size for the alpha parameters is 4.2. In the trace plots, 2 of the 4 chains have an acceptance rate of almost zero. This does not bode well for my plans to estimate a more complicated model on these data.
Happy to provide the data if necessary. Am I doing something wrong?
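Until I can share the real data, here is a minimal Python sketch of the layout I pass to Stan. Every value in it is simulated, and the sizes only roughly match mine. The "true" parameter draws, the per-cell sample sizes, and the name `stan_data` are made up purely for illustration.

```python
import math
import random

random.seed(1)

# Rough sizes from the description above; all values below are simulated.
TT, J, CTJ = 20, 50, 2000

# Hypothetical "true" parameters, used only to generate fake counts.
alpha = [random.gauss(0, 1) for _ in range(TT)]            # year abilities
beta = [random.gauss(0, 1) for _ in range(J)]              # item difficulties
gamma = [random.lognormvariate(0, 0.5) for _ in range(J)]  # item discriminations


def inv_logit(x):
    """Inverse logit, matching Stan's inv_logit."""
    return 1.0 / (1.0 + math.exp(-x))


years, questions, n, y = [], [], [], []
for _ in range(CTJ):
    t = random.randint(1, TT)        # 1-based index, as Stan expects
    j = random.randint(1, J)
    n_obs = random.randint(50, 500)  # made-up number of respondents in this cell
    p = inv_logit(gamma[j - 1] * (alpha[t - 1] - beta[j - 1]))
    years.append(t)
    questions.append(j)
    n.append(n_obs)
    # Binomial draw as a sum of Bernoulli trials (standard library only).
    y.append(sum(random.random() < p for _ in range(n_obs)))

# One entry per country-year-question cell, keyed by the names the model uses.
stan_data = {"TT": TT, "J": J, "CTJ": CTJ,
             "years": years, "questions": questions, "y": y, "n": n}
```

Each of `years`, `questions`, `y`, and `n` has length CTJ, so one position across the four lists describes one country-year-question cell.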
Thanks,
Clara