I’ve got the following 2PL IRT / latent space model which is taken from Bafumi, Joseph, Andrew Gelman, David K. Park, and Noah Kaplan. “Practical Issues in Implementing and Understanding Bayesian Ideal Point Estimation.” Political Analysis 13, no. 2 (April 1, 2005): 171–87.
The model is giving reasonable results but the number of effective samples is really low even with very high numbers of iterations. Is this normal? I was expecting a few more than 3 n_eff with 100,000 samples. Is there anything that can be done?
There are J = 354 respondents, K = 6 questions, with respondents only answering 1.7 questions on average.
Are there better priors? Would putting Gamma(2, 0)s on the scales help?
Here’s the model:
data {
int<lower=1> K; // # questions
int<lower=1> J; // # respondents
int<lower=1> N; // data items
int<lower=1, upper=K> kk[N]; // question for n
int<lower=1, upper=J> jj[N]; // respondent for n
int<lower=0, upper=1> y[N]; // agree/disagree n
vector[J] books_per_year_log10; // log10 of number of books read per year
}
parameters {
vector[J] alpha; // ability for respondent j
vector[K] beta; // difficulty for question k
vector[K] gamma; // discrimitiveness for question k
real delta_0; // prior intercept
real<lower=0> delta_1; // coef for books_per_year_log10, constrained to be positive to resolve reflection invariance
real<lower=0, upper=20> sigma_alpha;
real<lower=0, upper=20> sigma_beta;
real<lower=0, upper=20> sigma_gamma;
}
model {
vector[N] log_odds;
delta_0 ~ normal(0, 1);
delta_1 ~ normal(0, 1);
alpha ~ normal(delta_0 + delta_1 * books_per_year_log10, sigma_alpha);
beta ~ normal(0, sigma_beta);
gamma ~ normal(0, sigma_gamma);
for (n in 1:N) // likelihood
log_odds[n] = gamma[kk[n]] * (alpha[jj[n]] - beta[kk[n]]);
y ~ bernoulli_logit(log_odds);
}
generated quantities {
// adjust parameters to common scale as recommended in Bafumi, Gelman, Park, and Kaplan
vector[J] alpha_adj;
vector[K] beta_adj;
vector[K] gamma_adj;
alpha_adj = (alpha - mean(alpha)) / sd(alpha);
beta_adj = (beta - mean(alpha)) / sd(alpha);
gamma_adj = gamma * sd(alpha);
}
and a bit of the summary output
Inference for Stan model: anon_model_43a5c2b517d321af97808f8cab116c06.
4 chains, each with iter=100000; warmup=50000; thin=1;
post-warmup draws per chain=50000, total post-warmup draws=200000.
mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
alpha[0] 0.31 1.11 2.93 -7.55 -0.51 0.65 1.85 4.75 7 1.17
alpha[1] -0.09 2.79 4.83 -13.87 -0.67 1.01 2.48 5.99 3 1.51
alpha[2] 0.11 4.27 7.4 -20.91 -0.47 1.89 4.01 9.53 3 1.6
alpha[3] 0.18 0.73 2.64 -6.64 -0.67 0.38 1.49 4.46 13 1.08
alpha[4] 1.02 2.26 5.06 -5.3 -1.35 -0.1 1.63 16.02 5 1.23
alpha[5] -0.49 4.29 7.43 -22.05 -0.69 1.47 3.33 8.29 3 1.58
alpha[6] 1.1 2.97 5.14 -4.57 -1.5 -0.35 1.41 16.25 3 1.49
alpha[7] 0.7 1.32 3.23 -3.45 -0.93 0.06 1.34 10.0 6 1.2
alpha[8] 0.46 1.25 3.55 -9.11 -0.43 0.82 2.16 5.98 8 1.14
alpha[9] 1.15 4.04 7.0 -6.76 -2.29 -0.82 1.26 21.84 3 1.53
alpha[10] 0.79 1.44 6.26 -15.83 -0.63 1.16 3.31 11.76 19 1.05
alpha[11] -0.65 4.39 7.6 -22.62 -0.81 1.46 3.31 8.24 3 1.66
alpha[12] 3.08 5.44 9.42 -5.27 -1.79 -0.53 2.14 30.8 3 1.82
alpha[13] 1.24 4.04 7.0 -6.59 -2.33 -0.9 1.37 21.56 3 1.68
alpha[14] -0.58 4.36 7.55 -22.31 -0.74 1.52 3.37 8.24 3 1.65
alpha[15] 0.43 1.28 6.15 -15.85 -0.95 0.8 2.82 11.39 23 1.04
alpha[16] -0.45 4.59 7.96 -23.25 -0.68 1.74 3.75 8.97 3 1.67
alpha[17] -0.45 4.27 7.4 -21.74 -0.66 1.5 3.38 8.28 3 1.59
alpha[18] 1.55 3.38 6.76 -5.56 -1.7 -0.41 1.54 21.85 4 1.45
alpha[19] 0.75 0.04 2.46 -3.87 -0.4 0.58 1.75 6.24 3096 1.0