Hi Stan users,
I’m trying to recover the number of latent groups using a model comparison approach. The data sets are generated from a mixture IRT model with 2 latent groups.
To select the best-fitting model, I fitted three mixture models: the 1-group model, the 2-group model, and the 3-group model. The results showed that the values of LOO and WAIC increased with the number of groups.
This means the best-fitting model is the 1-group model, although it took a large number of iterations to converge (warm-up = 10000, iterations = 12000, chains = 12), while the other two models converged quickly with only 3000 warm-up iterations and 3 chains of 5000 iterations each.
It does not make sense that LOO and WAIC select the 1-group model as the best-fitting model when the data were generated with 2 groups, so these fit indices failed to recover the number of latent groups. In previous research, the recovery was 100% correct. Now I’m getting 100% incorrect recovery across the 10 replications.
Here is the code that computes the log-likelihood for each person and each item:
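generated quantities {
  real p;
  vector[J] log_lik[N];
  for (i in 1:N) {
    for (j in 1:J) {
      // probability of a correct response of person i on item j
      p = inv_logit(alpha[j] * (theta[i] - beta[j]));
      // pointwise log-likelihood of the observed response
      log_lik[i, j] = bernoulli_lpmf(y[i, j] | p);
    }
  }
}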
where i = 1,…, N indicates persons
j = 1,… , J indicates items
beta: item difficulty parameter
alpha: item discrimination parameter
theta: person ability parameter
y[i,j]: response of person i on item j
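For comparison, here is a minimal sketch of how I understand a per-person log-likelihood that marginalizes over the latent groups could be written. It is not the code I ran. The names G (number of groups), pi_mix (mixing proportions) and beta_g (group-specific item difficulties) are hypothetical, and letting only the difficulties differ between groups is an assumption made just for this illustration.

```stan
generated quantities {
  // Sketch only. Assumed to be declared elsewhere (hypothetical names):
  //   data:       int G          number of latent groups
  //   parameters: simplex[G] pi_mix   mixing proportions
  //               beta_g         group-specific difficulties, indexed [g, j]
  vector[N] log_lik_person;
  for (i in 1:N) {
    // start each group's term with its log mixing proportion
    vector[G] lp = log(pi_mix);
    for (g in 1:G) {
      for (j in 1:J) {
        // add the log-likelihood of item j under group g
        lp[g] += bernoulli_logit_lpmf(y[i, j] | alpha[j] * (theta[i] - beta_g[g, j]));
      }
    }
    // marginalize over group membership on the log scale
    log_lik_person[i] = log_sum_exp(lp);
  }
}
```

Unlike my code above, this version gives one log-likelihood value per person rather than one per person-item pair, and group membership is summed out instead of being ignored.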
I suspect something is wrong in my code. Can anyone help me with this issue?
Thank you