Incorrect recovery using LOO and WAIC fit indices

Hi Stan users,

I’m trying to recover the number of latent groups using a model comparison approach. The data set was generated from a mixture IRT model with 2 latent groups.
In order to decide on the best-fitting model, three mixture models were fitted: the
1-group model, the 2-group model, and the 3-group model. The results showed that the values of LOO and WAIC increased with the number of groups.

This means the best-fitting model is the 1-group model, although it took a large number of iterations to converge (warm-up = 10000, iterations = 12000, chains = 12), while the other two solutions converged quickly with only 3000 warm-up iterations (3 chains, 5000 iterations each).

It does not make sense that LOO and WAIC selected the one-group model as the best-fitting model; these fit indices failed to recover the number of latent groups. In previous research, the recovery was 100% correct. Now I’m getting 100% incorrect recovery across the 10 replications.

Here is the code that computes the log-likelihood for each person and each item:

generated quantities {
  real p ;
  vector[J] log_lik[N] ;

  for (i in 1:N) {
    for (j in 1:J) {
      p = inv_logit( alpha[j] * (theta[i] - beta[j]) ) ;
      log_lik[i,j] = bernoulli_lpmf(y[i,j] | p) ;
    }
  }
}

where:
i = 1, …, N indexes persons
j = 1, …, J indexes items

beta: item difficulty parameter
alpha: item discrimination parameter
theta: person ability parameter
y[i,j]: response of person i to item j

It seems I probably have something wrong in the code. Can anyone help me with this issue?

Thank you

Hi @Rehab,

Can you be more specific about which values you mean? In the case of elpd_loo (expected log predictive density / log score) given by the loo package, a larger value is better.

To check whether log_lik is computed correctly, we also need to see the model block. Usually the whole model is needed to understand the model block easily (e.g., to see which variables are data and which are parameters).

Hi avehtari,

Thanks for your reply.

The code below is for the mixture 2PL IRT model; setting K = 2 fits the 2-group solution.

y = resp
K = 2
N = nrow(y)
J = ncol(y)
dir_alpha = c(1 , 1)

modelString = "data {
  int<lower=1> K ;
  int<lower=1> N ;
  int<lower=1> J ;
  int<lower=0,upper=1> y[N,J] ;
  vector<lower=0>[K] dir_alpha ;
}

parameters {
  simplex[K] pi ;
  vector[N] theta ;
  ordered[K] mu ;
  ordered[K] beta[J] ;
  vector<lower=0>[K] alpha[J] ;
}

model {
  real lps[K] ;
  real p[K] ;
  real lpth[K] ;

  for (k in 1:K) {
    mu[k] ~ normal(0, 1) ;
  }
  pi ~ dirichlet(dir_alpha) ;

  for (j in 1:J) {
    for (k in 1:K) {
      beta[j,k] ~ normal(0, 1) ;
      alpha[j,k] ~ normal(0, 1) ;
    }
  }

  // marginalize over the K mixture components for each person's ability
  for (i in 1:N) {
    for (k in 1:K) {
      lpth[k] = log(pi[k]) + normal_lpdf(theta[i] | mu[k], 1) ;
    }
    target += log_sum_exp(lpth) ;
  }

  // marginalize over the K mixture components for each response
  for (i in 1:N) {
    for (j in 1:J) {
      for (k in 1:K) {
        p[k] = inv_logit( alpha[j,k] * (theta[i] - beta[j,k]) ) ;
        lps[k] = log(pi[k]) + bernoulli_lpmf(y[i,j] | p[k]) ;
      }
      target += log_sum_exp(lps) ;
    }
  }
}

generated quantities {
  real p[K] ;
  vector[J] log_lik[N] ;

  for (i in 1:N) {
    for (j in 1:J) {
      for (k in 1:K) {
        p[k] = inv_logit( alpha[j,k] * (theta[i] - beta[j,k]) ) ;
        log_lik[i,j] = log( exp(log(pi[k]) + bernoulli_lpmf(y[i,j] | p[k])) ) ;
      }
    }
  }
}
"

stanDso <- stan_model( model_code=modelString )

Mix_2C = sampling( object=stanDso, data=list(K=K, N=N, J=J, y=y, dir_alpha=dir_alpha), warmup = 3000, iter = 5000, chains = 3 )

The results are as follows:

  1. One-group solution, K = 1 (i.e., fitting the conventional 2PL IRT model)

loo1
Computed from 10000 by 30000 log-likelihood matrix

         Estimate     SE
elpd_loo -15151.2   90.5
p_loo       917.0   11.8
looic     30302.4  181.1

Monte Carlo SE of elpd_loo is 0.4.

Pareto k diagnostic values:
                        Count   Pct.  Min. n_eff
(-Inf, 0.5]  (good)     29991  100.0%     1875
 (0.5, 0.7]  (ok)           9    0.0%     1155
   (0.7, 1]  (bad)          0    0.0%
   (1, Inf)  (very bad)     0    0.0%

All Pareto k estimates are ok (k < 0.7).
See help('pareto-k-diagnostic') for details.

  2. Two-group solution, K = 2 (this is the correct solution)

loo2
Computed from 6000 by 30000 log-likelihood matrix

         Estimate     SE
elpd_loo -43311.6  235.7
p_loo      7969.5  141.4
looic     86623.3  471.4

Monte Carlo SE of elpd_loo is NA.

Pareto k diagnostic values:
                        Count   Pct.  Min. n_eff
(-Inf, 0.5]  (good)     24692   82.3%      296
 (0.5, 0.7]  (ok)        2538    8.5%       62
   (0.7, 1]  (bad)       1831    6.1%       11
   (1, Inf)  (very bad)   939    3.1%        2

See help('pareto-k-diagnostic') for details.

  3. Three-group solution, K = 3

loo3
Computed from 6000 by 30000 log-likelihood matrix

         Estimate     SE
elpd_loo -50537.5  301.5
p_loo     11095.7  192.6
looic    101074.9  603.1

Monte Carlo SE of elpd_loo is NA.

Pareto k diagnostic values:
                        Count   Pct.  Min. n_eff
(-Inf, 0.5]  (good)     23163   77.2%      465
 (0.5, 0.7]  (ok)        2615    8.7%      102
   (0.7, 1]  (bad)       2211    7.4%       13
   (1, Inf)  (very bad)  2011    6.7%        1

See help('pareto-k-diagnostic') for details.

For the WAIC results:

  1. One-group solution:

waic1
Computed from 10000 by 30000 log-likelihood matrix

          Estimate     SE
elpd_waic -15142.2   90.4
p_waic       908.0   11.6
waic       30284.4  180.8

  2. Two-group solution:

waic2
Computed from 6000 by 30000 log-likelihood matrix

          Estimate     SE
elpd_waic -41294.6  199.2
p_waic      5952.4  100.6
waic       82589.2  398.4

  3. Three-group solution:

waic3
Computed from 6000 by 30000 log-likelihood matrix

          Estimate     SE
elpd_waic -46950.6  240.8
p_waic      7508.8  128.6
waic       93901.1  481.7

Sorry, I’m not familiar with 2PL IRT, so I have to ask more questions. What are the observations? I can see y has dimensions N x J, but what are N and J? Do you want to leave out y[i,j], or y[i,], or y[,j]? Could you also indent the code for easier reading? It’s now difficult to see which lines are inside each loop.

I think in generated quantities you should declare

matrix[N,J] log_lik1;
vector[N*J] log_lik;

then inside the loop have

log_lik1[i,j] = ...something...;

and after the loop

log_lik = to_vector(log_lik1);

which is a good format for the loo package.
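Note that to_vector flattens the matrix column-major, so each element of the flattened vector maps back to a specific (examinee, item) pair:

// column-major flattening: log_lik[(j - 1) * N + i] corresponds to y[i, j]
log_lik = to_vector(log_lik1);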
In addition it seems that now inside your for loop

log_lik[i,j] = log( exp(log(pi[k]) + bernoulli_lpmf(y[i,j] | p[k])) ) ;

is reassigning log_lik[i,j] K times, and the final value is computed using p[K], while in the model block you are collecting these terms into lps[k] and log_sum_exp'ing them.

y[i,j] means the response of examinee i (i = 1, …, N) to item j (j = 1, …, J).
I think I should leave out a pair of (examinee, item) each time, but I’m not sure.

As far as I understand, I’m collecting them in the model block in order to compute the log of the target posterior distribution, while in the generated quantities block we need the log-likelihood for each pair of (examinee, item). Or should I also collect them in the generated quantities block?

K means that there are K subpopulations within the overall population. That is, my population consists of people who have K distinct features; for example, low-ability and high-ability groups.

I’ll modify the code and add this part.

Thank you for taking the time to look at my code.

I think you have an error in the generated quantities block: you are ignoring terms 1:(K-1) and assigning only the K’th term to log_lik[i,j].

You can use the same code to compute log_lik[i,j] in the model and generated quantities blocks. In the model block you then need to sum all log_lik values, while in generated quantities you either (a sketch follows this list):

  • make it a vector if you want to leave out one observation at a time
  • sum over i if you want to leave out one item
  • sum over j if you want to leave out one examinee
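For example, a minimal sketch of such a generated quantities block, reusing the parameter names from your model and computing log_lik1[i,j] with the same mixture terms as in your model block (the *_item and *_person names are just for illustration):

generated quantities {
  matrix[N, J] log_lik1;
  vector[N * J] log_lik;     // leave out one response y[i,j] at a time
  vector[J] log_lik_item;    // leave out one item (summed over examinees)
  vector[N] log_lik_person;  // leave out one examinee (summed over items)

  for (i in 1:N) {
    for (j in 1:J) {
      vector[K] lps;
      for (k in 1:K) {
        real p = inv_logit(alpha[j,k] * (theta[i] - beta[j,k]));
        lps[k] = log(pi[k]) + bernoulli_lpmf(y[i,j] | p);
      }
      log_lik1[i,j] = log_sum_exp(lps);  // mix over the K components
    }
  }

  log_lik = to_vector(log_lik1);
  for (j in 1:J) log_lik_item[j] = sum(col(log_lik1, j));
  for (i in 1:N) log_lik_person[i] = sum(row(log_lik1, i));
}

Pick whichever of the three quantities matches the prediction task you care about.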

Sorry, I’m not sure I understand your point. Could you please point out this error in the code?
I used the loop from k = 1 to K in order to consider all the values of k.

I’m also wondering if the number of iterations should be the same across the three models. That is, is it necessary for the (draws by observations) matrix to have the same number of draws across the fitted models?

Thank you

Do you mean the log_lik should be written as below?

vector[K] log_lik[N,J] ;

log_lik[i, j, k] = log( exp(log(pi[k]) + bernoulli_lpmf(y[i,j] | p[k])) ) ;

And how can I sum over k? Because I need to leave out one observation y[i,j].

In the model block you use all the terms in the loop, but in the generated quantities you keep reassigning and end up with only the last value. It would probably be easier to see this if you indented your code (it is easier for you and others to read).

In your for-loop over k in the generated quantities, you compute p[k] every round and use it to overwrite log_lik[i,j] every round as well.

Thus after the K’th iteration, log_lik[i,j] holds the value log( exp(log(pi[K]) + bernoulli_lpmf(y[i,j] | p[K])) ) (with capital K). Compare this to what you do in the model block and just copy what you have there for log_lik[i,j].
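In other words, a minimal sketch of the fix (after also declaring lps in generated quantities, as you already do in the model block):

for (k in 1:K) {
  p[k] = inv_logit( alpha[j,k] * (theta[i] - beta[j,k]) );
  lps[k] = log(pi[k]) + bernoulli_lpmf(y[i,j] | p[k]);  // collect each component's term
}
log_lik[i,j] = log_sum_exp(lps);  // assign once, mixing over all K components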

Now it works. The recovery of the true model is correct across replications.

I understand my mistake. As you said, only the K’th term was assigned to log_lik[i,j] in the generated quantities block.

Many thanks Dr. Vehtari
