Hi everyone,
I am a beginner with the rstan and loo packages, and I have a quick question about using loo. I want to compute loo and waic for model comparison. In my Stan code, I define a variable called log_lik in the generated quantities block for the loo computation (although it is not a parameter). I made it a vector because I use long-format data so that missing responses can be dropped.
generated quantities {
  vector[N] log_lik;
  real deviance;
  for (n in 1:N)
    log_lik[n] = pcm(y[n], theta[pp[n]], to_vector(delta[ii[n]]));
  deviance = sum(-2*log_lik);
}
In my current model (an IRT model with crossed random effects), I simulated data for 500 persons and 60 items (N = 30,000 data points in total) and fit the model with 1000 iterations and 4 chains. The resulting stanfit object is large (482.2 Mb). Because log_lik is defined in the Stan code, the object is much larger than I expected.
The log_lik array in the output has dimensions 500 (post-warmup draws per chain) x 4 (chains) x 30,000 (data points), which is huge. I tried to extract the pointwise log-likelihood values from the Stan output, but it failed with the error message below (with both merge_chains = FALSE and merge_chains = TRUE).
log_lik <- loo::extract_log_lik(stanfit, "log_lik")
Error: cannot allocate vector of size 457.8 Mb
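The size in the error message matches the dimensions described above. A quick check in base R, assuming each draw is stored as an 8-byte double and that R reports "Mb" in units of 1024^2 bytes:

```r
# Size of the log_lik draws when stored as doubles (8 bytes each).
draws_per_chain <- 500   # post-warmup draws per chain
n_chains <- 4
n_obs <- 30000           # 500 persons x 60 items

n_values <- draws_per_chain * n_chains * n_obs   # 60 million values
size_mib <- n_values * 8 / 1024^2                # bytes -> units of 1024^2 bytes

round(size_mib, 1)   # 457.8
```

So a single copy of the extracted log_lik draws needs about 457.8 Mb, which is close to the size of the whole stanfit object.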
- Is there any solution for dealing with a large stanfit object when computing loo and waic?
- Is it okay to compute loo and waic from the log-likelihood values of just 1 or 2 chains rather than all 4 chains?
- If so, how can I extract the values for 1 or 2 chains from the Stan output and compute loo?
- After calculating loo and waic, I want to remove the log-likelihood draws from the output object to reduce the file size. Is there a simple way to remove them?
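To make the chain question concrete, here is a minimal sketch of what restricting to a subset of chains could look like. It assumes the draws are available as a draws x chains x observations array, which is the shape loo::extract_log_lik() returns with merge_chains = FALSE. The array below is simulated and much smaller than the real one, and the sketch says nothing about whether dropping chains is statistically advisable.

```r
# Simulated stand-in for the array returned by
# loo::extract_log_lik(stanfit, merge_chains = FALSE).
# Sizes are shrunk so the sketch runs quickly; the values are made up.
set.seed(1)
n_draws <- 50
n_chains <- 4
n_obs <- 20
log_lik_arr <- array(-abs(rnorm(n_draws * n_chains * n_obs)),
                     dim = c(n_draws, n_chains, n_obs))

# Keep chains 1 and 2 only; drop = FALSE preserves the 3-D shape.
keep <- 1:2
log_lik_sub <- log_lik_arr[, keep, , drop = FALSE]
dim(log_lik_sub)

# If the loo package is installed, the 3-D array can be passed to
# loo() and waic() directly.
if (requireNamespace("loo", quietly = TRUE)) {
  r_eff <- loo::relative_eff(exp(log_lik_sub))
  loo_sub <- loo::loo(log_lik_sub, r_eff = r_eff)
  waic_sub <- loo::waic(log_lik_sub)
}
```

With the real stanfit object, the extraction step itself is where the allocation error occurs, so this only helps if the array can be obtained in the first place.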
Thanks for any help in advance! Have a great weekend.
Best,
JinHo