Model comparison - large model

annptr · March 6, 2018, 10:08pm

Hi,
I am trying to compare two models using the loo package.
Number of data points = 600,000, post-warmup iterations = 2000, # chains = 10
To compute the log likelihood for all draws, I need a matrix of size 600K x 20K, which would take a very long time and a lot of memory.
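For scale, here is a rough back-of-the-envelope calculation of that matrix's size. It assumes the matrix is stored densely as 8-byte doubles, which the thread does not state; the variable names are only illustrative.

```python
# Rough memory estimate for a dense log-likelihood matrix.
# Assumption: every entry is stored as an 8-byte double.
n_obs = 600_000          # data points
n_chains = 10
n_iter = 2_000           # post-warmup iterations per chain
n_draws = n_chains * n_iter

bytes_per_double = 8
total_bytes = n_obs * n_draws * bytes_per_double
total_gb = total_bytes / 1e9

print(f"{n_draws} draws x {n_obs} observations = {total_gb:.0f} GB")
```

Under that assumption the matrix alone is about 96 GB, before any working copies.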
Any recommendations to make this more efficient?
Can I use only a small number of iterations instead of all 2000? Any other suggestions?

Thanks!

bgoodri · March 6, 2018, 11:00pm

The “pass a function that evaluates the log-likelihood of the i-th observation” method described at

For models fit to very large datasets we recommend the loo.function method, which is much more memory efficient than the loo.matrix method.
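The memory saving comes from evaluating one observation at a time, so that only a vector with one entry per draw is ever held in memory. The sketch below illustrates that access pattern only. It is not the loo package's API and it does not perform PSIS-LOO: the normal model, the simulated draws, and the names `log_lik_i` and `log_mean_exp` are all hypothetical, and the per-observation summary is a plain log mean of the likelihood.

```python
import numpy as np

def log_lik_i(y_i, mu, sigma):
    """Log-likelihood of one observation under every draw (hypothetical normal model)."""
    return -0.5 * np.log(2 * np.pi) - np.log(sigma) - 0.5 * ((y_i - mu) / sigma) ** 2

def log_mean_exp(x):
    """Numerically stable log(mean(exp(x)))."""
    m = np.max(x)
    return m + np.log(np.mean(np.exp(x - m)))

rng = np.random.default_rng(0)
y = rng.normal(size=1_000)                        # stand-in data
mu = rng.normal(0.0, 0.05, size=2_000)            # stand-in posterior draws
sigma = np.abs(rng.normal(1.0, 0.05, size=2_000))

per_obs = np.empty(y.size)
for i, y_i in enumerate(y):
    ll = log_lik_i(y_i, mu, sigma)   # one value per draw, shape (2000,)
    per_obs[i] = log_mean_exp(ll)    # reduce immediately, then discard ll

print(per_obs.shape)
```

Peak memory here scales with the number of draws plus the number of observations, rather than with their product.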

avehtari · March 7, 2018, 6:24pm

Ben’s suggestion is good, too. Here are a couple of other suggestions.

Take a smaller random sample of data points (whatever size is fast enough for you), compute the log likelihood for those, compute elpd_loo for this smaller sample, and use the usual statistical inference to estimate what elpd_loo would be for the whole n=600K. We use this kind of approach successfully in projpred to speed up computation for large n.
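One way to read "the usual statistical inference" is the standard estimate of a population total from a simple random sample drawn without replacement. The sketch below assumes that reading. The function name `estimate_total_elpd` is hypothetical, and the pointwise values are simulated placeholders standing in for elpd_loo values that would really come from running PSIS-LOO on the subsample.

```python
import numpy as np

def estimate_total_elpd(elpd_sample, n_total):
    """Scale pointwise elpd_loo from a simple random sample up to n_total points."""
    elpd_sample = np.asarray(elpd_sample, dtype=float)
    m = elpd_sample.size
    mean = elpd_sample.mean()
    sd = elpd_sample.std(ddof=1)
    fpc = np.sqrt(1.0 - m / n_total)       # finite population correction
    estimate = n_total * mean              # estimated sum over all n_total points
    std_error = n_total * sd / np.sqrt(m) * fpc
    return estimate, std_error

rng = np.random.default_rng(1)
n_total = 600_000
m = 5_000
# Placeholder pointwise elpd_loo values for the m sampled observations.
elpd_sample = rng.normal(loc=-1.2, scale=0.5, size=m)

est, se = estimate_total_elpd(elpd_sample, n_total)
print(f"elpd_loo estimate: {est:.0f} +/- {se:.0f}")
```

The standard error here reflects only the subsampling of data points. To compare two models, the same calculation could be applied to the pointwise differences computed on the same sampled observations.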

You can also use fewer iterations (for example by thinning), but check N_eff; I recommend having N_eff > 1000 for PSIS-LOO.
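Thinning can be as simple as keeping every k-th post-warmup iteration from each chain. A minimal sketch, assuming the draws are held in an array shaped (iterations, chains, parameters); the array contents, the three parameters, and `thin_draws` are hypothetical.

```python
import numpy as np

def thin_draws(draws, keep_every):
    """Keep every keep_every-th iteration; draws is (iterations, chains, ...)."""
    return draws[::keep_every]

# Hypothetical posterior array: 2000 iterations, 10 chains, 3 parameters.
draws = np.zeros((2_000, 10, 3))
thinned = thin_draws(draws, keep_every=10)

n_draws_after = thinned.shape[0] * thinned.shape[1]
print(thinned.shape, n_draws_after)
```

With `keep_every=10` this leaves 2,000 draws in total, so N_eff should be recomputed on the thinned draws before relying on the result.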

annptr · March 8, 2018, 12:48pm

Thanks @bgoodri and @avehtari.
Without rerunning the model with fewer iterations, can I take a smaller random sample of posterior draws, estimate the log likelihood on the full data set, and compare the two models?

bgoodri · March 8, 2018, 2:35pm

You would have to manually throw away draws, which just makes the estimates less precise. Use the loo.function method.

annptr · March 8, 2018, 3:05pm

Okay, thanks! I will try that way.
