Memory issues when computing LOO/WAIC

Hi!

Like several other people on this forum I am running into memory issues when trying to compute LOO/WAIC for a large dataset. My data has roughly 160,000 observations. The model has three grouping levels, one with 15 categories, one with 16 categories, and one with 240 (15 * 16) categories, along with 5-30 population-level parameters. I am running the model with the usual four chains of 2,000 iterations each and discard 1,000 as warmup. I am trying to compute LOO/WAIC for model comparison on a machine with 50 GB of memory. Given that I consistently run out of memory before the calculation finishes, my question is this: is there a rule of thumb for how much memory I would need to compute these quantities? I could try to find a machine with more memory. Alternatively, is there a way to calculate the quantities on a subset of the samples? This option seems a little more hackish, but I’m running out of ideas. Any help would be greatly appreciated.

Here’s a vignette showing how to calculate loo for large data: Using Leave-one-out cross-validation for large data • loo
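As a partial answer to the rule-of-thumb part of the question: the dominant object is the dense S × N matrix of pointwise log-likelihoods stored as doubles, and loo works with several copies of it. A back-of-envelope sketch using the numbers in the question (a rough lower bound, not an exact figure):

```r
S <- 4 * 1000       # post-warmup draws: 4 chains x (2000 - 1000) iterations
N <- 160000         # observations
bytes_per_copy <- S * N * 8    # doubles are 8 bytes each
bytes_per_copy / 1024^3        # ~4.8 GiB for one copy of the log_lik matrix
# PSIS-LOO makes several working copies, so plan for a multiple of this.
```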


@ssp3nc3r already linked to the vignette (which also lists the papers showing that properly made subset LOO is justified and not hacky).

How many observations do you have per group (i.e., per unique category combination)? If that number is large, and the posterior does not change much when you leave out just one observation out of 160,000, it is possible you don’t even need LOO (or WAIC). On average you have about 531 observations per parameter, which is a lot, and about 666 observations for each of the 240 categories, which is also a lot, but it matters whether some categories have much fewer observations.
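For reference, the per-parameter counts quoted above work out as follows (taking the upper end of the 5-30 population-level parameters; these are just the thread’s figures, not computed from the actual data):

```r
n_obs <- 160000
n_group <- 15 + 16 + 240   # group-level categories at the three levels
n_pop <- 30                # upper end of the population-level parameter count
n_obs / (n_group + n_pop)  # ~531.6 observations per parameter
n_obs / 240                # ~666.7 observations per finest-level category
```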


Hi.

I’ve stumbled upon this thread due to an error previously discussed here.

I’ve looked at the vignette you’ve mentioned, but feel like my case would be difficult to adapt for loo_subsample.

I’m working with a drift-diffusion model which has coefficients varying by participant and another set of coefficients varying by item. The data supplied to the Stan model is not rectangular, but JSON-like (for example, the participant-level data have N entries, while item-level data have M entries).

Looking at the vignette, my current understanding is that a data.frame-like object is to be supplied to the log-likelihood function so that its rows can be subset. If so, in my case this would mean restructuring the data into a rectangular shape, because the simple row indexing used for subsetting would not work with my current data set. This is rather inconvenient.

I was wondering whether it is possible to do the subsetting within the generated quantities block of the Stan model instead? For example, by calculating the log-likelihoods only for every N-th entry, setting the others to NaN or something like that, and running the normal loo function afterwards?

I tried looking through the source code of the loo_subsample function, but there’s a lot going on and I couldn’t follow it confidently, so I can’t tell whether this approach is too naive.

Just for context, I’m working with a data set of some 120k observations, obtained from some 100 participants for approximately 5,000 items. I tried running the LOO calculation on a supercomputer I have access to, with 1 TB of RAM. This setup produced the error mentioned in the post linked at the beginning.

loo doesn’t support having NaNs.

If I understood correctly, loo_subsample() is complicated in your case. The loo package has a non-exported function srs_diff_est which implements the difference estimator (Eq 7 in the Leave-One-Out Cross-Validation for Bayesian Model Comparison in Large Data paper). It might be easier to use this function directly. I made an issue. Meanwhile, you can use that function, which I copy here:

#' Difference estimation using SRS-WOR sampling (Magnusson et al., 2020)
#' @noRd
#' @param y_approx Approximated values of all observations.
#' @param y The values observed.
#' @param y_idx The index of `y` in `y_approx`.
#' @return A list with estimates.
srs_diff_est <- function(y_approx, y, y_idx) {
  checkmate::assert_numeric(y_approx)
  checkmate::assert_numeric(y, max.len = length(y_approx))
  checkmate::assert_integerish(y_idx, len = length(y))

  N <- length(y_approx)
  m <- length(y)
  y_approx_m <- y_approx[y_idx]

  e_i <- y - y_approx_m
  t_pi_tilde <- sum(y_approx)
  t_pi2_tilde <- sum(y_approx^2)
  t_e <- N * mean(e_i)
  t_hat_epsilon <- N * mean(y^2 - y_approx_m^2)

  est_list <- list(m = length(y), N = N)
  # eq (7)
  est_list$y_hat <- t_pi_tilde + t_e
  # eq (8)
  est_list$v_y_hat <- N^2 * (1 - m / N) * var(e_i) / m
  # eq (9) first row second `+` should be `-`
  # Supplementary material eq (6) has this correct
  # Here the variance is for sum, while in the paper the variance is for mean
  # which explains the proportional difference of 1/N
  est_list$hat_v_y <- (t_pi2_tilde + t_hat_epsilon) - # a (has been checked)
    (1/N) * (t_e^2 - est_list$v_y_hat + 2 * t_pi_tilde * est_list$y_hat - t_pi_tilde^2) # b
  est_list
}

You would first compute the desired approximate value for all observations and put those in y_approx. Then, for a random subset, you would compute the more accurate estimates and put those in y. Finally, y_idx, which has the same length as y, tells which entries of y_approx the more accurate values correspond to.
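To make that concrete, here is a small sketch with simulated numbers (the checkmate input checks are dropped so it runs stand-alone; the data are made up, not from a real model):

```r
# Same computation as srs_diff_est above, minus the checkmate input checks
srs_diff_est_plain <- function(y_approx, y, y_idx) {
  N <- length(y_approx)
  m <- length(y)
  y_approx_m <- y_approx[y_idx]
  e_i <- y - y_approx_m
  t_pi_tilde <- sum(y_approx)
  t_e <- N * mean(e_i)
  t_hat_epsilon <- N * mean(y^2 - y_approx_m^2)
  est <- list(m = m, N = N)
  est$y_hat <- t_pi_tilde + t_e                    # eq (7): estimated total
  est$v_y_hat <- N^2 * (1 - m / N) * var(e_i) / m  # eq (8): its variance
  est$hat_v_y <- (sum(y_approx^2) + t_hat_epsilon) -
    (1 / N) * (t_e^2 - est$v_y_hat + 2 * t_pi_tilde * est$y_hat - t_pi_tilde^2)
  est
}

set.seed(1)
N <- 1000                                     # all observations
y_approx <- rnorm(N, mean = -1, sd = 0.1)     # cheap approximation for everyone
y_idx <- sort(sample.int(N, 100))             # random subset of 100 observations
y <- y_approx[y_idx] + rnorm(100, sd = 0.01)  # accurate values for the subset
est <- srs_diff_est_plain(y_approx, y, y_idx)
est$y_hat   # estimate of the sum over all N observations
```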

Hi!

Thank you for responding in such detail. I have refactored my code in the meantime so that I can work with the loo_subsample function as-is. Still, a way around the rectangular-data requirement would be convenient, since non-rectangular data may be common for hierarchical models (?).

If I may ask: you’ve said that “loo doesn’t support having NaNs.” Does this mean that the approach could work in principle, for example if the NaNs were filtered out? Or is it completely off?

Yes, you can filter the NaNs out and get a loo object of reduced size (that is, one having pointwise values only for those observations that had no NaNs).
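A minimal sketch of that filtering, assuming log_lik is the S × N draws-by-observations matrix extracted from the fit (simulated here), with all-NaN columns for the observations skipped in generated quantities:

```r
set.seed(1)
S <- 400; N <- 10
log_lik <- matrix(rnorm(S * N, mean = -1, sd = 0.2), nrow = S, ncol = N)
log_lik[, seq(2, N, by = 2)] <- NaN   # pretend every 2nd obs was skipped in Stan

keep <- colSums(is.nan(log_lik)) == 0          # columns with no NaNs
log_lik_sub <- log_lik[, keep, drop = FALSE]   # reduced S x (N/2) matrix

# Run loo on the reduced matrix (only if the loo package is available):
if (requireNamespace("loo", quietly = TRUE)) {
  r_eff <- loo::relative_eff(exp(log_lik_sub), chain_id = rep(1:4, each = 100))
  print(loo::loo(log_lik_sub, r_eff = r_eff))
}
```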
