'problem too large' when doing model comparison using loo on fitted brms models

Hi, I fitted two different ZINB models on count data representing gene expression. The only difference between the two models is the group-level effect. For one model, the grouping factor is gene. For the other model, the grouping factor is gene:cell type. I want to know which model fits the data better. I tried loo(), loo_compare(), bayes_R2(), etc. All of them returned this error:

Error in h(simpleError(msg, call)) :
error in evaluating the argument ‘x’ in selecting a method for function ‘as.matrix’: Cholmod error ‘problem too large’ at file …/Core/cholmod_dense.c, line 105
Error: Something went wrong (see the error message above). Perhaps you transformed numeric variables to factors or vice versa within the model formula? If yes, please convert your variables beforehand. Or did you use a grouping factor also for a different purpose? If yes, please make sure that its factor levels are correct also in the new data you may have provided.

Does anyone know what the problem is? Is it because the dataset is too big? The file sizes of the two saved models are around 50 MB and 250 MB, respectively. There are 237541 data points in both cases. The data points are from 157 genes and 1513 cells from 8 cell types. For the model with gene as the grouping factor, there are 157 levels. For the other model, there are 157*8=1256 levels. There is only one predictor and one offset in the model. I set the overdispersion and zero-inflation parameters of the ZINB distribution to be the same across all genes and all cell types. I checked the forum and it seems that this is not especially big compared to other people’s models. Any ideas on what caused this problem?

If it is indeed due to the file/data size, is there any suggestion on how to reduce the file size of the fitted model and then do model comparison?

Thank you!

I reduced the size of the dataset and ran everything again. This time, all the model comparison functions worked. So I guess the problem is data size. Is there any way to use model comparison functions on models fitted on a large dataset?

Also, when I save the model object in R using saveRDS(), is there a way to reduce the file size but keep all the key information? By key information, I mean that if I load the object into a new R session, posterior_predict() and other standard functions will still work.

Are you able to share a reproducible example of the problem?

How much memory (RAM) is available on the system?

You can find the rds file of the fitted model at this link:

Simply load the file and the brms package. Then if you run loo() on the model, you should see the error.

I am running this on a new server that my lab just purchased, which should have a fairly large amount of RAM. I don’t know how to check how much RAM a server has…
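
For checking RAM from within R, here is a minimal sketch. It assumes a Linux server, where the kernel exposes memory figures in /proc/meminfo. The helper name read_meminfo_gib is made up for illustration and is not part of brms or base R.

```r
# Hypothetical helper: report total and available RAM in GiB.
# Assumption: a Linux system where /proc/meminfo exists.
read_meminfo_gib <- function(path = "/proc/meminfo") {
  if (!file.exists(path)) return(NULL)  # not Linux, nothing to read
  lines <- readLines(path)
  # Lines look like "MemTotal:       131927480 kB"
  parse_kb <- function(key) {
    hit <- grep(paste0("^", key, ":"), lines, value = TRUE)
    if (length(hit) == 0) return(NA_real_)
    as.numeric(gsub("[^0-9]", "", hit[1]))
  }
  # Values are reported in kB; divide by 1024^2 to get GiB
  c(total_gib = parse_kb("MemTotal"),
    available_gib = parse_kb("MemAvailable")) / 1024^2
}

print(read_meminfo_gib())
```

On a shared server, the available figure is usually the more relevant one, since other users' jobs count against the total.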

One more piece of information: if I use pointwise=TRUE when running loo(), it doesn’t give me the error, but it has been running for several hours and still hasn’t finished…

It quickly consumes all of my available RAM

Did you see the same error message? It is interesting that if I fit the same model using variational inference, then I can actually run loo() without a problem. It seems that it is not just dependent on the data size but also information generated during optimization.

How many chains and draws are you sampling?

I didn’t change any of these parameters so I believe they are at their default value.

How much RAM do you have available?

loo needs the pointwise log-likelihood values. With 237541 observations, 4 chains, 1000 draws per chain and 8 bytes per value (this is from numpy, not R, but I imagine things will be roughly the same), you’ll need at the very least 7.6 GB available just to store that data. On top of that you still have to perform the actual computation, PSIS… I don’t know if it’s possible to distribute that to avoid memory errors. We are working on it at ArviZ, but loo hasn’t been adapted yet to allow distributed computing with dask; that will hopefully happen this summer.
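
The arithmetic behind that estimate can be checked in a few lines of R. This is only a back-of-the-envelope sketch: it assumes 4 chains with 1000 post-warmup draws each (the brms defaults) and 8-byte doubles, and it counts only one copy of the log-likelihood matrix.

```r
# Rough lower bound on memory for the full pointwise log-likelihood matrix
n_obs       <- 237541      # observations in the model
n_chains    <- 4           # assumed: brms default
n_per_chain <- 1000        # assumed: post-warmup draws per chain
bytes_per_value <- 8       # one double-precision number

n_draws <- n_chains * n_per_chain
bytes   <- n_obs * n_draws * bytes_per_value

cat("Matrix size:", n_draws, "x", n_obs, "\n")
cat("Approx. GB: ", round(bytes / 1e9, 1), "\n")     # decimal gigabytes
cat("Approx. GiB:", round(bytes / 1024^3, 1), "\n")  # binary gibibytes
```

Intermediate copies made during the computation can push the real peak usage well above this figure.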

When running loo(), try setting pointwise = TRUE in brms. It will run much slower, but less memory is required.
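
As a sketch of that suggestion: the wrapper name loo_low_memory below is hypothetical, and fit stands for an already fitted brmsfit object. With pointwise = TRUE, brms computes the log-likelihood separately for each observation rather than building the full draws-by-observations matrix at once.

```r
# Hypothetical wrapper; `fit` is assumed to be a fitted brmsfit object.
loo_low_memory <- function(fit) {
  # pointwise = TRUE trades speed for a much smaller memory footprint
  brms::loo(fit, pointwise = TRUE)
}

# Usage (not run here, needs a fitted model):
# loo_gene <- loo_low_memory(fit_gene)
# print(loo_gene)
```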

I am running the code on a server which I assume has a fairly large amount of RAM. It is good to know that someone is working on this problem. Looking forward to your program!

Thanks for this suggestion! I set pointwise = TRUE and it finished in around one day. It is slow but better than nothing. Just curious, is there a way to run loo() on subsampled data?

Another related question: for a brms object fitted on a large dataset, when I save it using saveRDS() in R, the file size can sometimes be hundreds of MB. Is there a way to reduce the file size but keep all the key information? By key information, I mean that if I load the object into a new R session, posterior_predict() and other standard functions will still work.

Try out loo_subsample.
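
A rough sketch of how that might look for the two models in this thread. The function name compare_with_subsampling and the arguments fit_gene and fit_gene_cell are placeholders for illustration. The value 400 is the default subsample size in the loo package and may need to be increased for a dataset of this size.

```r
# Hypothetical helper; both arguments are assumed to be brmsfit objects
# fitted to the same data.
compare_with_subsampling <- function(fit_gene, fit_gene_cell, n_obs = 400) {
  # Approximate LOO using a subsample of n_obs observations
  loo1 <- brms::loo_subsample(fit_gene, observations = n_obs)
  # Reuse the same subsampled observations for the second model,
  # so that the two estimates are comparable
  loo2 <- brms::loo_subsample(fit_gene_cell, observations = loo1)
  loo::loo_compare(loo1, loo2)
}

# Usage (not run here, needs the fitted models):
# compare_with_subsampling(fit_gene, fit_gene_cell)
```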

With regard to the object size, most of it comes from the posterior draws, which you usually cannot drop while still expecting post-processing to work. Other than that, I am honestly not sure whether anything of relevant size can be dropped without causing problems.
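
One option that leaves the object itself untouched is to ask saveRDS() for stronger compression through its compress argument. The sketch below uses a random matrix as a stand-in for posterior draws, since no fitted model is available here. How much is saved depends on the contents, and for floating-point draws the gain may be modest.

```r
# Stand-in for posterior draws (assumption: a real brmsfit would be saved the same way)
set.seed(1)
draws <- matrix(rnorm(4000 * 50), nrow = 4000)

f_gz <- tempfile(fileext = ".rds")
f_xz <- tempfile(fileext = ".rds")

saveRDS(draws, f_gz)                   # default gzip compression
saveRDS(draws, f_xz, compress = "xz")  # slower to write, often smaller

# Compare file sizes in bytes
print(file.size(c(gzip = f_gz, xz = f_xz)))

# The object read back is identical, so nothing is lost
restored <- readRDS(f_xz)
print(identical(restored, draws))
```

readRDS() detects the compression type on its own, so loading the file in a new session works the same way as before.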

Thank you! I will try loo_subsample.