Model summary vector memory limit reached

I’ve been using brms to run a multivariate model. It’s a complex model with a large dataset (c. 80k respondents with repeated observations, nested and crossed factors, and four separate predicted variables). It takes a long time to run but it gets there eventually.

Last year (2020) I ran this model with an earlier version of cmdstan and brms; the model worked and I was able to view it using the summary command. I have re-run it this year with the latest versions of the packages installed in R (brms with the cmdstan backend). The modelling completes, but this time I run into a vector memory limit error when trying to get a summary of the model (see below). I’m not sure why. The model is the same, so I suspect there has been a change to the way the summary command operates.

Does anyone have any similar experience or advice? Is there any way to limit what summary is trying to do? I guess I could reduce the size of the model object through some sort of post-modelling thinning, but I’m not sure how to set this up. I would also appreciate any insights into how to do this before calling summary(mod).

> summary(mod)
Error in paste(calltext, collapse = " ") : 
  result would exceed 2^31-1 bytes
Error in paste(deparse(object, width.cutoff = 500L), collapse = " ") : 
  result would exceed 2^31-1 bytes
Error in paste(deparse(object, width.cutoff = 500L), collapse = " ") : 
  result would exceed 2^31-1 bytes
Error in paste(deparse(object, width.cutoff = 500L), collapse = " ") : 
  result would exceed 2^31-1 bytes
Error in paste(calltext, collapse = " ") : 
  result would exceed 2^31-1 bytes
Error in paste(calltext, collapse = " ") : 
  result would exceed 2^31-1 bytes
Error in paste(deparse(object, width.cutoff = 500L), collapse = " ") : 
  result would exceed 2^31-1 bytes

Operating System: Ubuntu
brms Version: 2020 model: 2.13/2.14; Latest model: 2.16

I am not aware of any specific changes that could cause this (@paul.buerkner would be the person to ask). However, you may want to try the posterior package: it should let you convert the brms fit to a draws_array object, where you can easily subset parameters and compute summaries. You’ll lose a tiny bit of brms functionality and convenience, but hopefully you’ll be able to bypass the issue.
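
A rough sketch of what that could look like, including the post-modelling thinning asked about above. To keep it runnable on its own it uses the example draws shipped with posterior as a stand-in; for the real fit, the commented as_draws_array() call is what I would expect to work with a recent brms (the "^b_" pattern for population-level effects is only an assumption about which parameters you care about).

```r
library(posterior)

# Stand-in draws so the sketch runs without a fitted model.
# For a brms fit called `mod`, something like this should replace it
# (assumes a brms version that provides as_draws methods):
#   draws <- as_draws_array(mod, variable = "^b_", regex = TRUE)
draws <- example_draws("eight_schools")

# Keep only the parameters of interest before summarising.
draws_small <- subset_draws(draws, variable = c("mu", "tau"))

# Optional thinning: keep every 5th iteration of each chain.
draws_thin <- thin_draws(draws_small, thin = 5)

# Summarise with an explicit, small set of summary functions.
res <- summarise_draws(draws_thin, "mean", "sd", "rhat", "ess_bulk")
print(res)
```

Subsetting first means the summary only ever touches the handful of parameters you actually want to look at, rather than every group-level term in the model.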

Alternatively, you may access the $fit object and use rstan::summary (where you can also subset parameters prior to summarising).
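
As a sketch of that second route: the helper below is hypothetical (the name summarise_fit_subset and the "^b_" default are mine, not part of brms or rstan) and assumes that mod$fit is a stanfit object, which I believe is the case even with the cmdstan backend.

```r
# Hypothetical helper: summarise only the parameters whose names match
# `pattern`, using rstan's summary method on the underlying stanfit object.
summarise_fit_subset <- function(brms_fit, pattern = "^b_",
                                 probs = c(0.025, 0.975)) {
  stan_fit <- brms_fit$fit
  # Parameter names as stored in the stanfit object.
  pars <- grep(pattern, names(stan_fit), value = TRUE)
  if (length(pars) == 0) {
    stop("No parameters match pattern: ", pattern)
  }
  # `$summary` is the matrix of summaries across all chains merged.
  rstan::summary(stan_fit, pars = pars, probs = probs)$summary
}

# Usage, assuming `mod` is the fitted brms model:
#   summarise_fit_subset(mod, pattern = "^b_")
```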

Hope that works out for you!

This may very well be because posterior::summarize_draws is not as memory efficient as we may want it to be. Would you mind opening an issue about this on the stan-dev/posterior GitHub repository?