Reducing brms model output size for predictive function

liamkendall · July 17, 2018, 11:25pm

Hi there,

I want to use brms model fits inside a function in an R package I am writing, so that users can get predicted values for their new data. Each model fit object is, however, ~14-15MB, and with a total of 6 models this makes the sysdata.rda file very large (~52MB when compressed).

Can anyone provide any advice about how I could reduce the file size of each model object whilst retaining its ability to predict with new data?

Thanks in advance!

Liam

Please also provide the following information in addition to your question:

Operating System: Mac OS X High Sierra/Windows 10
brms Version: 2.3.6

paul.buerkner · July 17, 2018, 11:27pm

For what purpose exactly do you want to store the brms models in an R package?

Asked differently, what is the scope of your R package?

liamkendall · July 17, 2018, 11:33pm

The purpose of the R package is to provide predictive models for estimating pollinating insects’ body size. The function we have built provides three model options, depending on the user’s data/hypotheses, for prediction in two pollinating taxa, bees or hoverflies (so a total of 6 brms fits). I want to use the brms fits within the function so that it returns estimates as well as the S.E. and 95% CIs.

paul.buerkner · July 18, 2018, 2:46pm

You can set save_dso = FALSE when fitting your model. Most of the remaining size comes from the posterior samples, so you can try storing fewer posterior samples, at the expense of slightly less accurate estimates.
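A minimal sketch of what this could look like, assuming a simple Gaussian model. The data frame `bees`, its columns `itd` and `log_mass`, and the formula are hypothetical stand-ins, not the models from this thread. `save_dso` is used as named above for brms 2.3.6; other brms versions may not accept it.

```r
library(brms)

# Hypothetical stand-in data; the real package would use the authors' measurements.
set.seed(1)
bees <- data.frame(itd = runif(100, 1, 6))
bees$log_mass <- 0.5 + 2.4 * log(bees$itd) + rnorm(100, sd = 0.3)

# 4 chains x (2000 - 1000) post-warmup iterations, thinned by 4 -> 1000 stored draws.
fit <- brm(
  log_mass ~ log(itd),
  data = bees,
  chains = 4, iter = 2000, warmup = 1000, thin = 4,
  save_dso = FALSE  # as named in this thread (brms 2.3.6); may differ in other versions
)

# Check how large the stored object is.
format(object.size(fit), units = "Mb")

# Prediction with new data still works on the smaller object.
new_data <- data.frame(itd = c(2.5, 4.0))
pred <- predict(fit, newdata = new_data)
pred
```

Comparing `object.size()` across a few `thin` or `iter` settings shows how much each stored draw costs, which helps when choosing a size budget for the package.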

liamkendall · July 18, 2018, 11:26pm

Thanks for the tips @paul.buerkner. I did think I might be asking too much, but I will give save_dso = FALSE a try to begin with and see how it goes.

Bob_Carpenter · July 19, 2018, 12:39am

If your models are fixed in terms of the Stan code they generate, you can use brms to generate the Stan model, then deploy them with the R skeleton. See the vignette in rstantools.

The advantage is that users will get a pure binary install and won’t need to find a C++ toolchain and install all of RStan.
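As a rough sketch of the first step described here, brms can emit the Stan program for a model without running the sampler. The data, formula, and file name below are hypothetical; packaging the resulting file is covered by the rstantools vignette and is not shown.

```r
library(brms)

# Hypothetical stand-in data with the columns the model formula refers to.
bees <- data.frame(
  itd      = c(1.8, 2.5, 3.1, 4.0),
  log_mass = c(1.9, 2.7, 3.2, 3.8)
)

# make_stancode() returns the Stan program brms would fit; no sampling happens here.
stan_code <- make_stancode(log_mass ~ log(itd), data = bees, family = gaussian())

# Write it to a .stan file that a package skeleton could ship (illustrative file name).
stan_file <- file.path(tempdir(), "bee_mass.stan")
writeLines(as.character(stan_code), stan_file)
```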

liamkendall · July 20, 2018, 2:18am

Hi @Bob_Carpenter, thank you for the suggestion and the helpful link. Will this require the function to run the model internally, as opposed to loading the model object?
Perhaps a better question is: will a Stan model object be significantly smaller than a brms fit object? Judging by how long the models take to fit, running them inside the function isn’t feasible.

Bob_Carpenter · July 20, 2018, 11:30am

I’m not sure how it works at the technical level. @bgoodri and @jonah will.

From a user’s perspective, all the models are compiled and downloaded in binary form with the package, so the user doesn’t need a C++ toolchain.

I don’t know what’s in a brms fit object, either. The RStan fit object contains a bunch of stuff you probably won’t need for inference; you can probably get away with just pulling out the draws for the parameters you care about and saving those. RStan also lets you specify which parameters to save.
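A base-R sketch of that idea, assuming a Gaussian model with one log-transformed predictor. The draws are simulated here so the example runs on its own; in practice they would be extracted from the fit (for example with `as.data.frame(fit)`). The column names follow the brms convention for population-level effects, and `predict_from_draws()` is a hypothetical helper, not part of brms or RStan.

```r
# Stand-in for draws extracted from a fitted model. Values are simulated,
# not real estimates; names mimic brms (b_Intercept, b_<term>, sigma).
set.seed(42)
draws <- data.frame(
  b_Intercept = rnorm(1000, mean = 0.5, sd = 0.05),
  b_logitd    = rnorm(1000, mean = 2.4, sd = 0.04),
  sigma       = abs(rnorm(1000, mean = 0.3, sd = 0.02))
)

# Hypothetical helper: posterior summary of the expected response for new data.
predict_from_draws <- function(draws, itd) {
  # Rows are posterior draws, columns are new observations.
  mu <- outer(draws$b_Intercept, rep(1, length(itd))) +
        outer(draws$b_logitd, log(itd))
  data.frame(
    itd      = itd,
    estimate = colMeans(mu),
    se       = apply(mu, 2, sd),
    lower    = apply(mu, 2, quantile, probs = 0.025),
    upper    = apply(mu, 2, quantile, probs = 0.975)
  )
}

pred <- predict_from_draws(draws, itd = c(2.5, 4.0))
pred

# A plain data frame of draws is small compared with a full fit object.
format(object.size(draws), units = "Kb")
```

This summarises the expected response only. Intervals for new observations would also need residual noise drawn using `sigma`, and anything beyond a simple linear predictor (group-level effects, other families) would need its own prediction code, which is what the fit object otherwise provides.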

If the fit objects are too big, how many draws do they contain? Usually you don’t need a super large n_eff for inference, yet we often see people running 10s or 100s of thousands of iterations. So reducing the number of draws is another way to economize on fit size.

paul.buerkner · July 20, 2018, 11:34am

I don’t think precompiling the Stan model will solve the problem here, since the aim seems to be to ship pre-fit models and only run post-processing on them. Precompiling would help with fitting the models but won’t reduce the size of the fitted model objects.
