How to return a brms model object without the data frame used in training?

Hi all,

I want to know if there is an easy way to return a brms model object without the data. We tried removing the data from the fitted model object before saving it. However, when we tried to make predictions for a new dataset, we got this error message:
Error in eval(predvars, data, env) :
invalid ‘envir’ argument of type ‘character’
Calls: run … model.matrix.default -> model.frame -> model.frame.default -> eval
Execution halted

Our training datasets are several GB in size, so storing all the fitted brms model objects is very challenging. We thought removing the data component from the object returned by brms could help.

Thanks for your insights!

  • Operating System: Ubuntu 18.04.2 LTS
  • brms Version: 2.8.9

For validation of the new data, the old data is currently required. This is because the R formula syntax is ambiguous without the data. For instance, y ~ x will mean something different depending on whether x is numeric or a factor. I am not sure what would need to change to make the predictions work with just the new data, but it will definitely be non-trivial.
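
To illustrate the ambiguity, here is a minimal base R sketch. The toy data frames `d_num` and `d_fac` are made up for illustration; the point is only that the same formula yields different design matrices depending on the type of `x`.

```r
# Toy data (illustrative only): the same x values, once numeric, once as a factor.
d_num <- data.frame(y = c(1.2, 2.3, 3.1, 4.8), x = c(1, 2, 3, 4))
d_fac <- data.frame(y = d_num$y, x = factor(d_num$x))

# The same formula produces different design matrices.
X_num <- model.matrix(y ~ x, data = d_num)
X_fac <- model.matrix(y ~ x, data = d_fac)

colnames(X_num)  # intercept plus a single slope column for x
colnames(X_fac)  # intercept plus one dummy column per non-reference level

ncol(X_num)
ncol(X_fac)
```

With a factor, the set of levels (and hence the columns) comes from the training data, which is presumably why the new data alone is not enough to rebuild the design matrix.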

Maybe using brms::make_stancode() and then fitting your model using RStan or CmdStan would work? That would avoid the data being saved and result in much smaller model objects, but the downside is that you’d have to code up your own predictions (but that’s usually much easier than coding up your own Stan program).
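
A rough sketch of that workflow, assuming a simple Gaussian model `y ~ x`. The toy data, the sampler settings, and the file name `draws.rds` are placeholders. The parameter names passed to `rstan::extract()` (`b_Intercept`, `b`, `sigma`) are an assumption about what the generated Stan code contains; they can differ between brms versions, so check `stan_code` before relying on them.

```r
library(brms)
library(rstan)

# Toy training data (illustrative only).
set.seed(1)
train <- data.frame(x = rnorm(100))
train$y <- 1 + 2 * train$x + rnorm(100, sd = 0.5)

# Let brms write the Stan program and assemble the data list.
stan_code <- make_stancode(y ~ x, data = train, family = gaussian())
stan_data <- make_standata(y ~ x, data = train, family = gaussian())

# Fit with RStan directly instead of brm().
fit <- stan(model_code = stan_code, data = stan_data, chains = 2, iter = 1000)

# Keep only the posterior draws needed for prediction.
# ASSUMPTION: these parameter names exist in the generated code; inspect stan_code.
draws <- rstan::extract(fit, pars = c("b_Intercept", "b", "sigma"))

# Save the draws rather than the full fitted object or the training data.
saveRDS(draws, "draws.rds")
```

The saved file then scales with the number of draws and parameters rather than with the size of the training data.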

That would be a good solution indeed. It's mostly that you can't use the automatic prediction functions of brms without the old data. Everything else should work.
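
For a simple Gaussian model with a single numeric predictor, predictions by hand could look roughly like the sketch below. `linpred_draws` is a hypothetical helper, not a brms function, and the draws are simulated stand-ins so that the example runs on its own; in practice they would come from the fitted Stan model. Factors, splines, or group-level effects would need the design matrix rebuilt to match the training encoding, which this sketch does not cover.

```r
# Hypothetical helper: posterior draws of the linear predictor for new data.
# draws: list with b_Intercept (length S) and b (S x K matrix).
# X_new: N x K matrix whose columns are in the same order as the columns of b.
linpred_draws <- function(draws, X_new) {
  # as.vector() drops any 1-d array dim so the intercept recycles down the rows.
  as.vector(draws$b_Intercept) + draws$b %*% t(X_new)
}

# Stand-in posterior draws (simulated here only so the sketch is self-contained).
set.seed(1)
S <- 1000
draws <- list(
  b_Intercept = rnorm(S, mean = 1, sd = 0.1),
  b           = matrix(rnorm(S, mean = 2, sd = 0.1), ncol = 1),
  sigma       = abs(rnorm(S, mean = 0.5, sd = 0.05))
)

# New data: only the predictor columns are needed.
newdata <- data.frame(x = c(0, 1, 2))
X_new <- as.matrix(newdata["x"])

# S x N matrix: one row per posterior draw, one column per new observation.
mu <- linpred_draws(draws, X_new)

# Posterior predictive draws: add observation noise using each draw's sigma.
y_rep <- mu + rnorm(length(mu), mean = 0, sd = draws$sigma)

# Point predictions and 95% intervals per new observation.
colMeans(mu)
apply(y_rep, 2, quantile, probs = c(0.025, 0.975))
```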

Hi Paul,

Thanks a lot for shedding more light on this.

Hi Jonah,

Thanks a lot for your insight! Will definitely experiment with that.