I fit multi-level regression models in R using brms, with id-level slopes and intercepts. I have data for many individuals (thousands) and the models take a while to fit (5+ hours), so recently I’ve thought it might be useful to fit them in ‘batches’ of individuals, so that I can either run the batches in parallel on many machines/nodes or run them sequentially and save progress.
My initial thought was to fit the models on subsets of the data, and then use brms’ combine_models() to combine them, but of course with the id-level parameters differing from batch to batch (because there are different IDs in there) I get "Error: Models 1 and 2 have different parameters".
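To make the setup concrete, here is a minimal sketch of how such batching might look. The data frame `dat`, the columns `x` and `y`, the model formula, and `n_batches` are all made up for illustration and are not the real data or model. The brms calls are left as comments because they need brms and a long fit to run.

```r
# Hypothetical example data: 10 individuals with 5 observations each.
set.seed(1)
dat <- data.frame(
  id = rep(sprintf("id%02d", 1:10), each = 5),
  x  = rnorm(50)
)
dat$y <- 1 + 0.5 * dat$x + rnorm(50)

# Assign whole individuals (not single rows) to batches,
# so that all observations for an id end up in the same batch.
n_batches <- 3  # assumed number of batches
ids <- unique(dat$id)
batch_of_id <- setNames(rep_len(seq_len(n_batches), length(ids)), ids)
batches <- split(dat, batch_of_id[dat$id])

# Each batch would then be fitted on its own (not run here; formula is assumed):
# fits <- lapply(batches, function(d) {
#   brms::brm(y ~ x + (1 + x | id), data = d)
# })
# brms::combine_models(mlist = fits)
# The last call is where the "different parameters" error appears,
# because each fit has id-level parameters for a different set of ids.
```

Splitting by id rather than by row keeps each individual's observations together, which is what makes the id-level parameters differ between batches.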
Does anyone know of a good way to tackle something like this? Would it be valid to concatenate the posterior samples from each batch to brute force it, or is there a way to chain these models so the population-level parameters of the first batch become the priors for the second? Is there potential to bias results here depending on the order of fitting these batches?
As always, thanks in advance for any help or advice you can offer on this.