- Operating System: Windows 10
- brms Version: 2.10.0
I am trying to run a large three-level model in brms on data of the following size:
- Level 1: 276,811 observations
- Level 2: 44,602 persons
- Level 3: 133 countries
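By my rough math, just storing the posterior draws for the group-level effects alone may exceed my RAM. Here is the back-of-the-envelope estimate I did (all figures are assumptions based on my model spec, nothing measured):

```r
# Rough sketch: count the group-level parameters implied by
# (1 + x | person/country) in both the mean and zi parts, then estimate
# the RAM needed just to store the posterior draws.
persons         <- 44602
coefs_per_group <- 2   # intercept + slope for x
dpars           <- 2   # mu and zi both get the same RE structure
# person/country expands to person + person:country; since each person
# belongs to one country here, both factors have ~`persons` levels
n_re  <- 2 * persons * coefs_per_group * dpars   # 356,816 parameters
draws <- 8 * (5000 - 2500)   # 8 chains x post-warmup draws (iter = 5000)
bytes <- n_re * draws * 8    # double precision
round(bytes / 1024^3)        # ~53 GiB just for the stored draws
```

If that arithmetic is even roughly right, the full posterior cannot fit in 32 GB regardless of how the chains are parallelized.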
In case it matters, here is the full command:
brm(
  formula = bf(
    y ~ 1 + x * w + (1 + x | person/country),
    zi ~ 1 + x * w + (1 + x | person/country)
  ),
  family = zero_inflated_beta(link = "logit", link_phi = "log", link_zi = "logit"),
  prior = c(
    set_prior("normal(0, 1)", class = "b"),
    set_prior("student_t(3, 0, 5)", class = "b", dpar = "zi"),
    set_prior("logistic(0, 1)", class = "Intercept", dpar = "zi"),
    set_prior("gamma(0.01, 0.01)", class = "phi"),
    set_prior("student_t(3, 0, 1)", class = "sd"),
    set_prior("lkj(1)", class = "L")
  ),
  data = dat,
  chains = 8,
  cores = 8,
  iter = 5000
)
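For reference, one variant I considered but have not yet tried, on the assumption (which I cannot verify) that the main problem is eight worker processes each holding a copy of the data and its own chain's draws at once:

```r
# Untested sketch: same model, settings changed only to reduce how much
# sits in memory simultaneously. `formula`, `family`, and `prior` as above.
fit <- brm(
  formula = bf(
    y ~ 1 + x * w + (1 + x | person/country),
    zi ~ 1 + x * w + (1 + x | person/country)
  ),
  family = zero_inflated_beta(),
  data = dat,
  chains = 4,   # fewer chains -> fewer stored draws overall
  cores = 2,    # at most two worker processes resident at a time
  iter = 2000,  # fewer iterations; increase later if ESS is too low
  thin = 2      # keep every 2nd draw to halve stored posterior size
)
```

I do not know whether this trades away too much effective sample size for a model of this scale, which is part of what I am asking.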
When I try to estimate this model on my PC, which has 32 GB of RAM, I get the following error, which seems to indicate that I have run out of memory:
Error: cannot allocate vector of size 2.1 Mb
The Viewer window displays the following:
SAMPLING FOR MODEL 'f140f29e2d2e79466a5386d0ca68843e' NOW (CHAIN 1).
SAMPLING FOR MODEL 'f140f29e2d2e79466a5386d0ca68843e' NOW (CHAIN 2).
SAMPLING FOR MODEL 'f140f29e2d2e79466a5386d0ca68843e' NOW (CHAIN 3).
SAMPLING FOR MODEL 'f140f29e2d2e79466a5386d0ca68843e' NOW (CHAIN 4).
SAMPLING FOR MODEL 'f140f29e2d2e79466a5386d0ca68843e' NOW (CHAIN 5).
SAMPLING FOR MODEL 'f140f29e2d2e79466a5386d0ca68843e' NOW (CHAIN 6).
SAMPLING FOR MODEL 'f140f29e2d2e79466a5386d0ca68843e' NOW (CHAIN 7).
SAMPLING FOR MODEL 'f140f29e2d2e79466a5386d0ca68843e' NOW (CHAIN 8).
[1] "Error : cannot allocate vector of size 39 Kb"
[1] "error occurred during calling the sampler; sampling not done"
[1] "Error : cannot allocate vector of size 39 Kb"
[1] "error occurred during calling the sampler; sampling not done"
[1] "Error : cannot allocate vector of size 39 Kb"
[1] "error occurred during calling the sampler; sampling not done"
[1] "Error : cannot allocate vector of size 39 Kb"
[1] "error occurred during calling the sampler; sampling not done"
[1] "Error : cannot allocate vector of size 39 Kb"
[1] "error occurred during calling the sampler; sampling not done"
[1] "Error in sampler$call_sampler(args_list[[i]]) : std::bad_alloc"
[1] "error occurred during calling the sampler; sampling not done"
[1] "Error : cannot allocate vector of size 39 Kb"
[1] "error occurred during calling the sampler; sampling not done"
Chain 4:
Chain 4: Gradient evaluation took 0.401 seconds
Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 4010 seconds.
Chain 4: Adjust your expectations accordingly!
Chain 4:
Chain 4:
Chain 4: Iteration: 1 / 5000 [ 0%] (Warmup)
Does anyone have tips for running this kind of "big data" analysis with many observations and random effects? For example, would it help to use fewer cores? Are there any tricks for using brm_multiple() to combine fits from subsets of the data?
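To make that last question concrete, here is the kind of thing I had in mind (untested, and I realize brm_multiple() was designed for multiply-imputed datasets, so I am unsure whether pooling fits from disjoint subsets is statistically valid at all):

```r
library(brms)

# Untested sketch: split the data into 4 chunks by country, so each
# country's observations stay together, then fit the same model to each
# chunk and pool the draws. Whether pooled subset fits approximate the
# full-data posterior is exactly what I am unsure about.
country_group <- cut(as.integer(factor(dat$country)), breaks = 4)
subsets <- split(dat, country_group)

fit <- brm_multiple(
  formula = bf(
    y ~ 1 + x * w + (1 + x | person/country),
    zi ~ 1 + x * w + (1 + x | person/country)
  ),
  family = zero_inflated_beta(),
  data = subsets,   # brm_multiple() accepts a list of data frames
  chains = 2,
  cores = 2
)
```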