- Operating System: Windows 10
- brms Version: 2.10.0
I am trying to run a large three-level model in brms on data of the following size:
- Level 1: 276,811 observations
- Level 2: 44,602 persons
- Level 3: 133 countries
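By my rough math, just storing the posterior draws for the group-level effects alone may exceed my RAM. Here is the back-of-the-envelope estimate I did (all figures are assumptions based on my model spec, nothing measured):

```r
# Rough sketch: count the group-level parameters implied by
# (1 + x | person/country) in both the mean and zi parts, then estimate
# the RAM needed just to store the posterior draws.
persons         <- 44602
coefs_per_group <- 2   # intercept + slope for x
dpars           <- 2   # mu and zi both get the same RE structure
# person/country expands to person + person:country; since each person
# belongs to one country here, both factors have ~`persons` levels
n_re  <- 2 * persons * coefs_per_group * dpars   # 356,816 parameters
draws <- 8 * (5000 - 2500)   # 8 chains x post-warmup draws (iter = 5000)
bytes <- n_re * draws * 8    # double precision
round(bytes / 1024^3)        # ~53 GiB just for the stored draws
```

If that arithmetic is even roughly right, the full posterior cannot fit in 32 GB regardless of how the chains are parallelized.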
In case it matters, here is the full command:
brm(
  formula = bf(
    y ~ 1 + x * w + (1 + x | person/country),
    zi ~ 1 + x * w + (1 + x | person/country)
  ),
  family = zero_inflated_beta(link = "logit", link_phi = "log", link_zi = "logit"),
  prior = c(
    set_prior("normal(0, 1)", class = "b"),
    set_prior("student_t(3, 0, 5)", class = "b", dpar = "zi"),
    set_prior("logistic(0, 1)", class = "Intercept", dpar = "zi"),
    set_prior("gamma(0.01, 0.01)", class = "phi"),
    set_prior("student_t(3, 0, 1)", class = "sd"),
    set_prior("lkj(1)", class = "L")
  ),
  data = dat,
  chains = 8,
  cores = 8,
  iter = 5000
)
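For reference, one variant I considered but have not yet tried, on the assumption (which I cannot verify) that the main problem is eight worker processes each holding a copy of the data and its own chain's draws at once:

```r
# Untested sketch: same model, settings changed only to reduce how much
# sits in memory simultaneously. `formula`, `family`, and `prior` as above.
fit <- brm(
  formula = bf(
    y ~ 1 + x * w + (1 + x | person/country),
    zi ~ 1 + x * w + (1 + x | person/country)
  ),
  family = zero_inflated_beta(),
  data = dat,
  chains = 4,   # fewer chains -> fewer stored draws overall
  cores = 2,    # at most two worker processes resident at a time
  iter = 2000,  # fewer iterations; increase later if ESS is too low
  thin = 2      # keep every 2nd draw to halve stored posterior size
)
```

I do not know whether this trades away too much effective sample size for a model of this scale, which is part of what I am asking.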
When I try to estimate this model on my PC, which has 32 GB of RAM, I get the following error, which seems to indicate that I have run out of memory:
Error: cannot allocate vector of size 2.1 Mb
The Viewer window displays the following:
SAMPLING FOR MODEL 'f140f29e2d2e79466a5386d0ca68843e' NOW (CHAIN 1).
SAMPLING FOR MODEL 'f140f29e2d2e79466a5386d0ca68843e' NOW (CHAIN 2).
SAMPLING FOR MODEL 'f140f29e2d2e79466a5386d0ca68843e' NOW (CHAIN 3).
SAMPLING FOR MODEL 'f140f29e2d2e79466a5386d0ca68843e' NOW (CHAIN 4).
SAMPLING FOR MODEL 'f140f29e2d2e79466a5386d0ca68843e' NOW (CHAIN 5).
SAMPLING FOR MODEL 'f140f29e2d2e79466a5386d0ca68843e' NOW (CHAIN 6).
SAMPLING FOR MODEL 'f140f29e2d2e79466a5386d0ca68843e' NOW (CHAIN 7).
SAMPLING FOR MODEL 'f140f29e2d2e79466a5386d0ca68843e' NOW (CHAIN 8).
[1] "Error : cannot allocate vector of size 39 Kb"
[1] "error occurred during calling the sampler; sampling not done"
[1] "Error : cannot allocate vector of size 39 Kb"
[1] "error occurred during calling the sampler; sampling not done"
[1] "Error : cannot allocate vector of size 39 Kb"
[1] "error occurred during calling the sampler; sampling not done"
[1] "Error : cannot allocate vector of size 39 Kb"
[1] "error occurred during calling the sampler; sampling not done"
[1] "Error : cannot allocate vector of size 39 Kb"
[1] "error occurred during calling the sampler; sampling not done"
[1] "Error in sampler$call_sampler(args_list[[i]]) : std::bad_alloc"
[1] "error occurred during calling the sampler; sampling not done"
[1] "Error : cannot allocate vector of size 39 Kb"
[1] "error occurred during calling the sampler; sampling not done"
Chain 4:
Chain 4: Gradient evaluation took 0.401 seconds
Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 4010 seconds.
Chain 4: Adjust your expectations accordingly!
Chain 4:
Chain 4:
Chain 4: Iteration: 1 / 5000 [ 0%] (Warmup)
Does anyone have tips for running this kind of "big data" analysis with many observations and random effects? For example, would it help to use fewer cores? Are there any tricks for using brm_multiple() to combine fits from subsets of the data?
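To make that last question concrete, here is the kind of thing I had in mind (untested, and I realize brm_multiple() was designed for multiply-imputed datasets, so I am unsure whether pooling fits from disjoint subsets is statistically valid at all):

```r
library(brms)

# Untested sketch: split the data into 4 chunks by country, so each
# country's observations stay together, then fit the same model to each
# chunk and pool the draws. Whether pooled subset fits approximate the
# full-data posterior is exactly what I am unsure about.
country_group <- cut(as.integer(factor(dat$country)), breaks = 4)
subsets <- split(dat, country_group)

fit <- brm_multiple(
  formula = bf(
    y ~ 1 + x * w + (1 + x | person/country),
    zi ~ 1 + x * w + (1 + x | person/country)
  ),
  family = zero_inflated_beta(),
  data = subsets,   # brm_multiple() accepts a list of data frames
  chains = 2,
  cores = 2
)
```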