Brms seems to omit option "cores" while fitting model

Monika_Derda · April 29, 2023, 7:58pm

Hello, I am relatively new to Bayesian modeling, and recently encountered the following problem:

I wanted to see if I could speed up my computations, so I set the number of cores to 8 in the brm function. However, it seems to have no effect: my CPU usage is around 24% (see the picture).

And this is what the sampler reports:

Chain 1: Gradient evaluation took 0.013708 seconds
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 137.08 seconds.

Moreover, when I run the same script on our lab server (which in theory has 10x more cores than my laptop) with cores set to 80, the model takes more or less the same amount of time to complete as on my laptop.
It does not matter whether I run my script from RStudio or from R directly.

My question is: is this expected? Is there a way to speed up the computations (for example, by also using the graphics card)? What should I add to my code?

Or maybe with my relatively small data set I will not see a difference, and this is the best I can get?

set.seed(872436)
library(tidyverse)

fake_data <- data.frame(
  id = rep(1:40, times = 520)
) %>% 
  group_by(id) %>% 
  mutate(trial=c(1:520), 
         task = case_when(
           trial %in% c(1:260) & id %in% c(1:20) ~ "low", 
           trial %in% c(261:520) & id %in% c(1:20) ~ "high",
           trial %in% c(261:520) & id %in% c(21:40) ~ "low", 
           trial %in% c(1:260) & id %in% c(21:40) ~ "high",
         ), 
         rating = rep(sample(c(1:4)), times=130)) %>% 
  group_by(id, task) %>% 
  mutate(pre_rating = lag(rating)) %>%
  filter(!is.na(pre_rating)) %>% 
  mutate(pre_rating=factor(pre_rating, labels=
                             c("a", "b", "c", "d")))
  

# Bayesian Estimation

library(sm)
library(brms)
library(loo)
library(beepr)
library(parallel)

options(mc.cores = parallel::detectCores())

# Model example
m <- brm(formula = bf(rating ~ pre_rating * task + (1|id)),
              data = fake_data,
              family = cumulative(link = "probit", threshold = "flexible"),
              sample_prior = TRUE,
              chains = 2,
              iter = 10000,
              cores = 8,
              warmup = 5000, 
              file = "test")

Laptop: Lenovo Yoga 720-15IKB, 16 GB RAM
CPU: Intel(R) Core™ i7-7700HQ CPU @ 2.80GHz 2.80 GHz
Operating System: Windows 10 Home x64
R.version:
_
platform x86_64-w64-mingw32
arch x86_64
os mingw32
crt ucrt
system x86_64, mingw32
status
major 4
minor 3.0
year 2023
month 04
day 21
svn rev 84292
language R
version.string R version 4.3.0 (2023-04-21 ucrt)
nickname Already Tomorrow
RStudio Version:
$mode
[1] "desktop"
$version
[1] '2023.3.0.386'
$long_version
[1] "2023.03.0+386"
$release_name
[1] "Cherry Blossom"
brms Version: brms_2.19.0

sessionInfo()
R version 4.3.0 (2023-04-21 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.utf8
[2] LC_CTYPE=English_United Kingdom.utf8
[3] LC_MONETARY=English_United Kingdom.utf8
[4] LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.utf8

time zone: Europe/Warsaw
tzcode source: internal

attached base packages:
[1] parallel stats graphics grDevices utils datasets
[7] methods base

other attached packages:
[1] beepr_1.3 loo_2.6.0 brms_2.19.0 Rcpp_1.0.10
[5] sm_2.2-5.7.1 lubridate_1.9.2 forcats_1.0.0 stringr_1.5.0
[9] dplyr_1.1.2 purrr_1.0.1 readr_2.1.4 tidyr_1.3.0
[13] tibble_3.2.1 ggplot2_3.4.2 tidyverse_2.0.0

loaded via a namespace (and not attached):
[1] gridExtra_2.3 inline_0.3.19 sandwich_3.0-2
[4] rlang_1.1.0 magrittr_2.0.3 multcomp_1.4-23
[7] matrixStats_0.63.0 compiler_4.3.0 mgcv_1.8-42
[10] callr_3.7.3 vctrs_0.6.2 reshape2_1.4.4
[13] pkgconfig_2.0.3 crayon_1.5.2 fastmap_1.1.1
[16] backports_1.4.1 ellipsis_0.3.2 utf8_1.2.3
[19] threejs_0.3.3 promises_1.2.0.1 markdown_1.6
[22] tzdb_0.3.0 nloptr_2.0.3 ps_1.7.5
[25] jsonlite_1.8.4 later_1.3.0 prettyunits_1.1.1
[28] R6_2.5.1 dygraphs_1.1.1.6 stringi_1.7.12
[31] StanHeaders_2.26.22 boot_1.3-28.1 estimability_1.4.1
[34] rstan_2.26.22 audio_0.1-10 zoo_1.8-12
[37] base64enc_0.1-3 bayesplot_1.10.0 httpuv_1.6.9
[40] Matrix_1.5-4 splines_4.3.0 igraph_1.4.2
[43] timechange_0.2.0 tidyselect_1.2.0 rstudioapi_0.14
[46] abind_1.4-5 codetools_0.2-19 miniUI_0.1.1.1
[49] curl_5.0.0 processx_3.8.1 pkgbuild_1.4.0
[52] lattice_0.21-8 plyr_1.8.8 shiny_1.7.4
[55] withr_2.5.0 bridgesampling_1.1-2 posterior_1.4.1
[58] coda_0.19-4 survival_3.5-5 RcppParallel_5.1.7
[61] xts_0.13.1 pillar_1.9.0 tensorA_0.36.2
[64] checkmate_2.1.0 DT_0.27 stats4_4.3.0
[67] shinyjs_2.1.0 distributional_0.3.2 generics_0.1.3
[70] hms_1.1.3 rstantools_2.3.1 munsell_0.5.0
[73] scales_1.2.1 minqa_1.2.5 gtools_3.9.4
[76] xtable_1.8-4 gamm4_0.2-6 glue_1.6.2
[79] emmeans_1.8.5 projpred_2.5.0 tools_4.3.0
[82] shinystan_2.6.0 lme4_1.1-32 colourpicker_1.2.0
[85] mvtnorm_1.1-3 grid_4.3.0 crosstalk_1.2.0
[88] colorspace_2.1-0 nlme_3.1-162 cli_3.6.1
[91] fansi_1.0.4 Brobdingnag_1.2-9 V8_4.3.0
[94] gtable_0.3.3 digest_0.6.31 TH.data_1.1-2
[97] htmlwidgets_1.6.2 farver_2.1.1 htmltools_0.5.5
[100] lifecycle_1.0.3 mime_0.12 shinythemes_1.2.0
[103] MASS_7.3-58.4

jsocolar · April 29, 2023, 9:36pm

By default, brms uses at most one core per chain, so with chains = 2 your model runs on two cores no matter how high you set cores. That is consistent with the roughly 24% CPU usage you see on a laptop with 8 logical cores, and with the server not being any faster. Parallelizing across chains like this is embarrassingly parallel and will essentially always yield a good speedup as long as you have enough memory.

brms can also parallelize within chains, but for that you need to pass the threads argument to brms::brm. The speedups here are more variable, and sometimes this doesn’t help at all. For more, see the brms vignette “Running brms models with within-chain parallelization”.
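To make this concrete, here is a minimal sketch of what a within-chain parallelized call might look like for the model from the question. It assumes the cmdstanr backend and a working CmdStan installation, which the original post does not use, and `run_fit` is just a hypothetical switch so the script can be sourced without starting a long fit. I have not benchmarked this model, so treat the thread count as a starting point rather than a recommendation.

```r
library(parallel)

# Logical cores reported by the machine (8 on the laptop described above).
n_logical <- parallel::detectCores()
if (is.na(n_logical)) n_logical <- 1  # detectCores() can return NA

chains <- 2
# Split the available cores evenly over the chains, at least 1 thread each.
threads_per_chain <- max(1, n_logical %/% chains)

# Hypothetical switch: set to TRUE to actually fit the model.
run_fit <- FALSE

if (run_fit) {
  library(brms)
  # Note: no `file` argument here. If a saved fit already exists at that
  # path, brm() loads it instead of refitting, which hides any speed change.
  m_threads <- brm(
    formula = bf(rating ~ pre_rating * task + (1 | id)),
    data    = fake_data,                     # data frame from the question
    family  = cumulative(link = "probit", threshold = "flexible"),
    chains  = chains,
    cores   = chains,                        # one process per chain
    threads = threading(threads_per_chain),  # within-chain parallelization
    backend = "cmdstanr",                    # assumption: cmdstanr + CmdStan installed
    iter    = 10000,
    warmup  = 5000
  )
}
```

With 2 chains on an 8-thread laptop this asks for 4 threads per chain. Threads beyond the number of physical cores may not help, so it is worth timing a short run with a few different values. The simpler alternative is to run more chains (for example chains = 4, cores = 4) with fewer post-warmup iterations per chain, which gives the same total number of draws with more of the work done in parallel.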
