RStudio crashes just before returning a brms model

Hello, I am trying to run a brms model with 20,000 iterations. After many hours of sampling, just before finishing, RStudio crashes without any warning. The model fits successfully with 1000 iterations and, sometimes, with 10,000 iterations, but several attempts to run it with 20,000 iterations or more were unsuccessful. I was able to see the message that all 4 chains had finished successfully, along with the total execution time. I also saw that, after the chains had finished, RAM usage (8 GB total) gradually climbed from under 50% to over 90%. Any ideas about what causes the crash and how it could be prevented?

The model is the following:

bestMod_raw_dir <- bf(acc_recall ~ icpt + inter + rest,
                      icpt ~ 1, 
                      inter ~ 0 + lopNum : disruptNum, 
                      rest ~ 0 + lopNum + disruptNum + 
                          (1 | target) +
                          (1 + lopNum * disruptNum || ppt),
                      nl = TRUE, cmc = FALSE,
                      family = "bernoulli")

prior_neg <- c(
           set_prior("cauchy(0, 10)", class = "b", nlpar = "icpt"),
           set_prior("cauchy(0, 2.5)", class = "b", ub = 0, nlpar = "inter"),
           set_prior("cauchy(0, 2.5)", class = "b", nlpar = "rest"))

brms_neg <- brm(bestMod_raw_dir,
                prior = prior_neg,
                sample_prior = "yes",
                iter = 2e4,
                warmup = 3000,
                cores = 4,
                data = lopDat_F2SD,
                save_pars = save_pars(all = TRUE),
                backend = "cmdstanr",
                file = "../data/BF/bf_2SD_neg_2e4iter",
                control = list(adapt_delta = 0.99))

`ppt` has 253 levels and `target` has 60 levels.
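One workaround you could try, if the post-sampling memory spike is the culprit: fit one chain at a time (so only one chain's draws are read into R at once) and then pool the single-chain fits with brms' `combine_models()`. This is only a sketch built from the call above; the per-chain file names and seeds are illustrative, and the final combined object will of course still need to fit in RAM.

```r
library(brms)

# Sketch, untested here: four single-chain fits instead of one cores = 4 fit.
# Each fit reads back only one chain's CSV output, lowering the peak RAM used
# right after sampling. File paths and seeds below are placeholders.
fits <- lapply(1:4, function(chain_id) {
  brm(bestMod_raw_dir,
      prior = prior_neg,
      sample_prior = "yes",
      iter = 2e4,
      warmup = 3000,
      chains = 1, cores = 1,
      seed = 1000 + chain_id,  # distinct seed per chain
      data = lopDat_F2SD,
      save_pars = save_pars(all = TRUE),
      backend = "cmdstanr",
      file = paste0("../data/BF/bf_2SD_neg_2e4iter_chain", chain_id),
      control = list(adapt_delta = 0.99))
})

# Pool the chains into a single brmsfit for downstream use.
brms_neg <- combine_models(mlist = fits, check_data = TRUE)
```

Because each fit is also cached to its own `file`, a crash partway through only loses the chain in progress, and you can restart R between chains if memory is tight.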


My original goal is to compare two models using bayes_factor(). When I ran the models with 10,000 iterations (successfully, after one or two failed attempts), bayes_factor() threw a warning:

Warning: logml could not be estimated within maxiter, rerunning with adjusted starting value. 
Estimate might be more variable than usual.

I increased the number of iterations, following a recommendation from this GitHub thread.
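Besides raising `iter`, you can gauge how variable the Bayes factor estimate actually is by running the bridge sampler several times per model. A sketch, assuming two cached fits `m1` and `m2` (placeholder names) and that your bridgesampling version supports the `repetitions` argument, which brms forwards to `bridgesampling::bridge_sampler()`:

```r
library(brms)
library(bridgesampling)

# Sketch: repeat the marginal-likelihood estimation to see its spread.
# m1 and m2 are placeholders for the two fitted brmsfit objects being compared.
bs1 <- bridge_sampler(m1, repetitions = 10)
bs2 <- bridge_sampler(m2, repetitions = 10)

# Bayes factor from the repeated bridge-sampling runs; a wide spread across
# repetitions means the BF estimate itself is unstable, regardless of iter.
bf(bs1, bs2)
```

If the spread across repetitions stays large even with more iterations, the instability lies in the bridge-sampling estimate rather than in the posterior sample size alone.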

> sessionInfo()
R version 4.3.2 (2023-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.3 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0 
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=bg_BG.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=bg_BG.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=bg_BG.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=bg_BG.UTF-8 LC_IDENTIFICATION=C       

time zone: Europe/Sofia
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] rstan_2.32.3        StanHeaders_2.26.28

loaded via a namespace (and not attached):
  [1] mnormt_2.1.1         gridExtra_2.3        inline_0.3.19        rlang_1.1.2         
  [5] magrittr_2.0.3       matrixStats_1.1.0    compiler_4.3.2       loo_2.6.0           
  [9] callr_3.7.3          vctrs_0.6.4          reshape2_1.4.4       stringr_1.5.0       
 [13] pkgconfig_2.0.3      crayon_1.5.2         fastmap_1.1.1        backports_1.4.1     
 [17] ellipsis_0.3.2       utf8_1.2.4           threejs_0.3.3        cmdstanr_0.5.3      
 [21] promises_1.2.1       markdown_1.11        effsize_0.8.1        nloptr_2.0.3        
 [25] ps_1.7.5             xfun_0.41            jsonlite_1.8.7       later_1.3.1         
 [29] afex_1.3-0           psych_2.3.9          parallel_4.3.2       prettyunits_1.2.0   
 [33] R6_2.5.1             dygraphs_1.1.1.6     stringi_1.7.12       car_3.1-2           
 [37] boot_1.3-28.1        numDeriv_2016.8-1.1  estimability_1.4.1   Rcpp_1.0.11         
 [41] knitr_1.45           audio_0.1-11         zoo_1.8-12           base64enc_0.1-3     
 [45] bayesplot_1.10.0     httpuv_1.6.12        Matrix_1.6-2         splines_4.3.2       
 [49] igraph_1.5.1         tidyselect_1.2.0     rstudioapi_0.15.0    abind_1.4-5         
 [53] codetools_0.2-19     miniUI_0.1.1.1       curl_5.1.0           processx_3.8.2      
 [57] pkgbuild_1.4.2       lmerTest_3.1-3       lattice_0.22-5       tibble_3.2.1        
 [61] plyr_1.8.9           shiny_1.7.5.1        bridgesampling_1.1-2 posterior_1.5.0     
 [65] coda_0.19-4          RcppParallel_5.1.7   xts_0.13.1           pillar_1.9.0        
 [69] carData_3.0-5        tensorA_0.36.2       checkmate_2.3.0      DT_0.30             
 [73] stats4_4.3.2         shinyjs_2.1.0        distributional_0.3.2 generics_0.1.3      
 [77] ggplot2_3.4.4        rstantools_2.3.1.1   munsell_0.5.0        scales_1.2.1        
 [81] minqa_1.2.6          gtools_3.9.4         xtable_1.8-4         glue_1.6.2          
 [85] emmeans_1.8.9        tools_4.3.2          shinystan_2.6.0      beepr_1.3           
 [89] lme4_1.1-35.1        colourpicker_1.3.0   mvtnorm_1.2-3        grid_4.3.2          
 [93] QuickJSR_1.0.7       crosstalk_1.2.0      colorspace_2.1-0     nlme_3.1-163        
 [97] cli_3.6.1            fansi_1.0.5          Brobdingnag_1.2-9    dplyr_1.1.3         
[101] V8_4.4.0             gtable_0.3.4         digest_0.6.33        brms_2.20.4         
[105] htmlwidgets_1.6.2    farver_2.1.1         htmltools_0.5.7      lifecycle_1.0.4     
[109] mime_0.12            shinythemes_1.2.0    MASS_7.3-60 

I’m no expert, but my guess is that you are indeed running out of memory. I, too, use systems with only 8 GB of RAM, and they often crash at the end of the sampling process when I run lots of iterations. In my case it is clearly because reading in and saving a very large model object exceeds the memory limit.
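One quick way to check this hypothesis is to measure how large a fit that did succeed actually is once loaded. A sketch, assuming the `.rds` file written by `brm(file = ...)` for a smaller run (the path below mirrors the one in the original call):

```r
# Sketch: load a cached (smaller) fit and inspect its in-memory size.
# brm(file = "...") saves the fit as <file>.rds next to the given path.
fit <- readRDS("../data/BF/bf_2SD_neg_2e4iter.rds")
format(object.size(fit), units = "Gb")
```

Scaling that size up by the ratio of iterations gives a rough estimate of whether the 20,000-iteration object (plus the temporary copies made while reading the chains back in) would exceed 8 GB.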


Thank you for sharing your experience, @blokeman! Yes, this is most probably the reason for my crashes as well: since my last post, I have been able to run models with 15,000 iterations successfully, but never with 20,000, so I guess the memory limit lies somewhere in between. I have also realized that more iterations won’t solve my original problem of the Bayes factor’s unusually high variability, as suggested in this discussion.
