Hi,
Thank you for reading my first post! Please let me know if you have suggestions for writing a better question.
Summary
I am running a long single-chain brm() model on a newly purchased PC (Windows 11). The plan is to later combine 4 chains into one model. The model compiles and runs fine with few (240) iterations in my tests, but when I run it at full length (~50k iterations with a thinning interval of 100), the session crashes without any R messages. The exception code “0xc0000005” points to a memory access violation, and the faulting module path seems to point to the .dll file in the temporary directory that contains the compiled Stan model.
I am not sure how to troubleshoot next, and I wonder if this is a hardware issue. The tower I purchased can be returned for free by Oct 27 (in a week). I wonder if exchanging the tower would solve the problem.
Thank you all in advance!
The error log on Event Viewer:
Faulting application name: Rscript.exe, version: 4.43.22307.0, time stamp: 0x67c18dc3
Faulting module name: file7d9c48935de5.dll, version: 0.0.0.0, time stamp: 0x68eb1d49
Exception code: 0xc0000005
Fault offset: 0x00000000000a667d
Faulting process id: 0x7D9C
Faulting application start time: 0x1DC3B2659201BEB
Faulting application path: C:\PROGRA~1\R\R-44~1.3\bin\x64\Rscript.exe
Faulting module path: C:\Temp\test\RtmpIJ61uK\file7d9c48935de5.dll
Report Id: 604c4c83-9879-4ce0-a77e-97f4b2eb79c4
Faulting package full name:
Faulting package-relative application ID:
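Two fields in this log can be cross-checked with a few lines of base R. The sketch below assumes that R names its temporary files as a prefix, then the process ID in hexadecimal, then a random suffix; under that assumption the faulting module name should begin with the faulting process ID. All values are copied from the log above.

```r
# Values copied from the Event Viewer log above
faulting_pid_hex <- "7D9C"
faulting_module  <- "file7d9c48935de5.dll"
exception_code   <- "0xc0000005"

# Convert the hexadecimal process ID to decimal
pid_decimal <- strtoi(faulting_pid_hex, base = 16L)

# Assumption: R temp files are named <prefix><pid in hex><random hex>
expected_prefix <- paste0("file", tolower(faulting_pid_hex))
module_from_same_process <- startsWith(tolower(faulting_module), expected_prefix)

# 0xC0000005 is the Windows NTSTATUS code for an access violation
is_access_violation <- identical(tolower(exception_code), "0xc0000005")

print(pid_decimal)
print(module_from_same_process)
print(is_access_violation)
```

If the prefix matches, the .dll was most likely written to the temporary directory by the same Rscript process that crashed, which is consistent with the faulting module being the compiled Stan model rather than a leftover file from an earlier session.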
Model:
I have ~41k rows of data and 37k sites with random intercepts. I know it is a complex model and that it runs for more iterations than usual. See “Background” for more explanation.
brms_mod2 <- brm(
  y ~
    t2(time, var1, by = category, k = 5, m = 1) +
    t2(time, var2, by = category, k = 5, m = 1) +
    category +
    s(time, by = var3, k = 5) +
    s(time, by = var4) +
    s(time, by = var5) +
    (1 | siteID),
  data = df_std,
  family = gaussian(),
  prior = c(
    prior(normal(0, 5), class = "b")
  ),
  chains = 1,
  threads = threading(4),
  seed = 20015,
  iter = 51000,
  warmup = 1000, # default warmup = floor(iter/2)
  thin = 100
)
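For reference, the number of draws these settings keep can be worked out with simple arithmetic. This is a rough sketch: it assumes the retained post-warmup draws are approximately (iter - warmup) / thin, and the exact count reported by the sampler may differ slightly. The variable names are illustrative.

```r
# Sampler settings copied from the brm() call above
iter   <- 51000
warmup <- 1000
thin   <- 100
n_chains_planned <- 4  # the plan is to combine 4 single-chain fits

# Approximate number of post-warmup draws kept per chain
draws_per_chain <- (iter - warmup) / thin

# Approximate total once all planned chains are combined
draws_total <- draws_per_chain * n_chains_planned

print(draws_per_chain)
print(draws_total)
```

So each chain has to survive 51,000 sampler iterations in order to deliver about 500 stored draws.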
Hardware:
- Processor Intel(R) Core(TM) Ultra 9 285 (2.50 GHz)
- Installed RAM 96.0 GB (95.5 GB usable)
- System type 64-bit operating system, x64-based processor
I am completely new to Windows and I don’t know if this provides new information on “boost” related to this post:
The PC I purchased is listed as “Dell XPS 8960 Desktop 4TB SSD 96GB DDR5 RAM Win 11 Pro (Intel 14th Generation Core i9-14900K Processor with Turbo Boost to 6.00GHz, 96 GB RAM, 4 TB SSD) Business PC Computer XPS8960, Graphite Black”
Session Info:
R version 4.4.3 (2025-02-28 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 26100)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.utf8 LC_CTYPE=English_United States.utf8
[3] LC_MONETARY=English_United States.utf8 LC_NUMERIC=C
[5] LC_TIME=English_United States.utf8
time zone: America/New_York
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] rstan_2.32.7 StanHeaders_2.32.10 brms_2.23.0 Rcpp_1.1.0
[5] tictoc_1.2.1 dplyr_1.1.4
loaded via a namespace (and not attached):
[1] Matrix_1.7-2 bayesplot_1.14.0 gtable_0.3.6 compiler_4.4.3
[5] tidyselect_1.2.1 stringr_1.5.2 parallel_4.4.3 gridExtra_2.3
[9] scales_1.4.0 lattice_0.22-6 coda_0.19-4.1 ggplot2_4.0.0
[13] R6_2.6.1 Brobdingnag_1.2-9 generics_0.1.4 distributional_0.5.0
[17] backports_1.5.0 checkmate_2.3.3 tibble_3.3.0 pillar_1.11.1
[21] RColorBrewer_1.1-3 posterior_1.6.1 rlang_1.1.6 inline_0.3.21
[25] stringi_1.8.7 S7_0.2.0 RcppParallel_5.1.11-1 cli_3.6.5
[29] magrittr_2.0.4 rstantools_2.5.0 grid_4.4.3 rstudioapi_0.17.1
[33] mvtnorm_1.3-3 lifecycle_1.0.4 nlme_3.1-167 vctrs_0.6.5
[37] tensorA_0.36.2.1 glue_1.8.0 QuickJSR_1.8.1 farver_2.1.2
[41] codetools_0.2-20 bridgesampling_1.1-2 pkgbuild_1.4.8 stats4_4.4.3
[45] abind_1.4-8 matrixStats_1.5.0 tools_4.4.3 loo_2.8.0
[49] pkgconfig_2.0.3
Background:
I originally wanted to run the multi-chain model as one job, but I ran into the error:
Error in serialize(data, node$con, xdr = FALSE) :
error writing to connection
I couldn’t solve the error after trying many suggestions online (checking Makevars files, reinstalling packages, reducing to 1 thread per chain), but I noticed that my tests (fewer iterations) of single-chain models ran without this error. So I proceeded to run single-chain models, with the plan to combine them later.
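The combining step could look roughly like the sketch below. It assumes each single-chain fit was saved with saveRDS() and was fitted with the same formula and data but a different seed. The helper names, the "results" directory, and the file names are hypothetical; brms::combine_models() is the brms function for merging fits.

```r
# Hypothetical helper: one result file per chain (names are illustrative)
chain_files <- function(dir, n_chains) {
  file.path(dir, sprintf("brms_mod2_chain%d.rds", seq_len(n_chains)))
}

# Read the saved single-chain fits and merge them into one brmsfit.
# Assumes each file holds a brmsfit saved with saveRDS(), fitted with the
# same formula and data but a different seed.
combine_saved_chains <- function(files) {
  if (!requireNamespace("brms", quietly = TRUE)) {
    stop("Package 'brms' is required to combine the fits.")
  }
  fits <- lapply(files, readRDS)
  brms::combine_models(mlist = fits, check_data = TRUE)
}

files <- chain_files("results", 4)
print(files)
# combined <- combine_saved_chains(files)  # run once all four files exist
```

Saving each chain to its own file also means that a crash in one run does not discard chains that already finished.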
My goal is to parallelize the job so I can run it efficiently, but now the long model crashes even with very few cores. I tried using chkpt_brms(), but I ran into compilation issues that were hard to debug in my experience (no offense to the developers). In my short tests, the speed ranking was 20 threads > 16 threads > 8 threads > 4 threads.
I read that brms models usually don’t need more than a few thousand iterations. My model uses more iterations because the data I deal with is very noisy and the model structure is complex. With fewer iterations the model does not converge sufficiently.
Things I’ve tried
- I tried to run the model in a separate temporary path using the following commands in PowerShell, but the error persists.
$env:TEMP="C:\Temp\test"
$env:TMP="C:\Temp\test"
& "C:\Program Files\R\R-4.4.3\bin\Rscript.exe" "C:\Users\myname\Downloads\project\brms1_mod.R" *> "C:\Users\myname\Downloads\project\brms1_mod.log"
- I tried turning the PC off and on. The first job I ran afterwards had a slightly different memory violation error pointing to the tbb.dll file, but file.info() in RStudio seems to show that this file is not corrupted (or I could be wrong here). The subsequent jobs show the error pointing to the temp .dll file that seems to contain the compiled Stan model.
file.info(system.file("lib/x64/tbb.dll", package = "RcppParallel"))
size isdir mode
C:/Users/myname/AppData/Local/R/win-library/4.4/RcppParallel/lib/x64/tbb.dll 1035562 FALSE 666
mtime
C:/Users/myname/AppData/Local/R/win-library/4.4/RcppParallel/lib/x64/tbb.dll 2025-10-01 17:24:45
ctime
C:/Users/myname/AppData/Local/R/win-library/4.4/RcppParallel/lib/x64/tbb.dll 2025-10-01 17:24:45
atime exe
C:/Users/myname/AppData/Local/R/win-library/4.4/RcppParallel/lib/x64/tbb.dll 2025-10-11 22:30:47 no
- Since the error points to the .dll file that contains the Stan model, I tried precompiling and saving the Stan model, then loading it in my R scripts and calling rstan::sampling() directly. The same error persists.
- I tried reducing the number of threads from 20 to 4, but the same error persists.
- I used to run similar-length models (both single and multi-chain) on a cluster and never encountered this problem.
- After browsing the forum, I experimented with package versions and reinstalled brms and its dependencies with the code below. Since the short test runs fine, I think the package versions are consistent. You’ll notice that I am not using the newest R version. I tried the newest version of R first and it didn’t work, so I reverted to the R version I had used successfully on the cluster (by now I am convinced that package versions and dependencies are not the problem, but I could be wrong).
install.packages(c("BH", "StanHeaders", "Rcpp", "RcppEigen", "RcppParallel", "inline", "loo", "pkgbuild", "rstan"))
In a similar attempt, I tried reinstalling rstan from the updated repo, following the post “R brms compilation error - DLL initialization”. The error persists.
remove.packages(c("StanHeaders", "rstan"))
install.packages("rstan", repos = c("https://stan-dev.r-universe.dev", "https://cloud.r-project.org"))
- I am not very experienced with Windows systems. I tried running the multi-chain model in a Linux (Ubuntu) VM on this machine, but it still crashed, with a different error message (below). I haven’t tested long single-chain models on the VM yet (jobs on the VM run slower than natively in Windows, so I went with trying single-chain models on Windows). I guess this error stems from the same issue I am posting about.
Error in x@mode :
no applicable method for `@` applied to an object of class "NULL"
In addition: Warning message:
In mccollect(jobs) : 3 parallel jobs did not deliver results
- I took this PC to my institute’s IT. They reinstalled the operating system and said “the operating system is working fine after reinstallation”. After that, the model still crashes for the same reason.
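On the tbb.dll check in the list above: file.info() reports only metadata such as size and timestamps, so it cannot confirm that the file contents are intact. One possible check is to compare a checksum of the installed file against the same file from a fresh install of the same RcppParallel version. The sketch below uses tools::md5sum(); the helper name and the reference path in the comments are hypothetical, and the demonstration runs on temporary files so it is self-contained.

```r
# Compare two files by MD5 checksum; returns TRUE only if both exist and match
same_checksum <- function(path_a, path_b) {
  paths <- c(path_a, path_b)
  if (!all(file.exists(paths))) {
    return(FALSE)
  }
  sums <- unname(tools::md5sum(paths))
  identical(sums[1], sums[2])
}

# Self-contained demonstration with temporary files
f1 <- tempfile()
f2 <- tempfile()
f3 <- tempfile()
writeLines("same content", f1)
writeLines("same content", f2)
writeLines("different content", f3)
print(same_checksum(f1, f2))
print(same_checksum(f1, f3))

# Intended use (the reference path is hypothetical):
# installed <- system.file("lib/x64/tbb.dll", package = "RcppParallel")
# reference <- "C:/Temp/reference_lib/RcppParallel/lib/x64/tbb.dll"
# same_checksum(installed, reference)
```

A matching checksum would only rule out a damaged file on disk; it says nothing about faults that occur in memory while the model is running.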
Thank you all very much!