I can’t confirm that it hasn’t happened on a Linux + R virtual machine: all I know is that sometimes when I log back into Google’s Cloud engine and screen -r back to my R session, R has terminated and the remaining code to be run was entered into the Linux terminal.
As for Windows + R versus RStudio, I don’t think the same issue happens in plain R. I’ve been running some models for close to a day now without R shutting down; however, I’m still running into R not doing what I want it to do. I think it has something to do with the way that rstan is setting up parallel cores.
Current Attempts & Results
I’m fitting multiple 2PL IRT models using brms, and I’m using loo_compare() to compare the various model specifications. As a result, after each model is compiled and sampled, I call add_criterion(..., criterion = "loo"). Since I just want to get all the models run right now (and I’ve already fit all the models on a different dataset), I’m following up each add_criterion() with the next model fit, so the script looks something like this:
fit1 <- brm(...)
fit1 <- add_criterion(fit1, criterion = "loo")
fit2 <- brm(...)
fit2 <- add_criterion(fit2, criterion = "loo")
and so on for 18 different models. After the first model was fit, the call to add the LOOIC produced the following error output:
Error in serialize(data, node$con) : error writing to connection
Error in serialize(data, node$con) : error writing to connection
The next model then compiled and sampled without any issues, but when it finished, I got 10 of the following warnings:
In for (i in 1:codeCount) { :
closing unused connection # (<-LAPTOP-...:11781)
Then the call to add_criterion(...) resulted in the same serialize error. The remaining models have all compiled and sampled (so far) without additional errors or warnings, but every add_criterion(...) call fails with the same serialization error, printed twice.
Past Attempts & Results
When using RStudio for the initial trials of these models, I experienced the same seemingly random crashes. My experience is that the crashes have nothing to do with the complexity of the models but more with the number of models or calls to post-processing functions that are informed by options(mc.cores = parallel::detectCores()).
I suspected that this might have something to do with memory demands and started using gc() more often to help things, but I found that gc() would very frequently result in an RStudio crash or, less frequently, print the closing unused connection # warnings. I have no idea why this works, but I started doing the following when I had to run multiple models in RStudio:
library(brms)
options(mc.cores = 4) # specifically avoiding the call to parallel::detectCores()
fit1 <- brm(...)
options(mc.cores = 1)
save(fit1, file = "...file path...") # save() needs the object name and a file = argument
gc()
options(mc.cores = 1)
fit1 <- add_criterion(fit1, criterion = "loo")
save(fit1, file = "...file path...")
gc()
options(mc.cores = 4)
fit2 <- brm(...)
and so on. This seemed to avoid the random crashes, but when I started checking the models with pp_check(), the crashes resumed. Again, the crashes seem more linked to the number of times that certain functions are called than to the complexity of the models. This is well outside my expertise, but it seems like the issue is related to both parallel operations and memory rather than either one independently. I’m not sure why using save(...) seems to avoid the gc() crashes, but it did in my experience, and it typically resulted in the “closing unused connections” warning instead. Similarly, I found that I had to keep manually changing the mc.cores option and that, anecdotally at least, specifying a fixed number of cores rather than relying on parallel::detectCores() extended the time an RStudio session would keep working before a crash.
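The manual switching of mc.cores shown in the script above could be wrapped in a small helper so that the previous setting is always restored. This is only a minimal sketch: with_mc_cores is a hypothetical name (not part of brms or rstan), and it assumes that the functions being called read the mc.cores option at call time, as brm() and loo::loo() do through the defaults of their cores arguments. Whether it changes the crash behaviour is untested.

```r
# Hypothetical helper (not part of brms/rstan): evaluate `expr` with a
# temporary mc.cores setting and restore the previous value afterwards,
# even if `expr` throws an error.
with_mc_cores <- function(n, expr) {
  old <- options(mc.cores = n)       # options() returns the previous values
  on.exit(options(old), add = TRUE)  # restore when the function exits
  force(expr)                        # `expr` is lazy, so it is evaluated only now
}

# Intended use with the fits from this post (not run here):
# fit1 <- with_mc_cores(4, brm(...))
# fit1 <- with_mc_cores(1, add_criterion(fit1, criterion = "loo"))
```

Running the post-processing step with a single core keeps it in the main R process, which is the same thing the manual options(mc.cores = 1) line is trying to achieve.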
Also, for whatever it’s worth, when working with the previous versions of R and the brms versions for that R, I would only get RStudio crashes when post-processing models; however, since updating I’ve noticed that I occasionally get RStudio crashes between when rstan finishes sampling and when the fitted brms object becomes available in the R environment. When I use the file = "...file path..." argument within the brm call, it seems that those crashes occur before the object is saved as well.
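Since a crash loses everything in the session, one way to make a long script restartable is to write each result to disk as soon as it exists and reuse it on the next run. The file argument of brm() already does this for the fitted model (an existing file is loaded instead of refitting). The sketch below applies the same idea to any step using only base R. cached is a hypothetical name, the paths are placeholders, and it does not help with a crash that happens before the object is returned, as described above.

```r
# Hypothetical helper: return the object stored at `path` if the file exists;
# otherwise evaluate `expr`, write the result with saveRDS(), and return it.
cached <- function(path, expr) {
  if (file.exists(path)) {
    return(readRDS(path))
  }
  result <- expr          # lazy evaluation: `expr` only runs on a cache miss
  saveRDS(result, path)
  result
}

# Intended use (not run here; each step would need its own file):
# fit1 <- cached("...file path...", brm(...))
# fit1 <- cached("...file path...", add_criterion(fit1, criterion = "loo"))
```

After a crash, rerunning the same script skips every step whose file is already on disk and resumes at the first missing one.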