I recently posted about a problem using combine_models with moment_match when running chains separately, combining them using combine_models, and running loo with moment_match. However, after the fix from that issue, I'm running into a different problem when the chains are run on a separate computer, even with the recompile = TRUE option.
Based on the error message, it seems loo is grabbing the file path to the original model fit, which points to the location of the file on a different computer, resulting in a crash. The example code I provide below uses cmdstanr as a backend, but the issue occurs regardless of whether the cmdstanr backend is used. Since I understand it is a pain in the neck to find another computer to run the files, I've temporarily provided fits here. The folder contains four model fits, each representing one chain.
# each model is fit in parallel, with which_chain being incremented for each chain fit
brm_fit <- brm(
count ~ zAge + zBase * Trt + (1|patient),
data = epilepsy,
family = poisson(),
chains = 1,
cores = 1,
seed = 022624 + which_chain,
save_pars = save_pars(all=TRUE),
backend = "cmdstanr",
threads = threading(4),
file = file.path(savdir, paste0("brm_fit_c_", which_chain))
)
Then the models are combined and loo is run with moment_match.
library(brms)
library(loo)
brm_c1 <- readRDS("/brm_fit_c_1.rds")
brm_c2 <- readRDS("/brm_fit_c_2.rds")
brm_c3 <- readRDS("/brm_fit_c_3.rds")
brm_c4 <- readRDS("/brm_fit_c_4.rds")
brm_fit <- combine_models(
brm_c1,
brm_c2,
brm_c3,
brm_c4
)
brm_fit <- add_criterion(
brm_fit,
"loo",
moment_match = TRUE,
recompile = TRUE,
cores = 24
)
Here is the error message:
Recompiling the model with 'rstan'
Recompilation done
Automatically saving the model object in '/work/c/clayson/mmre_crash/brm_fit_c_1.rds'
Error in gzfile(file, mode) : cannot open the connection
In addition: Warning messages:
1: Some Pareto k diagnostic values are too high. See help('pareto-k-diagnostic') for details.
2: Found 1 observations with a pareto_k > 0.7 in model 'brm_fit'. It is recommended to set 'reloo = TRUE' in order to calculate the ELPD without the assumption that these observations are negligible. This will refit the model 1 times to compute the ELPDs for the problematic observations directly.
3: In gzfile(file, mode) :
cannot open compressed file '/work/c/clayson/mmre_crash/brm_fit_c_1.rds', probable reason 'No such file or directory'
These paths (/work/c/clayson/mmre_crash/) point to my user directory on a cluster, not to paths on my local machine. This is where I think the issue is, but I could be wrong. Is there an input I'm missing that needs changing?
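To make the hypothesis concrete, here is a minimal sketch of how the stored path could be checked and cleared. It assumes the fitted object keeps its save location in a `file` element, which I have not confirmed for every brms version. `fit_stub` is a plain list standing in for a real fit, and both helper functions are hypothetical names I made up for illustration.

```r
# Sketch only: `fit_stub` is a plain list standing in for a brmsfit object.
# That the save path lives in a `file` element is an assumption.
fit_stub <- list(file = "/work/c/clayson/mmre_crash/brm_fit_c_1.rds")

# Hypothetical helper: does the directory of the stored path exist on this machine?
stored_path_ok <- function(fit) {
  path <- fit$file
  if (is.null(path)) {
    return(NA)  # no path stored, so there is nothing to check
  }
  dir.exists(dirname(path))
}

# Hypothetical helper: drop the stored path so it cannot be reused for saving.
drop_stored_path <- function(fit) {
  fit$file <- NULL
  fit
}

# Check the cluster path locally, then clear it.
stored_path_ok(fit_stub)
fit_local <- drop_stored_path(fit_stub)
is.null(fit_local$file)
```

With the real objects, the same two calls would be applied to the combined brm_fit before add_criterion is run. As far as I understand the documentation, add_criterion also accepts a `file` argument, so supplying a local path there may be an alternative to clearing the element.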
Thanks for any help!
Peter
- Operating System: OS 14.3.1
- brms Version: 2.20.4