Hi all,
I am fitting a Stan model to several datasets in parallel on my own PC, using cmdstanr with a cluster created by parallel::makeCluster. After much debugging and testing of different scenarios, it looks like the parallel runs contaminate each other, but only during the initial sampler phase that sets the initial values, step size, and inverse mass matrix. This gives the following error:
Error during model fitting: ‘init’ has the wrong length. See documentation of ‘init’ argument.
When I open 5 separate instances of R/RStudio and run them side by side manually, without the cluster, the same error occurs. However, when I make the runs effectively sequential by adding a Sys.sleep() call, while still using the locally created cluster, everything runs fine.
Because the code itself is quite long (and a reprex for this is a lot of work), I’ll first share some code snippets and potentially relevant information:
The models are pre-compiled. I am using chkptstanr (version .20) to call cmdstanr; based on my testing I don't think the package itself is the issue, though there may be some interaction. All output paths are unique, and even the base chain names (output_basename) are unique. Each run uses a different seed.
Does anyone have any ideas here?
# Packages
library(glue)
library(doSNOW)
library(foreach)
library(chkptstanr)
library(MASS)
library(cmdstanr)
library(dplyr)
library(stringr)
library(brms)
chkpt_stan(model_code = stan_model,
           data = stan_data,
           iter_adaptation = 150,
           iter_warmup = warmup_iters,
           iter_sampling = sampling_iters,
           iter_per_chkpt = chkpt_iters,
           parallel_chains = 4,
           threads_per = 1,
           chkpt_progress = TRUE,
           control = NULL,
           seed = seed,
           stop_after = dynamic_stop,
           reset = FALSE,
           path = path_,
           output_basename = paste0("chain_", model, ".",
                                    dataset, ".",
                                    run, "."))
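As a sanity check on the claim that the chain basenames are unique, the construction above can be tested over the whole job grid. This is only a minimal sketch: `make_basename()` and the example grid of models, datasets, and runs are hypothetical stand-ins, not part of the actual code.

```r
# Hypothetical helper mirroring the paste0() call used for output_basename.
make_basename <- function(model, dataset, run) {
  paste0("chain_", model, ".", dataset, ".", run, ".")
}

# Illustrative job grid; the model names and counts are made up.
jobs <- expand.grid(
  model = c("m1", "m2"),
  dataset = 1:3,
  run = 1:2,
  stringsAsFactors = FALSE
)

# paste0() is vectorised, so this yields one basename per row of the grid.
basenames <- with(jobs, make_basename(model, dataset, run))

# anyDuplicated() returns 0 when every element is unique.
n_duplicates <- anyDuplicated(basenames)
stopifnot(n_duplicates == 0)
```

If this check passes for the real grid, a clash between runs is more likely to come from something other than the basenames.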
## set up parallel backend ---------------------------------------------------
if (hyper_parallel) {
  cluster = parallel::makeCluster(
    models_in_parallel,
    outfile = glue("output/consoleOut.txt")
  )
  doSNOW::registerDoSNOW(cluster)
}
# Nested approach
foreach(model_id = models,
        .packages = c('chkptstanr', 'MASS', 'cmdstanr', 'dplyr', 'stringr', 'brms'),
        .errorhandling = "stop") %:%
  foreach(dataset_id = 1:N_datasets,
          .packages = c('chkptstanr', 'MASS', 'cmdstanr', 'dplyr', 'stringr', 'brms'),
          .errorhandling = "stop") %:%
  foreach(run_id = 1:N_runs,
          .packages = c('chkptstanr', 'MASS', 'cmdstanr', 'dplyr', 'stringr', 'brms'),
          .errorhandling = "stop") %dopar% {
    run_analysis(model = model_id, dataset = dataset_id, run = run_id)
  } # foreach close
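The sleep-based workaround mentioned earlier amounts to staggering the start of each job. A minimal sketch of that idea, assuming a fixed delay per job index: `stagger_seconds`, `start_delay()`, and `run_staggered()` are illustrative names, and the delay value is arbitrary rather than the one actually used.

```r
# Hypothetical sketch: delay each job's start by a multiple of its index.
stagger_seconds <- 0.2

# Delay (in seconds) before job number `job_index` is allowed to start.
start_delay <- function(job_index, stagger = stagger_seconds) {
  (job_index - 1) * stagger
}

run_staggered <- function(job_index) {
  Sys.sleep(start_delay(job_index))
  # The real call, e.g. run_analysis(...), would go here.
  Sys.time()
}

# Run sequentially here for illustration; in the real setup the body of
# run_staggered() would sit inside the %dopar% block.
start_times <- lapply(1:3, run_staggered)
```

This only spreads out the start-up phase; it does not explain why simultaneous starts interfere with each other.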
Additional information:
- Operating system: Windows 11
- CmdStan version: 2.35.0
- cmdstanr version: 0.8.1.9000