"Text file busy" Parallel

In a simulation study, I tried to compile the same Stan model 10 times in parallel.

Each of the 10 compilations failed with the following error:

Error in process_initialize(self, private, command, args, stdin, stdout, …:
! Native call to processx_exec failed
Caused by error in chain_call(c_processx_exec, command, c(command, args), pty, pty_options, …:
! cannot start processx process ‘/local/users/dliang/Fall2022/BCSM/Analysis/Codes/BC2k10_ref_1’ (system error 26, Text file busy) @unix/processx.c:611 (processx_exec)

Backtrace:

       1. cmdstanr::cmdstan_model("BC2k10_ref_1.stan")
       2. CmdStanModel$new(stan_file = stan_file, exe_file = exe_file, …
       3. local initialize(…)
       4. cmdstanr:::model_compile_info(self$exe_file())
       5. withr::with_path(c(toolchain_PATH_env_var(), tbb_path()), ret <- wsl_compa…
       6. base::force(code)
       7. cmdstanr:::wsl_compatible_run(command = wsl_safe_path(exe_file), args = "info", …
       8. base::do.call(processx::run, run_args)
       9. (function (command = NULL, args = character(), error_on_status = TRUE, …
      10. process$new(command, args, echo_cmd = echo_cmd, wd = wd, windows_verbatim_…
      11. local initialize(…)
      12. processx:::process_initialize(self, private, command, args, stdin, stdout, …
      13. processx:::chain_call(c_processx_exec, command, c(command, args), pty, pty_options, …
      14. | base::withCallingHandlers(do.call(“.Call”, list(.NAME, …)), error = function(e…
      15. | base::do.call(“.Call”, list(.NAME, …))
      16. | base::.handleSimpleError(function (e) …
      17. | local h(simpleError(msg, call))
      18. | processx:::throw_error(err, parent = e)
      Execution halted

The code is very basic; it is just the following call, run in parallel:

mod_ <- cmdstan_model("BC2k10_ref_1.stan")
  • Operating System: Ubuntu
  • CmdStan Version: 2.30.1
  • Compiler/Toolkit: D.K.

Hello,
A workaround for this problem is to manually copy the Stan code to a unique file name.

## batch_ : batch processing id
file.copy("BC2k10_ref_1.stan", paste0("BC2k10_ref_1", batch_, ".stan"))
mod_ <- cmdstan_model(paste0("BC2k10_ref_1", batch_, ".stan"))
file.remove(paste0("BC2k10_ref_1", batch_, ".stan"))

But this generates many copies of the same compiled program under the working directory.
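
A variant that at least keeps the clutter out of the working directory is to copy the model into the R session's temporary directory instead (using tempdir() here is just my choice; cmdstanr does not require it). Each batch's .stan file and its compiled executable then live in that process's unique temporary folder and are removed when the R session exits.

library(cmdstanr)

## copy the model into this R process's unique tempdir(), so the per-batch
## .stan file and its compiled executable are not left in the working directory
stan_copy_ <- file.path(tempdir(), paste0("BC2k10_ref_1_", batch_, ".stan"))
file.copy("BC2k10_ref_1.stan", stan_copy_)
mod_ <- cmdstan_model(stan_copy_)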

Thanks for following up with a hint, @Dong_Liang1. I have a few follow-up questions, if you don't mind.

  1. How were you running the cmdstan_model() function call in parallel? Were you using some built-in R functionality, or were you calling an R script in parallel from the shell?

  2. If you want parallelism, can you just compile the model once and use it in all of the parallel calls? (A rough sketch of what I mean follows this list.)

  3. Is there some reason the built-in parallelism in cmdstanr isn’t enough for what you need?
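
To make question 2 a bit more concrete, here is a rough sketch of what I have in mind (data_list is a placeholder for whatever data each batch loads, and the chain counts are arbitrary). Once the executable exists and is newer than the .stan file, cmdstan_model() should reuse it instead of recompiling, and mod$sample() can run the chains of a single fit in parallel:

library(cmdstanr)

## one-off setup step, run once before launching any batches
mod <- cmdstan_model("BC2k10_ref_1.stan")

## inside each batch script, the same call finds the up-to-date executable
## and skips recompilation
mod <- cmdstan_model("BC2k10_ref_1.stan")

## built-in parallelism within one fit: run the 4 chains concurrently
fit <- mod$sample(data = data_list, chains = 4, parallel_chains = 4)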

The parallel runs were driven by a shell script named "sim".

#!/bin/sh
echo "Base->" $1
echo "End->" $2
echo "Rep->" $3
for i in $(seq $1 $2);
do
  echo batch $i;
  R CMD BATCH "--vanilla --slave --args $i $3" BC2k10_ref_1.R sim1_b${i}.Rout  &
  sleep 1
done

Ten batches were then run, each with 99 replicates.

./sim 1 10 99

The R script "BC2k10_ref_1.R" looks like this:

library(cmdstanr)

args_ <- commandArgs(trailingOnly = TRUE)
batch_ <- as.integer(args_[1])
rep_ <- as.integer(args_[2])
fn_ <- paste0("sim1_b", batch_, ".rData")
cat("Batch", batch_, "R=", rep_, "saving to ", fn_, '\n')

## compile the stan code
file.copy("BC2k10_ref_1.stan",paste0("BC2k10_ref_1_",batch_,".stan"))
mod_ <- cmdstan_model(paste0("BC2k10_ref_1_",batch_,".stan"))
file.remove(paste0("BC2k10_ref_1_",batch_,".stan"))

I don't know how to compile the program ahead of time and reuse it in the parallel calls, or how to use the built-in parallelism in cmdstanr to run multiple data sets.

Thanks
Dong

To double-check, do you have a single Stan model that you're attempting to pass multiple datasets to, or do you have multiple Stan models and multiple datasets?

I am attempting to pass multiple datasets to the same Stan model. It's a fishery population dynamics model originally coded in ADMB.

Can you try performing the parallelism purely in R, using the compiled model? For example:

# Compile stan model once
mod <- cmdstanr::cmdstan_model("BC2k10_ref_1.stan")

# Load all datasets as a single list:
dataset_list <- list(...)

# Use furrr package for parallel evaluation
library(furrr)
plan(multisession)

parallel_results <- future_map(dataset_list, function(dataset) {
  mod$sample(
    data = dataset  # each parallel worker fits one dataset from the list
  )
})
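
Depending on your core count, you may also want to set the number of workers explicitly, e.g. plan(multisession, workers = 5), and pass parallel_chains to mod$sample() inside the mapped function, so that the total number of concurrently running chains matches the cores you actually have.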

Yes, I think that will avoid this “Text file busy” issue.