RDS issues in RStan with parallel sessions

Just wanted to report an issue I encountered while working with RStan on a Linux cluster.
While running a “job array” in which several jobs use the same Stan model from a file,
many of the jobs terminated with “readRDS” errors, like

“Error in readRDS(file) : error reading from connection”

The problem went away once I turned off
rstan_options(auto_write = TRUE)

It seems like multiple processes were trying to read and write the same “model.rds” at the same time.

The better way to do something like this is to

  1. Compile the model once on the log-in node with auto_write = TRUE
  2. Do the sampling on the cluster nodes

This will ensure that all of the cluster nodes are only reading but not writing to the RDS file.
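
A minimal sketch of those two steps, assuming the Stan file sits on a file system shared by the log-in node and the cluster nodes. The path, the function names, and the data and seed arguments are hypothetical placeholders, not part of the original report. The cluster-node step passes auto_write = FALSE so that the jobs do not write the RDS file themselves.

```r
library(rstan)

# Hypothetical location on a file system shared by log-in and cluster nodes.
stan_file <- "/shared/project/model.stan"

# Step 1: run once on the log-in node, before submitting the job array.
# With auto_write = TRUE the compiled model is saved as an RDS file
# next to the Stan file.
compile_on_login_node <- function(stan_file) {
  invisible(stan_model(file = stan_file, auto_write = TRUE))
}

# Step 2: run in every array job. auto_write = FALSE keeps the job from
# writing the RDS file, so concurrent jobs should only read it.
sample_on_cluster_node <- function(stan_file, data, seed, ...) {
  model <- stan_model(file = stan_file, auto_write = FALSE)
  sampling(model, data = data, seed = seed, ...)
}
```

If the Stan file is edited after step 1, the saved model no longer matches it and a job may recompile, so step 1 would need to be rerun before resubmitting.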

How do I load a pre-compiled model from an RDS file on a cluster node so that I can sample from it?

As long as you give the full path to the Stan program when you call stan or stan_model on the cluster nodes, it should find the RDS file in the same directory. This can be a bit tricky if the log-in node and the cluster nodes do not share a file system.

Personally, I prefer not bothering with auto_write and just explicitly compiling the model in advance. Here’s a script that I have to do that:

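#!/usr/bin/env Rscript

# Compile a Stan model once and save the compiled stanmodel object to an RDS file.
library(optparse)  # make_option(), OptionParser(), parse_args()
library(rstan)     # stan_model()
library(tools)     # file_path_sans_ext()

option_list = list(
  make_option(c('--outputRDS', '-o'),
              help = 'Alternate name of output file'),
  make_option(c('--verbose', '-v'),
              help = 'Print intermediate compilation output',
              action = 'store_true',
              default = FALSE)
)

# Expect exactly one positional argument: the Stan model file
opts_args = parse_args(OptionParser(option_list = option_list,
                                    usage = "%prog [options] Stan-model-file"),
                       positional_arguments = 1)

opts = opts_args$options
stanFile = opts_args$args

# Default output name: the Stan file's base name with an .rds extension
outputFile = NULL
if (is.null(opts$outputRDS)) {
  outputFile = sprintf("%s.rds", file_path_sans_ext(basename(stanFile)))
} else {
  outputFile = opts$outputRDS
}

# Compile without auto_write and save the compiled model explicitly
myModel = stan_model(stanFile, verbose = opts$verbose, auto_write = FALSE, save_dso = TRUE)
saveRDS(myModel, file = outputFile)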

Of course, you’ll need to install the optparse package from CRAN for this script to work as is. On the cluster node, just read in the saved model with the readRDS() function.
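
For the cluster-node side, a minimal sketch could look like this. The file name model.rds, the function name, and the data in the commented-out call are hypothetical placeholders. Reading the job index from SLURM_ARRAY_TASK_ID assumes a Slurm scheduler, which the thread does not specify. The sketch also assumes the cluster nodes run R and rstan versions compatible with those used to compile the model.

```r
library(rstan)  # attach rstan before working with the saved stanmodel object

# Hypothetical path to the file written by saveRDS() in the compile script.
model_file <- "model.rds"

# Read the compiled model and draw samples. The RDS file is only read here,
# never written, so many jobs can do this at the same time.
sample_from_saved_model <- function(model_file, data, seed = 1, ...) {
  model <- readRDS(model_file)
  stopifnot(inherits(model, "stanmodel"))
  sampling(model, data = data, seed = seed, ...)
}

# Example call, commented out because the data and the scheduler variable
# are placeholders:
# task_id <- as.integer(Sys.getenv("SLURM_ARRAY_TASK_ID", "1"))
# fit <- sample_from_saved_model(model_file,
#                                data = list(N = 10, y = rnorm(10)),
#                                seed = task_id)
```

Giving each array job its own seed, for example the job index, keeps the jobs from producing identical draws.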

Thank you, this is very helpful, but I’d also appreciate seeing the last bit,
because reading with a “save” function does not make much sense to me.

It doesn’t make sense because I wrote it incorrectly. Sorry about that. I fixed my post accordingly.

