Help with memory issue? - "Error in read_cmdstan_csv(files = self$output_files(include_failed = FALSE)"

Hello,

Sorry if this has been discussed elsewhere (any pointers appreciated).

I am working with some large models (or in this case, a large generated quantities block). What is the best practice for saving and loading? At present, we save using m$save_object(filename) and then load using readRDS(). Until recently this worked fine, but we now seem to be getting intermittent errors. For example, if I load my model and call m$draws(), it works fine. But if I run m$draws() a second time, I get an error:

my_model <- readRDS("test1_0.model")
sim <- my_model$draws("Q", format = "df") 

sim <- my_model$draws("Q", format = "df") 
Error in read_cmdstan_csv(files = self$output_files(include_failed = FALSE),  : 
  Assertion on 'files' failed: File does not exist: '/var/folders/n4/y27qcvjj55zcvy93d64x00b40000gq/T/Rtmpu4zne4/FoMo1_0-202503311038-1-30eeb5.csv'.

This is making my processing workflow very clunky.

Any suggestions on what I could do differently?

Thanks

It looks like the m in m$save_object(filename) is a fit object, not a compiled Stan program?

I would suggest recompiling rather than saving compiled models. It’s much more portable.

I would suggest saving draws in .csv format (compressed if you’re short of space).

That way, nothing depends on R’s binary formats.

I don’t know what /var/folders/ is, but you need to put these things into a directory that’s not a temp directory that will be cleared out.
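A minimal sketch of getting the CSV files out of the temp directory, assuming fit is a CmdStanR fit object in the same session that ran the model (so the temp files still exist). The helper name persist_csvs and the directory stan_output are made up for illustration.

```r
# Hypothetical helper: move a fit's CSV files to a permanent directory.
# `fit` is assumed to be a CmdStanR fit object whose CSV files still exist.
persist_csvs <- function(fit, dir = "stan_output") {
  dir.create(dir, recursive = TRUE, showWarnings = FALSE)
  # save_output_files() moves the CSVs into `dir` and updates the paths
  # stored inside the fit object.
  fit$save_output_files(dir = dir)
  invisible(fit$output_files())
}

# In a later session, a fit object can be rebuilt from the saved CSVs:
# csv_files <- list.files("stan_output", pattern = "\\.csv$", full.names = TRUE)
# fit <- cmdstanr::as_cmdstan_fit(csv_files)
# sim <- fit$draws("Q", format = "df")
```

Because the CSVs are plain text, this route does not depend on R's binary formats at all.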

My guess is that the issue here is related to the temporary files that CmdStanR uses as the default for saving the CSVs.

A few tips:

  • For large models I would definitely recommend specifying output_dir to avoid the output being written to a temporary location. If you later don’t want to save the CSV files that end up in output_dir (e.g. if they’re really large) you can delete them after successfully reading everything into R and saving an R object. But make sure you’ve saved an R object first. Alternatively, you can avoid saving the R object since you now have the CSV files in a permanent location. Specifying output_dir just gives you the option to choose.

  • In the main CmdStanR vignette there’s a section on saving model objects that shows some other options besides using the save_object method. Basically, the save_object method makes sure everything necessary is read into memory in R from CSV and then just calls saveRDS on the fitted model object. The important part is making sure things are read into memory. After that you can use any tools available in R for saving objects in various formats. The vignette gives an example of using the qsave function from the qs package, which is much faster than saveRDS.
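Roughly, the two tips above could look like the sketch below. It follows the pattern from the vignette (read everything into memory first, then save with whatever tool you like). The helper name save_fit_loaded, the file names, and stan_data are placeholders, not anything from the original post.

```r
# Hypothetical helper: force lazily-read results into memory, then save.
# `fit` is assumed to be a CmdStanR MCMC fit object; `save_fun` can be any
# function that takes (object, file), e.g. saveRDS or qs::qsave.
save_fit_loaded <- function(fit, file, save_fun = saveRDS) {
  fit$draws()                                    # posterior draws
  try(fit$sampler_diagnostics(), silent = TRUE)  # sampler diagnostics
  try(fit$init(), silent = TRUE)                 # user-supplied inits, if any
  try(fit$profiles(), silent = TRUE)             # profiling data, if any
  save_fun(fit, file)
  invisible(file)
}

# Usage sketch (model file, data, and paths are placeholders):
# dir.create("stan_output", showWarnings = FALSE)  # output_dir must exist
# mod <- cmdstanr::cmdstan_model("model.stan")
# fit <- mod$sample(data = stan_data, output_dir = "stan_output")
# save_fit_loaded(fit, "fit.rds")
# save_fit_loaded(fit, "fit.qs", save_fun = qs::qsave)  # needs the qs package
```

With output_dir set, the CSV files stay on disk after the R session ends, so whether to also keep the saved R object is up to you.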

/var/folders/ is where R’s temporary directory lives on macOS. CmdStanR will save files there unless a directory is specified by the user.