Best practices for saving multiple large stanfit files

mattwilliamson13 · January 27, 2019, 8:21pm

Hello, this seems like it should be simple and is related to this question. I am trying to run a sensitivity analysis for several parameters using rstan on an HPC cluster. I am generating multiple simulated datasets and applying four Stan models to each dataset using:

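library(purrr)  # map()
library(rstan)  # sampling()

# Fit model1 to every simulated dataset; returns a list of stanfit objects
map(seq_along(list.of.simdata), function(x) sampling(model1,
                    data = list.of.simdata[[x]],
                    iter = n_iter, warmup = n_wmp, chains = n_c, seed = 8029,
                    control = list(adapt_delta = 0.99, max_treedepth = 16)))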

The result is a list of stanfit objects (one per element of list.of.simdata). I have been trying to use saveRDS to save each of these lists, but the resulting file is usually more than 6GB and eventually overruns my available disk space. I’ve tried adding the compress="xz" argument to saveRDS (because this seems to achieve the greatest amount of compression).

Unfortunately, that dramatically increases the time it takes to save the file. I eventually need to compare the posterior draws for about 15 parameters to the originally simulated values, so all I really need is the draws along with the sampler parameters needed to confirm that I didn’t get any warnings (divergences, BFMI, etc.). I’m wondering what the best way is to retain the ability to access samples, evaluate sampler performance, and run diagnostics (e.g., traceplots) without exhausting disk space, while still being able to open the files on my local machine in a new R session.

Is it better to just extract the draws and sampler info and save them as their own objects? What are the drawbacks to doing that (rather than finding an efficient way to compress the list of stanfit objects)? Any pointers would be most appreciated.

sakrejda · January 27, 2019, 11:16pm

If you’re running out of storage, extract the subset you need into arrays and write those using saveRDS. You can calculate diagnostics before reducing what you keep and save those as separate objects too. As a bonus, you reduce load time.
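
A minimal sketch of that approach, assuming the fits come from rstan's NUTS sampler. The helper names (summarise_sampler, reduce_fit, save_reduced_fits), the pars_to_keep and out_dir arguments, and the file-name pattern are all hypothetical. The BFMI line is a hand-rolled per-chain estimate from the energy__ column, not a call into rstan.

```r
# Sketch only: keep the draws and sampler diagnostics, drop the rest of the
# stanfit. All function and argument names here are hypothetical.

# Summarise NUTS diagnostics from the per-chain matrices returned by
# rstan::get_sampler_params(fit, inc_warmup = FALSE).
summarise_sampler <- function(sampler_params, max_treedepth = 16) {
  all_chains <- do.call(rbind, sampler_params)
  list(
    n_divergent     = sum(all_chains[, "divergent__"]),
    n_max_treedepth = sum(all_chains[, "treedepth__"] >= max_treedepth),
    # Per-chain energy-based estimate (assumed formula): mean squared
    # energy change divided by the variance of the energy.
    bfmi = vapply(sampler_params, function(chain) {
      energy <- chain[, "energy__"]
      (sum(diff(energy)^2) / length(energy)) / var(energy)
    }, numeric(1))
  )
}

# Reduce one stanfit to a small list that can be saved with saveRDS().
reduce_fit <- function(fit, pars_to_keep, max_treedepth = 16) {
  sampler_params <- rstan::get_sampler_params(fit, inc_warmup = FALSE)
  list(
    # iterations x chains x parameters array; permuted = FALSE keeps the
    # chain structure, which traceplots need
    draws          = rstan::extract(fit, pars = pars_to_keep, permuted = FALSE),
    sampler_params = sampler_params,
    diagnostics    = summarise_sampler(sampler_params, max_treedepth)
  )
}

# Write one small file per fit instead of one large list of stanfit objects.
save_reduced_fits <- function(fits, pars_to_keep, out_dir, prefix = "model1") {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  for (i in seq_along(fits)) {
    out_file <- file.path(out_dir, sprintf("%s_sim%03d.rds", prefix, i))
    saveRDS(reduce_fit(fits[[i]], pars_to_keep), out_file)
  }
  invisible(NULL)
}
```

Calling reduce_fit() inside the mapped function, right after sampling(), would also avoid holding every full stanfit object in memory at once. The saved draws array should be usable for traceplots in a fresh session, for example with bayesplot::mcmc_trace(), which accepts an iterations x chains x parameters array.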
