Hello, this seems like it should be simple and is related to this question. I am trying to run a sensitivity analysis for several parameters using rstan on an HPC cluster. I generate multiple simulated datasets and apply four Stan models to each dataset using:
map(seq_along(list.of.simdata), function(x) sampling(model1, data = list.of.simdata[[x]], iter = n_iter, warmup=n_wmp, chains=n_c, seed = 8029, control=list(adapt_delta = 0.99, max_treedepth = 16)))
The result is a list of stanfit objects (one per element of list.of.simdata). I have been trying to use saveRDS to save each of these lists, but the resulting file is usually more than 6 GB and eventually exhausts my available disk space. I’ve tried passing the compress="xz" argument to saveRDS (because this seems to achieve the greatest amount of compression).
Unfortunately, that dramatically increases the time it takes to save the file. I eventually need to compare the posterior draws for about 15 parameters to the originally simulated values, so all I really need are the draws along with whatever sampler parameters are required to confirm that I didn’t get any warnings (divergences, BFMI, etc.). I’m wondering what the best way is to retain the ability to access the draws, evaluate sampler performance, and run diagnostics (e.g., traceplots) without exhausting disk space, while still being able to open the files on my local machine in a new R session.
Is it better to just extract the draws and sampler info and save them as their own objects? What are the drawbacks of doing that (rather than finding an efficient way to compress the list of stanfit objects)? Any pointers would be most appreciated.
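
To make the extract-and-save option concrete, here is a minimal sketch of what I have in mind. The helpers compact_fit() and fit_and_save(), the file naming scheme, and the parameter names in the usage comment are all hypothetical; only the rstan:: calls are existing functions. It assumes each fit is a stanfit object returned by sampling(), and it writes one small file per dataset instead of accumulating the whole list in memory.

```r
# Sketch only: compact_fit() and fit_and_save() are hypothetical helpers,
# not part of rstan. Assumes `fit` is a stanfit object from rstan::sampling().
compact_fit <- function(fit, pars) {
  list(
    # Post-warmup draws as an iterations x chains x parameters array
    draws = rstan::extract(fit, pars = pars, permuted = FALSE,
                           inc_warmup = FALSE),
    # One matrix per chain (accept_stat__, stepsize__, treedepth__,
    # n_leapfrog__, divergent__, energy__)
    sampler_params = rstan::get_sampler_params(fit, inc_warmup = FALSE),
    # Scalar summaries of the usual HMC warnings
    n_divergent = rstan::get_num_divergent(fit),
    n_max_treedepth = rstan::get_num_max_treedepth(fit),
    bfmi = rstan::get_bfmi(fit)
  )
}

# Fit one dataset at a time and write the compact result straight to disk,
# so the full stanfit objects are never all held (or saved) at once.
fit_and_save <- function(model, data_list, pars, out_dir, ...) {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  for (i in seq_along(data_list)) {
    fit <- rstan::sampling(model, data = data_list[[i]], ...)
    out_file <- file.path(out_dir, sprintf("fit_%03d.rds", i))
    saveRDS(compact_fit(fit, pars), out_file)
    rm(fit)  # drop the large object before the next iteration
  }
  invisible(list.files(out_dir, pattern = "\\.rds$", full.names = TRUE))
}

# Hypothetical usage ("alpha" and "beta" are placeholder parameter names):
# files <- fit_and_save(model1, list.of.simdata, pars = c("alpha", "beta"),
#                       out_dir = "fits_model1", iter = n_iter,
#                       warmup = n_wmp, chains = n_c, seed = 8029,
#                       control = list(adapt_delta = 0.99, max_treedepth = 16))
# res <- readRDS(files[1])
```

The drawback I would expect is that the saved object is a plain list, so methods defined for stanfit objects (such as rstan's traceplot) would no longer apply, and plots would have to be built from the draws array instead, for example with bayesplot, which as far as I know accepts arrays of this shape.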