Operating System: Linux
Interface Version: Rstan 2.16.2, cmdstan 2.17
Hi, I have been fitting models with a fairly large number of parameters (~40,000). Fitting the models is not a problem; they just take a lot of draws. My problem is that the draws take up a lot of space! I have good hardware and storage is not an issue, but given memory constraints I can only run 4 chains using RStan, even though I have 16 cores.
This seems to be because RStan is preparing to load all the chains into the fit object after sampling is done. In my setting I would rather run lots of chains and worry about building the fit object later (maybe I will use virtual memory, or just look at the chains for certain subsets of parameters). Fortunately, CmdStan allows exactly this. I can run 16 chains no problem!
The downside of CmdStan for me is losing RStan’s friendly compilation interface, which automatically recompiles the model when necessary and doesn’t expose me to a bunch of makefiles and so on.
I can’t seem to find a way to save the compiled Stan program in the RStan interface. Am I missing it?
You can call `stan_model` to compile the Stan program and then call `sampling`, but I don’t think that is going to help you much.
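For the compile-once part, here is a minimal sketch. RStan can cache the compiled model on its own via `rstan_options(auto_write = TRUE)`; the helper below does roughly the same thing by hand so the cache location is explicit. `compile_or_load` and its default `.rds` path are hypothetical names, not part of rstan. The staleness check is a simple file-time comparison, and the sketch assumes the stanmodel object survives `saveRDS`/`readRDS`, which relies on the default `save_dso = TRUE`.

```r
# Hypothetical helper: compile a Stan program once and reuse the result.
# Assumes the rstan package is installed; it is only called on a cache miss.
compile_or_load <- function(stan_file,
                            rds_file = paste0(tools::file_path_sans_ext(stan_file),
                                              ".rds")) {
  # The cache is considered valid if it exists and is not older than the source.
  cache_ok <- file.exists(rds_file) &&
    file.mtime(rds_file) >= file.mtime(stan_file)
  if (cache_ok) {
    return(readRDS(rds_file))                   # reuse the compiled model
  }
  model <- rstan::stan_model(file = stan_file)  # compile (slow)
  saveRDS(model, rds_file)                      # keep it for next time
  model
}
```

The returned object can then be passed to `sampling` as usual, e.g. `model <- compile_or_load("model.stan")`, where the file name is a placeholder.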
If you specify `pars = character()` and set `sample_file` to some path, then you are basically doing CmdStan. Then you have the problem of how to read the draws off the disk with the available memory. The `read_stan_csv` function does not have an option to read a subset of the parameters.
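One way to work around the missing subset option is to parse the CSV directly in base R and drop unwanted columns while reading. The sketch below is not `read_stan_csv` and returns a plain data frame rather than a stanfit object; `read_stan_csv_subset` is a hypothetical name. It assumes the usual Stan CSV layout: `#` comment lines, one header line, and column names such as `theta.1` or `beta.2.3` for indexed parameters.

```r
# Hypothetical helper: read selected parameters from one Stan CSV file.
# 'pars' holds base names, e.g. "theta" matches theta, theta.1, theta.2.3.
read_stan_csv_subset <- function(csv_file, pars) {
  # Find the header: the first non-empty line that is not a '#' comment.
  con <- file(csv_file, open = "r")
  on.exit(close(con))
  repeat {
    line <- readLines(con, n = 1)
    if (length(line) == 0) stop("no header line found in ", csv_file)
    if (nzchar(line) && !startsWith(line, "#")) break
  }
  col_names <- strsplit(line, ",", fixed = TRUE)[[1]]

  # Strip the index suffix (".1", ".2.3", ...) to get each base name.
  base_names <- sub("(\\.[0-9]+)+$", "", col_names)
  keep <- base_names %in% pars
  if (!any(keep)) stop("none of the requested parameters were found")

  # colClasses = "NULL" makes read.csv skip a column while parsing.
  classes <- ifelse(keep, "numeric", "NULL")
  read.csv(csv_file, comment.char = "#", colClasses = classes,
           check.names = FALSE)
}
```

Calling this once per chain file and binding the results should keep memory use proportional to the requested parameters rather than to all ~40,000 columns.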
Thanks. That should do what I want. I think I can hack around the limitations of read_stan_csv. Would it be cool if I add a pars= argument to read_stan_csv?
Yeah, and an include flag. The read_stan_csv function should (and probably will for rstan3) take in a stanmodel so that it knows the dimensions of things. Then it would be easier to include / exclude containers of parameters.
Sounds good. It turns out that when I set pars=character() Rstan still does the huge allocation. I guess this is a bug.
In my case I can put `pars=c('param1')`, and the allocation is much smaller and I should be able to run all the chains.
I will still need to hack read_stan_csv to fully examine all my parameters in the end.
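The workaround described above can be sketched as a small wrapper. `sample_to_disk` and the `draws.csv` file name are hypothetical; `pars`, `sample_file`, `chains`, and `cores` are arguments of `rstan::sampling`. As I read the rstan documentation, `sample_file` writes the draws for all parameters and appends an underscore and the chain number to the file name when there are several chains, but that is worth checking on your version.

```r
# Hypothetical wrapper around rstan::sampling(); assumes rstan is installed.
# keep_pars: the few parameters to hold in the in-memory fit object.
# out_dir:   an existing, writable directory for the per-chain CSV files.
sample_to_disk <- function(model, data, keep_pars, out_dir, chains = 16) {
  rstan::sampling(
    model,
    data = data,
    chains = chains,
    cores = chains,                                 # one core per chain
    pars = keep_pars,                               # e.g. "param1"
    sample_file = file.path(out_dir, "draws.csv")   # draws written to disk
  )
}
```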
Thanks again bgoodri.
It might be `pars = NULL` or `pars = NA` or something. I never use that feature, so I forget. It is probably documented.