Rstan vector size error

Hi,

I am running a Stan model with rstan. When I tested the program with iter = 6000, warmup = 3000, it worked properly. Then I wanted to see whether the results would change if I increased the number of iterations, so I set iter = 15000, warmup = 5000, but this time the following error message showed up:

Error: cannot allocate vector of size 5.6 Gb
11. unlist(sss2, use.names = FALSE)
10. .local(object, ...)
9. extract(x, permuted = FALSE, inc_warmup = FALSE, ...)
8. extract(x, permuted = FALSE, inc_warmup = FALSE, ...)
7. as.array.stanfit(object)
6. as.array(object)
5. throw_sampler_warnings(nfits)
4. .local(object, ...)
3. sampling(sm, data, pars, chains, iter, warmup, thin, seed, init, check_data = TRUE, sample_file = sample_file, diagnostic_file = diagnostic_file, verbose = verbose, algorithm = match.arg(algorithm), control = control, check_unknown_args = FALSE, cores = cores, open_progress = open_progress, ...
2. sampling(sm, data, pars, chains, iter, warmup, thin, seed, init, check_data = TRUE, sample_file = sample_file, diagnostic_file = diagnostic_file, verbose = verbose, algorithm = match.arg(algorithm), control = control, check_unknown_args = FALSE, cores = cores, open_progress = open_progress, ...
1. stan(file = "simulation.stan", data = time_data, refresh = 0, cores = 8, iter = 15000, warmup = 5000)

I am wondering whether this is a problem with my laptop.

How much memory do you have (RAM, not storage)?

Depending on the model, the amount of RAM required can be fairly large. With the default 4 chains and iter = 15000, each chain stores 15000 draws (warmup draws are saved by default), so 60000 draws in total, times the number of quantities in the entire model that are saved. So if you have 50 parameters, that is 3 million numbers to hold in memory. That is not including the memory required to hold the data itself, the autodiff computations, and the trajectory (which, I assume, is negligible compared to the amount of sample storage required). It adds up.
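As a rough back-of-the-envelope sketch (the parameter count below is just an illustrative assumption; substitute the number of quantities your model actually saves):

chains   <- 4
iter     <- 15000                 # per chain; warmup draws are kept by default
n_params <- 50                    # hypothetical number of saved quantities
draws    <- chains * iter         # total saved draws across all chains
gb       <- draws * n_params * 8 / 1024^3   # 8 bytes per double
gb                                # approximate GB just for the stored draws

For reference, a single failed 5.6 Gb allocation is about 700 million doubles, so the model here is likely saving a very large number of quantities per draw.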

It's possible that you just don't have enough RAM, depending on the model.

Some solutions include:

  1. Buy more RAM.
  2. Thin your chains.
  3. Don't use so many iterations. IMO, if you need > 5000 samples per chain in Stan for convergence, there is something poorly identified in the model. I rarely use more than 2000 per chain (including warmup).
  4. Track fewer quantities in your stan call. If you have a lot of auxiliary parameters (like latent scores, or the normalized random effects used in the non-centered parameterization), don't include them in the pars argument; record only the quantities you're interested in. A sketch of options 2-4 follows this list.
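For example, here is a minimal sketch of how options 2-4 look in the call from the traceback above (the parameter names "beta" and "sigma" are placeholders for whatever quantities you actually want to keep):

fit <- stan(
  file    = "simulation.stan",
  data    = time_data,
  cores   = 8,
  iter    = 4000,                # fewer total iterations per chain (option 3)
  warmup  = 2000,
  thin    = 5,                   # keep every 5th post-warmup draw (option 2)
  pars    = c("beta", "sigma"),  # hypothetical names: save only these (option 4)
  include = TRUE                 # TRUE (the default) keeps only pars; FALSE would exclude them
)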

Edit: I should add that these solutions are probably in reverse order of preference, so tracking fewer quantities and using fewer samples are your best options. As a last resort, you may need more RAM.
