When running RStan jobs in R on a cluster, can using too many cores lead to insufficient memory?

I am currently trying to run a parallelized RStan job on a computing cluster in R. I am doing this by specifying the following two options:

options(mc.cores = parallel::detectCores())
rstan_options(auto_write = TRUE)

With the above, all 48 available cores are allocated, and I have 180 GB of RAM in total. I had always assumed that, in theory, more cores and more RAM were better. However, I am running very long jobs and getting insufficient-memory errors on my cluster. I am wondering whether I am not giving each core enough memory. Is it possible that the 48 cores split the 180 GB between them, so that each core maxes out its share?

If I kept the 180 GB of RAM but used only 3 cores, would that get around the memory errors? Or will the total memory always be used up at some point if it's a long job, no matter how many cores I have? Thanks!

Hey! Generally, the more cores you use, the more memory you will need. I don’t know the specifics (other people around here would know them), but think, for example, about the number of draws that need to be saved, which increases linearly with the number of cores in use (other things need memory as well).
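
As a rough back-of-the-envelope sketch of that scaling (not a measurement of what RStan actually allocates): assuming one chain per core and one 8-byte double per parameter per saved iteration, the draws alone take about chains × saved iterations × parameters × 8 bytes. The function name `draws_gib` and the example numbers are made up for illustration.

```r
# Hypothetical helper: approximate memory (GiB) needed just to hold the draws,
# assuming one 8-byte double per parameter per saved iteration per chain.
draws_gib <- function(n_chains, n_saved_iter, n_params, bytes_per_value = 8) {
  n_chains * n_saved_iter * n_params * bytes_per_value / 1024^3
}

# Made-up example: the same model run with 4 chains and with 48 chains.
draws_gib(n_chains = 4,  n_saved_iter = 2000, n_params = 1e5)
draws_gib(n_chains = 48, n_saved_iter = 2000, n_params = 1e5)
```

The second number is 12 times the first, which is the linear growth I mean; per-process copies of the data and other overhead come on top of that.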

This, for example, should run fine.

I saw your other post on a related topic, and I would suggest trying to make use of the cores at your disposal through things like map_rect. Those kinds of things are not that easy to use yet, I’m afraid. IIRC there is a tutorial on Richard McElreath’s GitHub, which could be interesting for you.

Cheers! :)

Hello,

here is a plot of cores vs. memory usage:

  • x axis: total number of draws * data

  • y axis: memory usage

  • columns: number of cores

In this case, memory usage is not affected much by the number of cores for HMC.


For completeness, this is the time gain I am getting:

Thank you for this! May I ask how many chains you used in your simulation? I now understand that it will only use as many cores as there are chains, so was the simulation run with at least 8-16 cores, and is that why the time decreases? Thanks!
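
To make that concrete, here is a minimal sketch of capping the worker count at the number of chains instead of passing parallel::detectCores() straight through. The helper name `cores_for_chains` and the value of `n_chains` are made up for illustration. Note also that detectCores() reports the cores of the machine, which on a shared cluster node may be more than the scheduler actually allocated to the job.

```r
# Hypothetical helper: never request more workers than there are chains.
cores_for_chains <- function(n_chains, available = parallel::detectCores()) {
  # detectCores() can return NA on some platforms; fall back to one core.
  if (is.na(available)) available <- 1L
  as.integer(min(n_chains, available))
}

n_chains <- 4L  # assumed value; use whatever is passed as `chains`
options(mc.cores = cores_for_chains(n_chains))
getOption("mc.cores")
```

With 4 chains on a 48-core node this sets mc.cores to 4, so only four worker processes (and their memory) exist at any one time.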