Chain Parallelization in Stan


I read here that within-chain parallelization is not supported in Stan, for the reasons mentioned there. However, when I run my inferences in Stan I often notice that multiple chains are being processed concurrently. Does this mean Stan supports parallelization between chains? If so, is this done automatically, or does the user need to pass additional arguments when building or compiling the model?

I’m asking because I’d like to parallelize my code across the chains if possible. I’m running on a cluster managed by Slurm. I’m essentially wondering whether I can be as relaxed as simply issuing the Slurm flag --cpus-per-task=10 and have Stan take care of parallelizing the chains, or whether I need to do something more involved.

I think this is somewhat outdated. IIRC, map_rect handles within-chain parallelism.

It’s not so much parallelism between chains, since each chain is independent of the others. There’s no communication between the chains.

It’s a little more involved, but not by that much. If you aren’t bothering with map_rect, then you can tell your Stan implementation to create as many chains as you have cores available. That may involve, say, passing $SLURM_CPUS_ON_NODE to Stan somehow, which will depend on which implementation you use, e.g. RStan, PyStan, CmdStan, etc.
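For CmdStan specifically, one way to do this from the job script is to launch one chain per granted CPU and let the scheduler's environment variable drive the chain count. A minimal sketch, where the compiled model binary `my_model` and the data file name are placeholders:

```shell
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4

# Fall back to 4 chains if run outside Slurm; SLURM_CPUS_PER_TASK is
# set by Slurm to match the --cpus-per-task request.
NCHAINS=${SLURM_CPUS_PER_TASK:-4}

# Launch one CmdStan chain per CPU as a background process,
# then wait for all of them to finish.
for i in $(seq 1 "$NCHAINS"); do
  ./my_model sample \
      data file=data.json \
      output file="output_${i}.csv" &
done
wait
```

Here the OS spreads the background processes across the cores Slurm allocated, which matches the "one chain per core" behavior described above.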

Oh I see, thank you. Instead of passing Slurm environment variables through, can’t I just request 4 chains in Stan and 4 CPUs from Slurm?

What I want to achieve is 4 chains running concurrently on 4 CPUs.

Sure, that would work. The one catch is that you’d have to manually make sure that Slurm and Stan were using the same number of CPUs.
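One low-effort way to keep the two in sync is to have a single source of truth: read the Slurm-granted CPU count from the environment and pass it to Stan. A hedged sketch for RStan (the model file `model.stan` is a placeholder):

```shell
#!/bin/bash
#SBATCH --cpus-per-task=4

# Read the same value Slurm granted, so chains and CPUs can never drift apart.
Rscript -e '
  library(rstan)
  n <- as.integer(Sys.getenv("SLURM_CPUS_PER_TASK", "4"))
  fit <- stan("model.stan", chains = n, cores = n)
'
```

If you later change --cpus-per-task, the chain count follows automatically.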

Yeah of course! But I don’t plan to alter that value often. So, for my own edification, Stan automatically recognizes the CPU resources available to it and distributes independent chain processes across them, with a 1-to-1 correspondence? That’s quite amazing!

I think it’s actually the underlying OS that does that. AFAIK, the Stan implementations RStan and PyStan can put each chain on its own thread, and it’s the OS’s responsibility to figure out what threads go on what CPU cores. Typically, though, there will be 1 chain per core in practice.

I see, thanks for that. Last question: I should be requesting the parallel partition of Slurm with cpus-per-task=4, right?

I’m not sure, since I have no experience with Slurm myself. I presume it’s similar to other cluster schedulers I’ve used, i.e., you write a job script that uses special comments and environment variables and submit it to the scheduler via some command-line utility.

I’m not sure why this got flagged, but I’m guessing it was by accident. It shows up in my to-review queue, and I’m not sure whether a thumbs-up agrees with the flag or a thumbs-down does, so I’m afraid to touch it.


Was this an inappropriate or spam question? It was meant for my own education, and I posted it under General so as not to take anyone’s time immediately.


I don’t think so at all!

From the Stack Overflow answer below, assuming you are using parallelism with RStan or PyStan:

sbatch --ntasks 1 --cpus-per-task 24 #script_here

Though I haven’t used Slurm in a long time.


No. We’re OK with any kind of question.