Multithreading with pystan3

nerpa · July 24, 2020, 11:09am

Hello dear forum,

I am trying to run my Stan model with multithreading, but I saw that it’s a bit complicated with pystan2.18+. From this and other forums, it seems that official multithreading is introduced with pystan3 (still in beta), so I’ve decided to try it. Since my model is really big, my goal is to parallelize within each chain by giving each chain multiple CPUs. However, I couldn’t find anywhere how to implement this in pystan3.
Can someone here point me in the right direction?

Thank you!

ahartikainen · July 24, 2020, 11:52am

We currently are using multiple processes in pystan3.

Do you want to use Stan multithreading functions?

nerpa · July 24, 2020, 11:57am

That’s my goal and I can allocate a generous number of cpus for each chain. In pystan2.19, even if I follow this example, I still have only 4 cpus running (1 per chain). So something doesn’t work right…
Update - I use pystan on a CentOS 7 Linux server.

ahartikainen · July 24, 2020, 12:10pm

So basically you need to define the STAN_NUM_THREADS environment variable and use the map_rect function in your model to use multiple CPUs.

And in the compile step add

extra_compile_args = ['-pthread', '-DSTAN_THREADS']
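
For PyStan 2, the two steps above might be combined as in the following minimal sketch. It assumes PyStan 2.x is installed and that the model already calls map_rect; the thread count of 4 and the helper name build_threaded_model are illustrative, not part of any library.

```python
import os

# Assumption: 4 threads is an arbitrary example value.
# The variable has to be set before sampling starts.
os.environ["STAN_NUM_THREADS"] = "4"

# Compiler flags quoted in the post above.
EXTRA_COMPILE_ARGS = ["-pthread", "-DSTAN_THREADS"]


def build_threaded_model(model_code):
    """Compile a Stan program with threading enabled (PyStan 2 only).

    Hypothetical helper: pystan is imported lazily so that this sketch
    can be loaded on a machine where PyStan 2 is not installed.
    """
    import pystan  # PyStan 2.x

    return pystan.StanModel(
        model_code=model_code,
        extra_compile_args=EXTRA_COMPILE_ARGS,
    )
```

The flags only enable threading in the compiled model; without map_rect in the Stan code, each chain still runs on a single core.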

nerpa · July 24, 2020, 12:13pm

So there is no way around rewriting my model to do multithreading? Even in pystan3?

ahartikainen · July 24, 2020, 1:00pm

No, there is no way around it.

You need to use map_rect or reduce_sum (this is possible with CmdStanPy).
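
As a rough illustration of the CmdStanPy route, here is a minimal sketch. The Stan program is a toy logistic regression with reduce_sum, not the model discussed in this thread, and it uses the newer array syntax, so it assumes a recent Stan release. The helper name fit_with_threads, the file name, and the default thread counts are all illustrative assumptions.

```python
import os
import tempfile

# Toy Stan program: the log-likelihood is split into slices of y that
# reduce_sum can evaluate in parallel within a single chain.
STAN_PROGRAM = """
functions {
  real partial_sum(array[] int y_slice, int start, int end,
                   vector x, vector beta) {
    return bernoulli_logit_lpmf(y_slice | beta[1] + beta[2] * x[start:end]);
  }
}
data {
  int<lower=0> N;
  array[N] int<lower=0, upper=1> y;
  vector[N] x;
}
parameters {
  vector[2] beta;
}
model {
  int grainsize = 1;  // 1 lets the scheduler choose the slice sizes
  beta ~ std_normal();
  target += reduce_sum(partial_sum, y, grainsize, x, beta);
}
"""


def fit_with_threads(data, chains=4, threads_per_chain=4):
    """Compile with threading support and sample (hypothetical helper).

    cmdstanpy is imported lazily so the sketch loads without it installed.
    `data` is a dict with the keys N, y and x.
    """
    from cmdstanpy import CmdStanModel

    stan_file = os.path.join(tempfile.mkdtemp(), "logistic_reduce_sum.stan")
    with open(stan_file, "w") as f:
        f.write(STAN_PROGRAM)

    # STAN_THREADS must be enabled at compile time for reduce_sum to
    # actually run in parallel.
    model = CmdStanModel(stan_file=stan_file, cpp_options={"STAN_THREADS": True})
    return model.sample(
        data=data,
        chains=chains,
        parallel_chains=chains,
        threads_per_chain=threads_per_chain,
    )
```

With these defaults the run would occupy up to chains × threads_per_chain = 16 cores, so the values should be matched to the cores actually allocated.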

nerpa · July 24, 2020, 1:05pm

OK, thanks! One last question: from reading more, I saw that some folks suggest increasing the number of chains and reducing the number of iterations in each chain to speed things up. Is that really recommended?

wds15 · July 24, 2020, 1:13pm

Between-chain parallelism is always more efficient than within-chain parallelism, but you have to go through the warmup for every chain.
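
The trade-off can be made concrete with a little arithmetic. This is only a sketch under assumed numbers: 1000 warmup iterations per chain, 4000 post-warmup draws in total, one core per chain, and every iteration costing about the same (which real models do not guarantee).

```python
def cost(chains, warmup, total_draws):
    """Return (iterations per chain, iterations summed over all chains)."""
    sampling = total_draws // chains   # post-warmup draws each chain contributes
    per_chain = warmup + sampling      # roughly the wall-clock cost, one core per chain
    return per_chain, per_chain * chains


# Assumed setup: 1000 warmup iterations per chain, 4000 draws in total.
for n_chains in (4, 8, 16):
    per_chain, total = cost(n_chains, warmup=1000, total_draws=4000)
    print(f"{n_chains} chains: {per_chain} iterations per chain, {total} in total")
```

Under these assumptions, more chains shorten each chain, but the warmup is paid again for every chain, so the per-chain length never drops below the warmup length and the total compute keeps growing.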

nerpa · July 24, 2020, 1:46pm

Yes, I thought so. Thanks so much! Rewriting the model it is then!

nerpa · July 24, 2020, 7:28pm

I hope someone will pick this up: my really large model has been stuck in the first set of warmup iterations for 8 hours. Any ideas what this means and how I can solve it?

ahartikainen · July 24, 2020, 9:00pm

How large is your model?

Is it large in parameters or data?

nerpa · July 24, 2020, 9:16pm

Thanks, Ari!
It’s not huge in parameters - I am fitting ~40 parameters. But the data is pretty large - it’s ~1200 observations (including missing data), each with ~10 different tasks with hundreds of trials each. So I guess that’s a lot.
Now the weird part is that in the original model I am looping over trials in each task and to make it more efficient, I vectorized it - BUT it takes more time after vectorization… hmmmm.

nerpa · July 24, 2020, 9:21pm

I have been seeing this since 8:40am EST:

Gradient evaluation took 0.39 seconds
1000 transitions using 10 leapfrog steps per transition would take 3900 seconds.
Adjust your expectations accordingly!

Iteration: 1 / 1000 [ 0%] (Warmup)
Iteration: 1 / 1000 [ 0%] (Warmup)
Iteration: 1 / 1000 [ 0%] (Warmup)
Iteration: 1 / 1000 [ 0%] (Warmup)
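
The estimate in that message is simply gradient time × leapfrog steps × transitions. The sketch below reproduces it and, assuming the default max_treedepth of 10 (at most 2^10 - 1 = 1023 leapfrog steps per transition), also computes a worst-case figure. The variable and function names are illustrative.

```python
grad_seconds = 0.39   # from "Gradient evaluation took 0.39 seconds"
transitions = 1000    # iterations per chain in the log above


def estimated_seconds(leapfrog_steps):
    """Time for all transitions at a fixed number of leapfrog steps."""
    return grad_seconds * leapfrog_steps * transitions


optimistic = estimated_seconds(10)           # the figure Stan prints
worst_case = estimated_seconds(2**10 - 1)    # assumes max_treedepth = 10

print(f"10 leapfrog steps:   {optimistic:.0f} s")
print(f"1023 leapfrog steps: {worst_case / 3600:.1f} h")
```

If early warmup transitions are hitting the maximum tree depth, a single iteration can take several minutes, which would be consistent with a run that appears stuck at iteration 1.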

ahartikainen · July 24, 2020, 10:31pm

That sounds interesting.

Maybe ask how to improve your model in a new thread. It is usually a good idea to debug a hard model with others.

Loops in Stan are C++ loops so sometimes vectorization doesn’t help.

nerpa · July 24, 2020, 11:15pm

I wouldn’t even know where to begin - it’s 600 lines of code… I’ll start by re-parametrizing it and implementing the within-chain parallelization. Thanks a lot!

dherrera1911 · July 19, 2023, 6:34pm

Could you explain how an environmental variable is set? Is it a Python environment variable? Stan environment variable? OS environment variable?

Ezequiel_Alvarez · September 22, 2024, 10:11pm

Hi, this works:

import os
# Set this before building the model and sampling
os.environ['STAN_NUM_THREADS'] = "16"
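
On what kind of variable this is: os.environ changes the operating-system environment of the running Python process, and any child process started afterwards inherits it. The following minimal sketch checks that by launching a second Python interpreter; the value "16" is just the example from above.

```python
import os
import subprocess
import sys

os.environ["STAN_NUM_THREADS"] = "16"

# Start a child Python process and let it print the value it inherited.
child = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ.get('STAN_NUM_THREADS'))"],
    capture_output=True,
    text=True,
    check=True,
)
inherited = child.stdout.strip()
print(inherited)
```

The same effect can be had by exporting the variable in the shell before starting Python; setting it in a different terminal or after sampling has started has no effect on the running job.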

Now a question for you, Ari, or whoever can help.

I cannot get it to work by including

extra_compile_args = ['-pthread', '-DSTAN_THREADS']

in the arguments of model.build() or model.sample(). It tells me that

TypeError: build() got an unexpected keyword argument 'extra_compile_args'

or

ValueError: {'json': {'extra_compile_args': ['Unknown field.']}}


respectively.

Does anyone know how to make it actually compile and sample with many threads?
(Of course, I already have the model with all the reduce_sum() structure, which already works in CmdStan, but I need it to work in PyStan.)

Thank you!

Ezequiel_Alvarez · September 22, 2024, 11:02pm

Hi!

Can you please clarify this statement: can reduce_sum be used with pystan3, or can only map_rect be used with pystan3?

Thank you!
