Stan threading using the sample method: Redistribution of threads when some chains are complete?

Garren_Hermanus · April 17, 2024, 2:24pm

Hi all

Just wanted to know how Stan redistributes threads when a chain finishes sampling. I ran 4 chains with 8 threads in total (2 threads per chain). When a chain completes, I see that all 8 threads are still in use. To me this indicates that the freed-up threads are redistributed to the chains that are still running. If so, how are they redistributed? I.e., does Stan divide the number of free threads by the number of running chains and distribute them equally (if possible), or is there some other method?

Bob_Carpenter · April 23, 2024, 6:31pm

The thread pool is all handled through the pooling supplied by the Intel Threading Building Blocks (TBB). It’s not clever enough to do smart allocation by number of chains. @stevebronder will know the details.

stevebronder · April 23, 2024, 7:26pm

We use TBB’s threading scheduler to manage nested parallelism, and tbh I’ve never looked at the underlying code for how their scheduler works. But in general, as tasks are completed and threads are recovered by the scheduler, it prioritizes higher-level threads in the nested parallelism first and then gives the remaining threads to the lower-level parallel jobs. It runs the lower-level parallel loops in small batches so that when threads become available, work can be stolen from other threads. That work stealing is most likely why you see the same number of threads in use after one chain finishes.

You can check out the TBB docs or source code for details on their thread scheduler:

https://oneapi-src.github.io/oneTBB/main/tbb_userguide/Task-Based_Programming.html
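To make the batching idea concrete, here is a toy, deterministic sketch in Python. It is not TBB's scheduler and not Stan code: the function `simulate`, the round-robin hand-out, and the one-tick batch cost are all assumptions made for illustration. It only shows why a fixed pool stays fully busy after one chain finishes, provided the remaining chains still have small batches waiting.

```python
from collections import deque


def simulate(batches_per_chain, n_workers):
    """Toy model of a shared worker pool (hypothetical, not TBB's algorithm).

    Each chain owns a queue of small batches, and every batch costs one tick.
    On each tick, up to n_workers batches are handed out round-robin across
    the chains that still have work. Returns the busy-worker count per tick.
    """
    queues = [deque(range(n)) for n in batches_per_chain]
    busy_per_tick = []
    while any(queues):  # an empty deque is falsy
        busy = 0
        active = deque(q for q in queues if q)
        while busy < n_workers and active:
            q = active.popleft()
            q.popleft()           # one worker takes one batch from this chain
            busy += 1
            if q:                 # chain still has batches: back of the line
                active.append(q)
        busy_per_tick.append(busy)
    return busy_per_tick


# Chain 0 is short and finishes after the second tick; the pool stays full
# until the remaining chains run out of batches.
print(simulate([4, 16, 16, 16], n_workers=8))  # prints [8, 8, 8, 8, 8, 8, 4]
```

In this sketch the workers are never "reassigned" to a chain. They simply pick up whatever batch is available next, which is roughly the effect the work stealing described above has in practice.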
