Stan threading using the sample method: Redistribution of threads when some chains are complete?

Hi all

Just wanted to know how Stan redistributes threads when a chain finishes sampling. I ran 4 chains with 8 threads (2 threads per chain). When a chain completes, I see that 8 threads are still being used. To me this indicates that the freed-up threads are redistributed to the chains that are still running. If so, how are they redistributed? I.e. does Stan divide the number of free threads by the number of running chains and distribute them equally (if possible), or does it use some other method?

The thread pool is handled entirely by the Intel Threading Building Blocks (TBB) library. It’s not clever enough to do smart allocation by number of chains. @stevebronder will know the details.

We use TBB’s task scheduler to manage nested parallelism, and tbh I’ve never looked at the underlying code for how their scheduler works. But in general, as tasks complete and threads are recovered by the scheduler, it prioritizes the higher level of the nested parallelism first and then gives the remaining threads to the lower-level parallel jobs. It runs the lower-level parallel loops in small batches so that when threads become available, they can steal work from other threads. That work stealing is most likely what you are seeing when one chain finishes and the same number of threads is still in use.
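
To make the "small batches from a shared pool" idea concrete, here is a toy discrete-time model in Python. It is not TBB's scheduler and not Stan code. The function name `simulate`, the round-robin hand-out, and the batch counts are all hypothetical choices for illustration. It also assumes each chain is just a bag of independent, equal-cost batches, which ignores the fact that iterations within a chain run one after another. The only point is that with no per-chain thread quota, workers stay busy as long as any chain has batches pending.

```python
from collections import deque


def simulate(pool_size, batches_per_chain):
    """Toy model of a shared worker pool (NOT TBB's actual algorithm).

    Each tick, every worker takes at most one pending batch. Batches are
    handed out round-robin over the chains that still have work, so there
    is no fixed per-chain allocation of workers.

    Returns a list of (tick, running_chains, busy_workers) tuples.
    """
    remaining = list(batches_per_chain)
    history = []
    tick = 0
    while any(r > 0 for r in remaining):
        # Chains that still have at least one batch at the start of the tick.
        running = [i for i, r in enumerate(remaining) if r > 0]
        queue = deque(running)
        busy = 0
        while busy < pool_size and queue:
            chain = queue.popleft()
            remaining[chain] -= 1  # one worker processes one batch
            busy += 1
            if remaining[chain] > 0:
                queue.append(chain)  # chain still has work, keep it in rotation
        history.append((tick, len(running), busy))
        tick += 1
    return history


if __name__ == "__main__":
    # Hypothetical workload: 8 workers, 4 chains with unequal amounts of work.
    for tick, running, busy in simulate(8, [4, 8, 16, 16]):
        print(f"tick {tick}: {running} chains running, {busy} workers busy")
```

In this model the busy-worker count only drops once the total pending work is smaller than the pool, which matches the observation that all 8 threads stay in use after a chain finishes. How TBB actually chooses which work a free thread picks up is not modelled here.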

You can check out the TBB docs or source code for details on their task scheduler:

https://oneapi-src.github.io/oneTBB/main/tbb_userguide/Task-Based_Programming.html
