Evaluating parallelization performance

yizhang · September 13, 2019, 2:40pm

Something similar hit me in a different context: if we could control the behavior of some math functions according to where the sampler is, there would be performance gains to be made. For example, in ODE models, even when I know the problem is non-stiff I sometimes have to run it with a stiff solver, because the sampler steps into regions where the solution becomes stiff.
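To make the stiffness point concrete, here is a toy sketch in plain Python (not Stan, and not how Stan's ODE solvers work internally). It assumes a hypothetical one-parameter decay model y' = -k*y with a fixed step size; the function names are made up for illustration. The same explicit method that is fine for small k blows up once the parameter wanders into a large-k region, while an implicit (stiff-style) step stays bounded.

```python
def explicit_euler_decay(k, h=0.05, t_end=1.0):
    """Integrate y' = -k*y, y(0) = 1 with fixed-step explicit Euler."""
    y = 1.0
    for _ in range(round(t_end / h)):
        y += h * (-k * y)  # only stable when h * k < 2
    return y


def implicit_euler_decay(k, h=0.05, t_end=1.0):
    """Same problem with implicit (backward) Euler, stable for any k > 0."""
    y = 1.0
    for _ in range(round(t_end / h)):
        y = y / (1.0 + h * k)  # closed-form solve of y_new = y + h * (-k * y_new)
    return y


# Hypothetical parameter values a sampler might visit during warmup.
for k in (1.0, 10.0, 1000.0):
    print(f"k={k:g}  explicit={explicit_euler_decay(k):.3e}  "
          f"implicit={implicit_euler_decay(k):.3e}")
```

A real adaptive non-stiff solver would not blow up like this; it would shrink its step size and become very slow instead, which is the cost that pushes one toward the stiff solver for the whole run.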

Within-chain parallelization can be done in various flavors, and currently we are not designing it in an organized way (which is the point of this thread, I guess). Some things are easier or better done at the distributed level, and others at the thread level. Even within the same level we can have multiple tiers. Unless we reach a design decision, there are going to be many iterations in future designs.
