Help with reduce_sum

mitzimorris · July 31, 2020, 11:29pm

I’ve worked through the long discussions on Discourse in "Help with naming threading argument", specifically Aki’s comment (#10 by avehtari), which presumes a machine with 8 available CPUs. I’m not sure what cores means; maybe there’s a difference between R and Python? CmdStanPy uses the concurrent.futures module, which handles the calls to the subprocess module.

The sample method used to have args chains and cores; we’ve renamed cores to parallel_chains. The old logic was that if parallel_chains is greater than the number of CPUs, parallel_chains is set to the number of CPUs. The question is whether or not to take threads_per_chain into account as well.
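For concreteness, the old capping rule can be sketched roughly as below. This is only an illustration of the rule as described, not the actual CmdStanPy source; the function name `cap_parallel_chains` and the `cpu_count` argument are made up for the example.

```python
import os


def cap_parallel_chains(parallel_chains, cpu_count=None):
    """Hypothetical helper: cap parallel_chains at the number of CPUs.

    cpu_count can be passed explicitly for testing; otherwise it is
    whatever the OS reports (which may include hyperthreads).
    """
    if cpu_count is None:
        # os.cpu_count() can return None, so fall back to 1.
        cpu_count = os.cpu_count() or 1
    # Never run more chains in parallel than there are CPUs,
    # and never fewer than one.
    return max(1, min(parallel_chains, cpu_count))
```

Note that this ignores threads_per_chain entirely, which is exactly the open question.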

From the discussions here and the CmdStanR GitHub issue, I’m a little unclear about what CmdStanR does. Was the final decision that if the user specifies both parallel_chains and threads_per_chain, then that many chains are run in parallel, no matter how many CPUs are available?

This is somewhat dangerous, because code takes on a life of its own. If someone codes up an analysis and runs it on their 32-core workstation with chains=6, parallel_chains=6, threads_per_chain=5, all is fine. If someone else has a laptop with an Intel dual-core processor, which really only has 2 CPUs (Intel’s hyperthreading reports that it has 4 CPUs), and runs the script as written, what is the best thing to do?

The PR I’ve submitted tries to adjust down the number of parallel_chains. I’ll use the reduce_sum case study and do some timing experiments on the various machines I have access to.
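One possible way to adjust parallel_chains downward while taking threads_per_chain into account is sketched below. This is an assumption about what such an adjustment could look like, not necessarily what the PR does; the function name `adjust_parallel_chains` and the integer-division rule are hypothetical.

```python
def adjust_parallel_chains(chains, parallel_chains, threads_per_chain, cpu_count):
    """Hypothetical rule: keep parallel_chains * threads_per_chain
    within cpu_count where possible, but always run at least one chain."""
    # No point in running more chains in parallel than there are chains.
    parallel_chains = min(parallel_chains, chains)
    # Number of chains that fit if each chain uses threads_per_chain CPUs.
    fits = cpu_count // max(1, threads_per_chain)
    return max(1, min(parallel_chains, fits))


# 32-core workstation: 6 chains x 5 threads = 30 threads, which fits.
workstation = adjust_parallel_chains(6, 6, 5, cpu_count=32)
# Dual-core laptop reporting 4 CPUs: falls back to one chain at a time.
laptop = adjust_parallel_chains(6, 6, 5, cpu_count=4)
```

Under this rule the workstation keeps all 6 chains in parallel, while the laptop runs the chains one at a time, still oversubscribed at 5 threads on 4 reported CPUs. Whether threads_per_chain should also be reduced in that case is a separate decision.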
