Did I implement reduce_sum correctly?

First, I’d check that you’re getting the same answers (up to seed-to-seed variation) from runs with reduce_sum as from runs without it. It’s generally considered easier to optimize a correct program than to debug an optimized one, or as Knuth famously wrote, “premature optimization is the root of all evil.”
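One concrete way to run that check is to write the likelihood both ways in the same program and confirm the fits agree. Here’s a minimal sketch, assuming a hypothetical normal model (the names `partial_sum`, `mu`, and `sigma` are illustrative, not from your code); the `reduce_sum` call should match the commented-out direct sum.

```stan
functions {
  // log density summed over one slice of the data
  real partial_sum(array[] real y_slice, int start, int end,
                   real mu, real sigma) {
    return normal_lpdf(y_slice | mu, sigma);
  }
}
data {
  int<lower=0> N;
  array[N] real y;
  int<lower=1> grainsize;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  // parallel version; should agree with the direct version
  //   target += normal_lpdf(y | mu, sigma);
  // up to floating-point reordering of the sum
  target += reduce_sum(partial_sum, y, grainsize, mu, sigma);
}
```

Because `reduce_sum` partitions the data nondeterministically, the two versions will only agree up to floating-point reordering of the sum; if you need a reproducible partition for debugging, `reduce_sum_static` with a fixed grainsize gives deterministic slices.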

This is often true if you use our random initializations, which are uniform between -2 and 2 on the unconstrained parameter scale. It’s not uncommon to see a factor of 2 or more difference in speed across seeds. Reducing the range from (-2, 2) to (-0.5, 0.5) can often stabilize things, at the risk of missing outlier modes, failing to diagnose bad mixing, etc. This is usually OK to do; @andrewgelman, for example, has been urging us to use less diffuse initializations (which seems to be a reversal of decades of advice urging people to use more diffuse initializations to diagnose poor mixing).
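If you’re running via CmdStan, for instance, the initialization range is controlled by the `init` argument: a positive real `x` draws initial values uniformly from (-x, x) on the unconstrained scale. A sketch (the model and data file names are hypothetical):

```shell
# default inits are uniform(-2, 2) on the unconstrained scale;
# init=0.5 narrows that to uniform(-0.5, 0.5)
./my_model sample init=0.5 data file=my_data.json
```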

reduce_sum is only likely to provide a big speed boost when the amount of work done on each thread (or process, if using MPI) dominates the communication cost. For example, it’s almost never worth it for a simple GLM, but it’s almost always worth it for models with ordinary differential equation solvers nested inside the likelihood, or for models that require a lot of matrix solves. All of the code inside your reduce_sum computation appears to be relatively simple arithmetic.
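To illustrate the kind of model where it does pay off, here’s a hedged sketch of one ODE solve per observation inside the partial-sum function (a hypothetical exponential-decay model; `decay`, `k`, `subj`, etc. are illustrative names, not your code). Each slice does an expensive `ode_rk45` call, so the per-thread work swamps the communication overhead:

```stan
functions {
  vector decay(real t, vector y, real k) {
    return -k * y;  // simple one-compartment exponential decay
  }
  real partial_sum(array[] real y_slice, int start, int end,
                   array[] vector y0, array[] real t_obs,
                   array[] int subj, vector k, real sigma) {
    real lp = 0;
    for (n in start:end) {
      // one ODE solve per observation: this is the expensive work
      // that makes parallel slices worthwhile
      array[1] vector[1] sol
        = ode_rk45(decay, y0[n], 0, {t_obs[n]}, k[subj[n]]);
      // y_slice is the sliced argument, so it is indexed locally
      lp += normal_lpdf(y_slice[n - start + 1] | sol[1][1], sigma);
    }
    return lp;
  }
}
data {
  int<lower=0> N;
  int<lower=1> J;
  array[N] int<lower=1, upper=J> subj;
  array[N] vector[1] y0;
  array[N] real<lower=0> t_obs;
  array[N] real y_obs;
  int<lower=1> grainsize;
}
parameters {
  vector<lower=0>[J] k;
  real<lower=0> sigma;
}
model {
  k ~ lognormal(0, 1);
  sigma ~ lognormal(0, 1);
  target += reduce_sum(partial_sum, y_obs, grainsize,
                       y0, t_obs, subj, k, sigma);
}
```

Note the indexing convention: shared arguments (`y0`, `t_obs`, `subj`) use the global index `n`, while the sliced argument (`y_slice`) uses the local index `n - start + 1`; mixing these up is a common source of wrong answers with reduce_sum.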