Multiple chains and posterior exploration

Hi!

I read in another thread that one could run more chains with fewer post-warmup iterations each to speed up posterior sampling.

But pushed to the extreme, this gives me a strange feeling: all things being equal, would a joint posterior distribution be explored as well with 200 chains computing 10 post-warmup iterations each as with 2 chains computing 1000 post-warmup iterations?
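
For concreteness, here is roughly what the two setups would look like in CmdStanPy (just a sketch; the model file name and the parallelism setting are placeholders, not something from this thread):

```python
# Sketch: the two configurations compared above, using CmdStanPy.
# 'model.stan' is a placeholder for whatever model is being fit.
from cmdstanpy import CmdStanModel

model = CmdStanModel(stan_file="model.stan")

# 2 chains x 1000 post-warmup iterations: 2000 total draws.
fit_long = model.sample(chains=2, iter_warmup=1000, iter_sampling=1000)

# 200 chains x 10 post-warmup iterations: also 2000 total draws.
# Note that each chain still pays the full warmup cost, so wall time
# only drops if enough chains actually run in parallel.
fit_wide = model.sample(chains=200, iter_warmup=1000, iter_sampling=10,
                        parallel_chains=8)
```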

My intuition says no, for we need multiple chains to detect when a chain only appears stationary because it is stuck in a particular region of the posterior space.

I guess there should be mathematical demonstrations of this kind of property. I hope I am not asking a trivial question, but I do not have enough background to find the information by myself :)

Lucas

You wouldn’t want 2 post-warmup iterations for anything like Rhat that calculates within-chain variance. In general, for a fixed number of total draws, having more chains and fewer draws per chain gives you a better chance of discovering that some of the chains are problematic due to difficulties with the posterior geometry. I don’t think the overall result is going to have better mixing until we pool adaptation information across chains.
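
To make the within-chain variance point concrete, here is a minimal numpy sketch of the classic split-Rhat computation (not the newer rank-normalized version Stan now reports). Each chain is split in half before comparing variances, so with 2 post-warmup draws per chain each split half contains a single draw and the within-chain variance W is not even defined:

```python
# Minimal split-Rhat sketch (classic Gelman-Rubin form), to show how the
# within-chain variance W enters the diagnostic.
import numpy as np

def split_rhat(draws):
    """draws: array of shape (n_chains, n_iter)."""
    n_chains, n_iter = draws.shape
    half = n_iter // 2
    # Split each chain in half so drift within a chain also shows up.
    split = draws[:, :2 * half].reshape(n_chains * 2, half)
    w = split.var(axis=1, ddof=1).mean()        # within-chain variance W
    b = half * split.mean(axis=1).var(ddof=1)   # between-chain variance B
    var_hat = (half - 1) / half * w + b / half
    return np.sqrt(var_hat / w)

rng = np.random.default_rng(1)
print(split_rhat(rng.normal(size=(2, 1000))))  # 2 chains x 1000 draws
print(split_rhat(rng.normal(size=(200, 10))))  # 200 chains x 10 draws
print(split_rhat(rng.normal(size=(200, 2))))   # W undefined -> nan
```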


Depending on the computing setup, we should not necessarily think of the total number of iterations, or even the total number of leapfrog steps, as a constant. If you have access to parallel computing, the alternative to 2 chains with 1000 iterations each could be 100 chains with 1000 iterations each.

@ldeschamps was suggesting 10, which is the number @betanalpha cited previously, but even that may be problematic for the reasons @betanalpha mentions: not being able to diagnose stuck chains effectively, which will result in a biased posterior.

Thanks all for the answers :)

My question was kind of conceptual: if one wants to reduce wall time by reducing iterations and multiplying chains, what are the trade-offs to consider? Keeping enough iterations to produce viable diagnostics and being able to spot stuck chains are two great suggestions!

I guess one suggestion would be to use map_rect instead, but multiplying chains might be more interesting if one has a complex model to fit with a relatively low number of observations, or if information transfer among cores is too expensive (how could it be?).


Excellent way of putting it!

The important question is then “how long does it take to produce viable diagnostics?” Diagnostics like Rhat are iteration-hungry: it takes a good number of effective samples to be able to resolve differences between the chains, which is why Rhat often misses pathologies. Diagnostics like divergences are more sensitive, but you still need reasonably long chains to get enough divergences to identify where in the model the pathology is manifesting.
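
As a rough illustration of the divergence point, assuming a CmdStanPy fit object like the ones sketched earlier in the thread, one can pull out the per-draw divergence indicators and count them per chain (method_variables is, to my knowledge, the CmdStanPy accessor for sampler diagnostics):

```python
# Sketch: count divergent transitions per chain from a CmdStanPy fit.
# `fit` is assumed to be a CmdStanMCMC object from model.sample(...).
div = fit.method_variables()["divergent__"]  # shape: (n_draws, n_chains)
per_chain = div.sum(axis=0).astype(int)
print("divergences per chain:", per_chain)
# With only ~10 post-warmup draws per chain, a pathological region may
# produce zero divergences in most chains purely by chance, so very
# short chains weaken this diagnostic as well.
```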

Ultimately you need to run each chain long enough to get reasonable expectation estimates, so by the time the diagnostics are really robust the chain will almost be long enough for your inferential goal! There is some wiggle room, however, and there is potential opportunity for moderate speed-ups with 2-10 chains. Any more than that should be considered only for improving diagnostics (more opportunities to randomly fall onto a pathology early), not for speeding up inferences.
