Why are 4 chains used?

billdenney · February 12, 2019, 2:44am

Why are 4 chains used as the default with Stan (and many Monte Carlo-based methods)? Specifically, I understand why multiple chains are a good idea (robustness to initial conditions). I can work my way through the fact that 3 chains are better than two at least at a minimal level (you get a median!). But, why 4? Why not 5 or 6? Why not 3?

The only reason I can immediately come up with (with minimal research) is that 3 chains are a good idea, and if you’re running 3 chains on multiple cores, a 4th probably doesn’t take any more time, because core counts are usually even (typically a power of 2).
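As a rough sketch of that intuition: assume each chain runs on its own core and all chains take about the same time (both are simplifications). The helper name `sampling_rounds` is hypothetical and not part of Stan.

```python
import math


def sampling_rounds(chains: int, cores: int) -> int:
    """Number of back-to-back batches needed to run all chains.

    Assumes one chain per core and equal run time per chain, so wall-clock
    time is proportional to the number of batches.
    """
    return math.ceil(chains / cores)


# On a 4-core machine, a 4th chain fits in the same batch as the first 3;
# a 5th chain would need a second batch.
for chains in (3, 4, 5):
    print(chains, "chains on 4 cores:", sampling_rounds(chains, 4), "round(s)")
```

Under these assumptions, going from 3 to 4 chains on 4 cores costs no extra wall-clock time, while going from 4 to 5 roughly doubles it.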

bgoodri · February 12, 2019, 3:28am

It is mostly the even number of cores thing, plus the fact that with Stan the sampling is usually pretty efficient, or else you get warnings implying that it is not. So, 1000 warmup iterations followed by 1000 kept iterations on each of 4 chains gives 4000 nominal draws, which still gets you an effective sample size of 400 even if the effective sample size is only 10% of the nominal one. And 400 effective draws is good enough for most purposes.
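The arithmetic behind that figure can be checked in a few lines. This is only a sketch: `pooled_ess` is a hypothetical helper, not a Stan function, and the 10% efficiency is the pessimistic illustration from this reply rather than a measured value.

```python
def pooled_ess(chains: int, kept_per_chain: int, efficiency_percent: int) -> int:
    """Effective sample size pooled across chains.

    efficiency_percent is the assumed ratio of effective to nominal draws,
    given as a whole-number percentage to keep the arithmetic exact.
    """
    nominal_draws = chains * kept_per_chain
    return nominal_draws * efficiency_percent // 100


# 1000 kept draws per chain at an assumed 10% efficiency.
for chains in (3, 4, 5, 6):
    print(chains, "chains:", pooled_ess(chains, 1000, 10), "effective draws")
```

With 4 chains this gives 400 effective draws; 3 chains under the same assumptions would give 300.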

billdenney · February 12, 2019, 9:12pm

Thank you. It’s good to know that my intuition was reasonable.
