Variable running speeds across chains during warmup - possible causes

Many of the models I’ve been running recently get through the early stages of warmup quickly, then suddenly slow down after about 300 iterations.

The speed change seems pretty consistent across chains; it’s not as if one chain is getting stuck or suffering from insufficient CPU.

I’m curious why this might be. Are there different phases of the warmup that might explain this speed variation? My models take quite a long time to run, so if I can better understand the warmup process and possibly remove anything that is unnecessarily slowing them down, that would be a great help.

Yes! Check out the section about the warm-up algorithm (page 44 onwards in https://github.com/stan-dev/cmdstan/releases/download/v2.19.1/cmdstan-guide-2.19.1.pdf)! Also, plot divergences and log10(stepsize) over iterations for a few well-behaved and a few badly-behaved models. The well-behaved and badly-behaved warmup patterns are pretty recognizable visually…
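To make the "different parts" of warmup concrete, here is a minimal Python sketch of the windowed adaptation schedule as I understand it from the Stan documentation: a fast initial buffer, a series of slow windows that double in length (the metric is re-estimated at the end of each), and a fast final buffer. The function name `warmup_windows` is made up for illustration, the defaults (75 / 25 / 50) are the documented ones as far as I know, and the sketch does not handle the rescaling Stan applies when the warmup is too short to fit the buffers.

```python
def warmup_windows(num_warmup=1000, init_buffer=75, term_buffer=50, base_window=25):
    """Sketch of Stan-style warmup phases as half-open iteration ranges.

    Returns (initial_fast, slow_windows, final_fast).
    Hypothetical helper: not part of any Stan interface.
    """
    slow_end = num_warmup - term_buffer
    if init_buffer + base_window > slow_end:
        # Stan rescales the buffers in this case; this sketch simply refuses.
        raise ValueError("warmup too short for this simplified sketch")

    windows = []
    start, size = init_buffer, base_window
    while start < slow_end:
        end = start + size
        # If a following window of twice this size would not fit before the
        # final fast buffer, stretch the current window to the end instead.
        if end + 2 * size > slow_end:
            end = slow_end
        windows.append((start, end))
        start, size = end, size * 2

    return (0, init_buffer), windows, (slow_end, num_warmup)


initial, slow, final = warmup_windows()
print("fast initial buffer:", initial)
for lo, hi in slow:
    print("slow window:", (lo, hi), "length", hi - lo)
print("fast final buffer:", final)
```

With the defaults and 1000 warmup iterations, the slow windows end at iterations 100, 150, 250, 450 and 950. The step size and metric change at each of those boundaries, so a change in speed partway through warmup may line up with one of them, though that is only one possible explanation.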

Thanks @sakrejda, I’ll check it out.

Something I should have stated in my original post is that the models always seem to eventually converge, with sensible results, and no warnings.

Yeah, that adaptation algorithm is quite good! This stuff (and plotting where it goes while it’s struggling) can help you figure out how to make the model more efficient during warm-up. You can get huge wins in run time that way, but if you don’t need 'em it’s a waste of time.
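As a rough sketch of how to pull out the quantities to plot, the Python snippet below extracts log10(stepsize) and the divergence flag per iteration from the text of a CmdStan output CSV. It assumes the run saved warmup draws (`save_warmup=1`) and that the file has the usual `stepsize__` and `divergent__` columns; the function name `read_sampler_diagnostics` is invented here. If you are in rstan instead, `get_sampler_params(fit, inc_warmup = TRUE)` should give you the same columns.

```python
import csv
import math


def read_sampler_diagnostics(csv_text):
    """Return (log10_stepsize, divergent) lists, one entry per saved iteration.

    Hypothetical helper: assumes CmdStan-style CSV text in which comment
    lines start with '#' and the first remaining line is the header.
    """
    # Drop comment lines (including the adaptation summary CmdStan writes
    # between warmup and sampling draws) and blank lines.
    rows = [
        line
        for line in csv_text.splitlines()
        if line.strip() and not line.startswith("#")
    ]
    log10_stepsize, divergent = [], []
    for row in csv.DictReader(rows):
        log10_stepsize.append(math.log10(float(row["stepsize__"])))
        divergent.append(int(float(row["divergent__"])))
    return log10_stepsize, divergent
```

Plot both lists against the iteration index with whatever plotting tool you like. Passing the contents of one chain's output file (for example, `open(path).read()`) keeps the per-chain traces separate, which makes it easier to compare them.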

Are there any R functions you recommend for diagnosing where the sampler goes when it’s struggling? The one I use most is traceplot, but I’m curious whether you recommend any others. Alternatively, are there any plots in shinystan you recommend I look at?