Any way to speed up warmup?

I have a model that spends nearly half the time on the first 100 samples. For example,

Chain 1:  Elapsed Time: 1077.08 seconds (Warm-up)
Chain 1:                126.284 seconds (Sampling)
Chain 1:                1203.36 seconds (Total)

Given the current version of Stan, without pooled warmup, is there any way to speed up overall runtime?

Warmup estimates the variances of the parameters, right? Could providing better starting values help?

What helps is to make smart guesses about the posterior sd and then scale each parameter so that its posterior sd is roughly 1. You can’t know the posterior sd in advance, but a rough proxy suffices.
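
To make the rescaling idea concrete, here is a minimal sketch in plain Python. It assumes you have some rough draws for a parameter, for example from a short pilot run or prior simulation; the function names `unit_scale`, `to_unit` and `from_unit` are made up for this illustration and are not part of any Stan interface.

```python
import statistics


def unit_scale(draws):
    """Return (center, scale) so that (x - center) / scale has sd of about 1.

    `draws` is any rough proxy for the posterior, e.g. pilot-run draws.
    """
    center = statistics.mean(draws)
    scale = statistics.stdev(draws)
    if scale == 0:
        raise ValueError("draws are constant; cannot derive a scale")
    return center, scale


def to_unit(x, center, scale):
    """Map a value from the natural scale to the unit scale."""
    return (x - center) / scale


def from_unit(z, center, scale):
    """Map a unit-scale value back to the natural scale."""
    return center + scale * z


# Hypothetical pilot draws for a parameter that lives around 1000.
pilot = [980.0, 1010.0, 1025.0, 995.0, 990.0]
center, scale = unit_scale(pilot)
standardized = [to_unit(x, center, scale) for x in pilot]
```

In the Stan program itself, the same effect can be had by sampling the unit-scale quantity and transforming it back, or by declaring the parameter with an affine transform such as `real<offset=center, multiplier=scale> theta;`, with `center` and `scale` passed in as data.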

The other thing is ensuring a good posterior geometry, which hinges on your parametrisation. Explore the pairs plots to check that.

Finally, you can save the mass matrix and have a look at it (get_adaptation_info in rstan). In cmdstan you can even reuse it.

I tried get_adaptation_info. Do I understand correctly that if any of these values is far from 1.0, the corresponding parameter is badly scaled?

I’ve enjoyed success with this idea using cmdstanr. I run the model once to figure out approximate values for the diagonal mass matrix. Subsequent runs can complete in half the time!
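
For anyone who wants to try this outside of R, here is a rough sketch of the bookkeeping in plain Python. It assumes the output CSV of the first run contains the usual CmdStan comment lines after adaptation (`# Step size = ...` and `# Diagonal elements of inverse mass matrix:` followed by one comment line of values); check your own files, since the exact layout may differ between versions. The helper names `read_adaptation` and `metric_json` are made up for this example.

```python
import json


def read_adaptation(csv_text):
    """Pull the adapted step size and diagonal inverse metric out of the
    comment lines of a CmdStan output CSV (passed in as one string)."""
    step_size = None
    inv_metric = None
    lines = csv_text.splitlines()
    for i, line in enumerate(lines):
        if line.startswith("# Step size"):
            step_size = float(line.split("=")[1])
        elif line.startswith("# Diagonal elements of inverse mass matrix"):
            if i + 1 < len(lines):
                # The values sit on the next comment line, comma separated.
                values = lines[i + 1].lstrip("# ").split(",")
                inv_metric = [float(v) for v in values]
    return step_size, inv_metric


def metric_json(inv_metric):
    """Serialize the diagonal in the JSON layout CmdStan reads as a metric file."""
    return json.dumps({"inv_metric": inv_metric})


# Hypothetical excerpt from the CSV of a first run.
example = (
    "# Adaptation terminated\n"
    "# Step size = 0.809818\n"
    "# Diagonal elements of inverse mass matrix:\n"
    "# 0.961989, 0.0199333\n"
)
step_size, inv_metric = read_adaptation(example)
metric_text = metric_json(inv_metric)
```

The JSON text can be written to a file and handed to a later run through CmdStan's `metric_file` argument, together with the saved step size. In cmdstanr, `fit$inv_metric()` and the `inv_metric` argument of `$sample()` should cover the same ground without any parsing.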

Just for completeness for people searching for this later:
Another trick, easy to carry out but helpful only in specific scenarios, is to cap the treedepth just above what the sampler uses after adaptation. During early adaptation the sampler sometimes spends a lot of time at high treedepths because it is not yet adapted, whereas the same adaptation could be reached faster by running more iterations at a lower treedepth. So with this trick the adaptation may need more iterations, but the early iterations should run faster (in those specific cases where the trick helps).
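
A small sketch of how one might pick the cap, assuming you have the post-warmup `treedepth__` values from an earlier run of the same model. The function name `suggest_max_treedepth` and the default margin of 1 are my own choices for illustration, not a recommendation from Stan.

```python
def suggest_max_treedepth(treedepths, margin=1):
    """Return a treedepth cap just above the largest depth seen after warmup.

    `treedepths` holds the post-warmup treedepth__ values of a previous run.
    """
    if not treedepths:
        raise ValueError("need at least one treedepth value")
    return max(treedepths) + margin


# Hypothetical post-warmup treedepths from a previous fit.
previous = [3, 4, 4, 5, 4, 3]
cap = suggest_max_treedepth(previous)
```

The resulting value goes into the sampler's treedepth setting: `control = list(max_treedepth = ...)` in rstan, `max_treedepth` in cmdstanr, or `max_depth` in CmdStan.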

Hopefully this trick will be obsolete when the new “campfire” adaptation routine comes out.

Sounds great! Looking forward to it!