I’ll preface with the caveat that it might well be possible to find sufficient speedup by optimizing your Stan code to avoid your problem entirely. I know you’ve recently developed some fairly well-optimized code (e.g. Help with vectorizing for loops), but also that you’ve asked several more recent questions. If you can avoid this problem by optimizing your code, do that!
Additionally, for what it’s worth, I’ve always managed to convince university clusters to install cmdstan, and in your position I would pursue that until it became obvious that it wasn’t going to happen.
With that said, you can in general break the estimation into several shorter runs, but only after warmup is complete. Once warmup is done, you just need to extract the step size, the inverse metric, and the last draw, and you can restart sampling with warmup turned off, explicitly passing the last draw as inits, and the step size and inverse metric as algorithmic parameters.
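As a sketch of what that extraction can look like when working with raw CmdStan output (assuming the default `diag_e` metric, whose adapted quantities appear as `# Step size = ...` and `# Diagonal elements of inverse mass matrix:` comments in the output CSV; the file names and the toy CSV below are made up for illustration):

```python
import json

def parse_cmdstan_csv(text):
    """Pull the adapted step size, the diagonal inverse metric, and the
    last draw out of a CmdStan output CSV (diag_e metric assumed)."""
    step_size, inv_metric, header, last_row = None, None, None, None
    lines = iter(text.splitlines())
    for line in lines:
        if line.startswith("# Step size"):
            step_size = float(line.split("=")[1])
        elif line.startswith("# Diagonal elements of inverse mass matrix"):
            inv_metric = [float(x) for x in next(lines).lstrip("# ").split(",")]
        elif line.startswith("#"):
            continue
        elif header is None:
            header = line.split(",")
        else:
            last_row = line.split(",")
    # Keep model parameters only; sampler diagnostics end in '__'.
    draw = {n: float(v) for n, v in zip(header, last_row) if not n.endswith("__")}
    return step_size, inv_metric, draw

def to_inits(draw):
    """Regroup flattened names like 'theta.1', 'theta.2' into arrays
    (scalars and 1-D containers only in this sketch)."""
    inits = {}
    for name, val in draw.items():
        base, _, idx = name.partition(".")
        if idx:
            inits.setdefault(base, []).append(val)
        else:
            inits[base] = val
    return inits

# Toy CSV standing in for a real chain's output file:
csv_text = """# model = example
lp__,accept_stat__,stepsize__,treedepth__,n_leapfrog__,divergent__,energy__,mu,theta.1,theta.2
# Adaptation terminated
# Step size = 0.56
# Diagonal elements of inverse mass matrix:
# 1.2, 0.8, 0.9
-5.1,0.9,0.56,3,7,0,6.0,0.1,1.0,2.0
-4.8,0.95,0.56,4,15,0,5.5,0.2,1.1,2.1
"""

step_size, inv_metric, draw = parse_cmdstan_csv(csv_text)
metric_json = json.dumps({"inv_metric": inv_metric})  # write to inv_metric.json
inits_json = json.dumps(to_inits(draw))               # write to inits.json

# Restart with warmup off, reusing the adapted quantities:
cmd = (f"./model sample num_warmup=0 adapt engaged=0 "
       f"algorithm=hmc stepsize={step_size} metric=diag_e "
       f"metric_file=inv_metric.json data file=data.json init=inits.json")
```

The interfaces (cmdstanr, cmdstanpy) expose the same knobs under their own argument names, so the same recipe applies there; check the argument spellings against your CmdStan version.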
Note that because treedepths are typically much deeper early in warmup than later during sampling, it is entirely possible for a chain that takes 15 days to complete to spend 10 of them in warmup. If you cannot fit warmup into your time limit, things get trickier, but can still be made to work. The trick is to run a shorter warmup that does fit within the time limit, followed by one sampling iteration. Then extract the inverse metric, the step size, and the posterior draw, and pass these back to a new run with warmup still turned on, but with a longer adapt_window. To get this right, you’ll need to understand how the windowed phase of adaptation works and choose a reasonable adapt_window based on how many warmup iterations you’ve been able to run so far.
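To make "how the windowed phase works" concrete: warmup is an init_buffer of step-size-only adaptation (default 75 iterations), then a sequence of metric-estimation windows that start at `window` (default 25) and double in size, then a step-size-only term_buffer (default 50); the final metric window is stretched so it ends exactly at num_warmup - term_buffer. Here is a small sketch of that schedule (mirroring the logic of Stan's `windowed_adaptation`, not calling it):

```python
def warmup_windows(num_warmup=1000, init_buffer=75, term_buffer=50, window=25):
    """Boundaries of the metric-estimation windows in Stan's warmup.
    Windows double in size; when the window after the current one would
    overrun num_warmup - term_buffer, the current window absorbs the rest."""
    end = num_warmup - term_buffer
    windows = []
    start, size = init_buffer, window
    while start < end:
        stop = start + size
        if stop + 2 * size >= end:  # next window would overshoot: absorb it
            stop = end
        windows.append((start, stop))
        start, size = stop, size * 2
    return windows

# Default 1000-iteration warmup gives window sizes 25, 50, 100, 200, 500.
print(warmup_windows())
```

The practical upshot: when you resume warmup after accumulating some iterations across restarts, pick an `adapt_window` comparable to the window you would have been in at that point under the normal schedule, so the metric is estimated over a similarly long stretch of draws.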