Provided that i) the same proportion of iterations is allocated to warmup, ii) the same control parameters are used, and iii) the model fits well, would the runtime for, say, 2000 iterations be 10x that for 200 iterations?

I’m using an HPC cluster to fit a simple multi-level model to a very large dataset (>1 million observations, 1000s of groups for random intercepts and for random slopes across 5 predictors). I know that estimating the runtime of a model a priori is not really possible, but knowing whether it could be extrapolated from shorter runs would be incredibly useful for planning and requesting resources from the cluster.

I’d expect this to be more-or-less true for the sampling phase in a well-specified model, but the length of the warmup phase is also used to determine the length of the adaptation schedule, so the relationship between number of iterations and wall clock time could be more complicated there.

Note that if the same proportion of iterations is allocated to warmup, then we would expect that (up to a point) the time per sampling iteration tends to decrease as warmup becomes longer, because a longer warmup gives the sampler more opportunity to tune its step size and metric.

To understand how long a run will take, you need to know roughly how long each leapfrog step takes and then how many leapfrog steps will be required. The former is usually nearly constant for well-behaved models (but not always, e.g. ODE models where solves are faster in some regions of parameter space). The latter is hard to estimate, because you can’t get a good idea of how many leapfrog steps are required per iteration until the sampler has successfully adapted to the posterior, and that adaptation is so computationally expensive that it’s already an appreciable fraction (sometimes up to half) of the total wall time for the entire process.
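As a rough illustration of that decomposition, here is a minimal Python sketch that projects sampling time from a short pilot run. It assumes you already have the pilot's sampling-phase wall time and its per-iteration leapfrog counts (in Stan output these are in the `n_leapfrog__` column). The function names and the numbers are hypothetical, and the projection only makes sense if the pilot's warmup had actually finished adapting, so treat the result as a ballpark rather than a guarantee. It covers the sampling phase only, not warmup.

```python
from statistics import mean


def seconds_per_leapfrog(pilot_sampling_seconds, pilot_n_leapfrog):
    """Average wall time of one leapfrog step in the pilot's sampling phase."""
    total_steps = sum(pilot_n_leapfrog)
    if total_steps <= 0:
        raise ValueError("pilot run must contain at least one leapfrog step")
    return pilot_sampling_seconds / total_steps


def project_sampling_seconds(pilot_sampling_seconds, pilot_n_leapfrog,
                             target_iterations):
    """Project sampling-phase wall time for a longer run.

    Assumes (hypothetically) that cost per leapfrog step and leapfrog steps
    per iteration stay the same as in the pilot.
    """
    per_step = seconds_per_leapfrog(pilot_sampling_seconds, pilot_n_leapfrog)
    steps_per_iteration = mean(pilot_n_leapfrog)
    return per_step * steps_per_iteration * target_iterations


# Made-up pilot: 100 sampling iterations, 31 leapfrog steps each, 620 seconds.
pilot_counts = [31] * 100
estimate = project_sampling_seconds(620.0, pilot_counts, target_iterations=1000)
print(f"projected sampling time: {estimate:.0f} s")
```

Keeping the two factors separate makes it easier to see which one moved if a longer run turns out slower than projected: a higher cost per leapfrog step, or more leapfrog steps per iteration.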