Using samples from the adaptation / warmup phase in NUTS

@andrjohns did you mean that (2) happens relatively quickly and (1) takes longer? The dual averaging for the step size settles pretty quickly, but the windowed adaptation for the metric is slow (and the step size then has to be re-tuned every time the metric updates). Neither the dual averaging nor the metric adaptation can start adapting to the posterior itself until the typical set has been found.
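For a rough sense of why the metric adaptation is the slow part, here is a minimal Python sketch of the windowed warmup schedule. It assumes Stan's documented defaults (`init_buffer = 75`, `term_buffer = 50`, base `window = 25`, with windows doubling in size) and 1000 warmup iterations. The function name `warmup_windows` is made up for illustration, and this is not Stan's actual code, so edge cases may differ.

```python
def warmup_windows(num_warmup=1000, init_buffer=75, term_buffer=50, base_window=25):
    """Return (start, end) iteration ranges of the metric-adaptation windows.

    Hypothetical helper: assumes Stan's default buffer sizes and a
    doubling window size; not taken from Stan's source.
    """
    end_slow = num_warmup - term_buffer  # metric adaptation stops here
    windows = []
    start, size = init_buffer, base_window
    while start < end_slow:
        end = start + size
        # If the next (doubled) window would not fit before the final
        # step-size-only buffer, stretch the current window to the end.
        if end + 2 * size > end_slow:
            end = end_slow
        windows.append((start, end))
        start, size = end, size * 2
    return windows


for start, end in warmup_windows():
    print(f"metric window: iterations {start}-{end - 1} ({end - start} draws)")
```

Under those assumptions the metric is only re-estimated five times (after windows of 25, 50, 100, 200 and 500 draws), and the last window alone takes up half of warmup, while the step size is re-tuned after each of those updates.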

@David2 I’m pretty sure there’s no guarantee of detailed balance while warmup is still running, since the step size and metric are still changing between iterations. So even if the standard convergence diagnostics look OK (and Stan’s folded R-hats should be sensitive here), my guess is that you’d meet with some skepticism if you tried to use warmup samples for inference.