Would it make any sense to compute something like Rhat on the information that comes out of the warmup period? I'm thinking primarily about the adapted HMC parameters, but the warmup draws of the model parameters themselves might also be helpful. (I know the draws during warmup are not samples from the posterior, but they may still contain information useful for detecting bad adaptation: dramatic differences between chains suggest that the chains explored different regions during warmup, and that warmup may therefore have been insufficient.)
If thresholds were developed for such a diagnostic, a run could be terminated at the end of adaptation whenever adaptation was deemed insufficient.
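To make the idea concrete, here is a rough sketch of what I have in mind, assuming the warmup draws for one quantity (a model parameter, or a per-iteration trace of something like the log step size) are available as a chains-by-draws array. It uses the classic split-Rhat formula rather than the rank-normalized version. The function names and the 1.1 cutoff are placeholders I made up for illustration; nothing here is calibrated for warmup draws, and it is not tied to any particular sampler's API.

```python
import numpy as np


def split_rhat(draws):
    """Classic split-Rhat for an array of shape (n_chains, n_draws)."""
    draws = np.asarray(draws, dtype=float)
    n_chains, n_draws = draws.shape
    if n_draws < 4:
        raise ValueError("need at least 4 draws per chain")
    half = n_draws // 2
    # Split every chain into a first and second half (the middle draw is
    # dropped when n_draws is odd), giving 2 * n_chains shorter chains.
    halves = np.concatenate(
        [draws[:, :half], draws[:, n_draws - half:]], axis=0
    )
    n = halves.shape[1]
    within = halves.var(axis=1, ddof=1).mean()        # W: mean within-chain variance
    between = n * halves.mean(axis=1).var(ddof=1)     # B: between-chain variance
    var_plus = (n - 1) / n * within + between / n     # pooled variance estimate
    return float(np.sqrt(var_plus / within))


def warmup_disagrees(warmup_draws, threshold=1.1):
    """Hypothetical check: True if chains disagree by more than `threshold`.

    The default threshold is an uncalibrated placeholder.
    """
    return bool(split_rhat(warmup_draws) > threshold)


# Simulated stand-ins for warmup draws: four chains that agree, and the
# same four chains shifted so that each sits in a different region.
rng = np.random.default_rng(0)
agreeing = rng.normal(size=(4, 200))
disagreeing = agreeing + np.array([[0.0], [5.0], [10.0], [15.0]])

print(split_rhat(agreeing), warmup_disagrees(agreeing))
print(split_rhat(disagreeing), warmup_disagrees(disagreeing))
```

In practice one would presumably apply this only to the later part of warmup, since the earliest iterations are dominated by the initial values, and what counts as "too different" is exactly the open question.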
Obviously, the very cool ideas for pooled warmup would thwart this, but I thought I'd post the idea anyhow.