a) You want to look at convergence of MCMC to the stationary (aka equilibrium) distribution. There are lots of ways to measure that. In cases where you can prove geometric ergodicity, you get exponential decay in total variation distance (i.e., the maximum error in calculating an event probability). Another approach is to look at convergence of square error in expectations. You can also do this empirically by monitoring the trace plot of the log density function. When it asymptotes and then starts bouncing around, you’re there. I really like this paper for a good high-level and not too mathematically intense intro from Roberts and Rosenthal:
Section 3.4 covers geometric ergodicity. Since you asked for math, here are two papers about geometric ergodicity for HMC:
b) We don’t have warmup stopping criteria in Stan. We run HMC/NUTS and hence leapfrog, during warmup. By default, Phase I of warmup runs 75 iterations (when there are 1000 warmup iterations), because we’ve found that works well empirically. Sometimes it takes a little longer. It’d be better to control this somehow, but we haven’t tried yet.