About the regularized covariance estimation for the Euclidean metric hyperparameter in HMC

Hi,

I noticed that the Euclidean metric hyperparameter in HMC can be set to one of several choices (“unit_e”, “diag_e”, and “dense_e”), and the choice of metric strongly influences the convergence of the chains. I read the Stan reference manual and found that the “diag_e” and “dense_e” metrics are estimated using some regularization method. I am very curious: what kind of regularization is used?

Thank you!


We take a penalized maximum likelihood point estimate. We just shrink the diagonal estimates toward zero and shrink the covariance matrix toward the diagonal. Nothing fancy. There is a bit more information in the reference manual, but the actual values of the shrinkage and regularization are hard-coded in the source and not user configurable:
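As a rough illustration of this kind of shrinkage (a sketch only; the actual constants are hard-coded in Stan's C++ adaptation code and may differ from the ones used here), the sample covariance is blended toward a small multiple of the identity, with the weight on the empirical estimate growing as the number of warmup draws grows:

```python
# Illustrative sketch of shrinkage toward the identity during warmup
# adaptation. The weighting scheme and the constants (5.0, 1e-3) are
# assumptions for illustration, not Stan's exact hard-coded values.
def regularize_covariance(sample_cov, n):
    """Shrink a sample covariance toward a small multiple of the identity.

    sample_cov: square matrix as a list of lists; n: number of warmup draws.
    """
    dim = len(sample_cov)
    w = n / (n + 5.0)  # weight on the empirical estimate; -> 1 as n grows
    out = [[w * sample_cov[i][j] for j in range(dim)] for i in range(dim)]
    for i in range(dim):
        # the identity target keeps diagonal entries away from zero
        out[i][i] += 1e-3 * (1.0 - w)
    return out
```

The effect is exactly what the answer describes: off-diagonal entries are scaled toward zero (shrinking the matrix toward the diagonal), and with few draws the whole estimate stays close to a scaled identity.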

In the future, it’d be nice to have a fourth option where we use a low-rank plus diagonal approximation of the covariance. The problem with the dense case is that the leapfrog steps become \mathcal{O}(N^2) in the dimension N, because generating a multivariate normal momentum requires a matrix-vector product, and it requires at least one \mathcal{O}(N^3) operation to Cholesky factor the metric. I believe there may be more efficient low-rank updates we could apply directly to the Cholesky factors, but we’re not doing that now. With low-rank plus diagonal, those costs drop to \mathcal{O}(N \cdot J) and \mathcal{O}(N \cdot J^2) for rank J \ll N.
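To make the \mathcal{O}(N \cdot J) claim concrete, here is a hypothetical sketch (the function name and representation are mine, not Stan's) of the key trick: with a metric of the form M = D + U U^\top, a matrix-vector product never needs to form the N \times N matrix.

```python
# Hypothetical sketch (not Stan code): multiply M = D + U @ U.T by a
# vector v in O(N * J) work, rather than the O(N^2) a dense N x N
# metric would require.
def lowrank_diag_matvec(diag, U, v):
    """diag: length-N list, U: N x J list of lists, v: length-N list."""
    N = len(v)
    J = len(U[0])
    # t = U.T @ v  -- O(N * J)
    t = [sum(U[i][j] * v[i] for i in range(N)) for j in range(J)]
    # M @ v = diag * v + U @ t  -- another O(N * J)
    return [diag[i] * v[i] + sum(U[i][j] * t[j] for j in range(J))
            for i in range(N)]
```

The same structure admits an \mathcal{O}(N \cdot J^2) factorization via the Woodbury identity, which is where the second cost in the post comes from.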
