I am working with a model where a location-dependent metric is required to efficiently sample the posterior distribution.
I have my own implementation of the simplified manifold MALA (smMALA) algorithm from Girolami and Calderhead (2011).
The covariance of the proposal distribution in my smMALA algorithm is proportional to the inverse Hessian of the posterior evaluated at the current parameters.
One issue I have come up against is that the Hessian is not always positive-definite, i.e., it sometimes has negative eigenvalues. As a result of this, the covariance of the proposal distribution is not valid.
A simple solution is to add a constant diagonal matrix to the Hessian, e.g. diag(-2*min(lambda) ), where lambda is the set of Hessian eigenvalues. This ensures that the covariance of the proposal distribution is positive-definite.
This solution generally has the effect of decreasing the overall variance of the proposal distribution, sometimes quite significantly.
Are there any better solutions to this problem? And does the Stan RHMC prototype address this issue at all?
The SoftAbs solution that is described in the link Micahel posted is to transform the eigenvalues,
r: lambda -> lambda * ( exp(alpha*lambda) + exp(-alpha*lambda) ) / ( exp(alpha*lambda) - exp(-alpha*lambda) )
= lambda / tanh(alpha * lambda)
Then form a regularized Hessian by taking the product of the original eigenvectors with the transformed eigenvalues,
H = Q * r(lambda) * Q'
Much nicer than the solution I suggested in my original post!
More technically the Hessian is not a proper metric (it transforms in the wrong ways but also it’s guaranteed to be positive definite as you note) and if you regularize then you have to ensure that the regularization is smooth to not corrupt the Hamiltonian trajectories used in Hamiltonian Monte Carlo. The experiments in the Girolami and Calderhead paper focused on problems with convex posterior densities for which the Hessian was sufficiently well-behaved, but this is not typically for the problems for which one would need to appeal to Riemannian methods in the first place.
The SoftAbs metric resolves the smoothness issues which makes it a valid candidate metric and hence serves as the basic for the prototypical Gaussian-Riemannian Hamiltonian Monte Carlo implemented hidden deep in the Stan source code for the moment.
also it’s not guaranteed to be positive definite