Hi, I have a log density with 30-50 dimensions. Its normalizing constant is intractable, but the density itself is proper (integrable). The density is fairly regular: unimodal and asymptotically normal. The gradient and Hessian are both well-defined and can be computed analytically. Evaluating the log density and gradient is not too expensive, and evaluating the Hessian by numerical differentiation is still feasible.
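For concreteness, by "evaluating the Hessian by numerical differentiation" I mean central finite differences of the analytic gradient. A minimal sketch (the quadratic log density here is a stand-in, not my actual model):

```python
import numpy as np

def finite_diff_hessian(grad, x, eps=1e-5):
    """Central finite differences of an analytic gradient -> Hessian."""
    d = len(x)
    H = np.zeros((d, d))
    for i in range(d):
        e = np.zeros(d)
        e[i] = eps
        H[:, i] = (grad(x + e) - grad(x - e)) / (2 * eps)
    return 0.5 * (H + H.T)  # symmetrize away finite-difference noise

# Stand-in example: log density -0.5 * x' A x has gradient -A x, Hessian -A.
A = np.array([[2.0, 0.5], [0.5, 1.0]])
grad = lambda x: -A @ x
H = finite_diff_hessian(grad, np.zeros(2))
```

This costs 2d gradient evaluations per Hessian, which is what makes it "still feasible" at 30-50 dimensions.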
To do inference with this density, I ran Stan's NUTS (10,000 warmup and 10,000 sampling iterations) and computed posterior probabilities for some hypotheses of interest (e.g. H0: beta = 0). Across many runs with different random seeds for the same model, the means of the draws are fairly stable, but the computed hypothesis probabilities are only stable to about the 0.01 level. The kernel density estimates of the NUTS draws also vary in quality: roughly 50% of the density plots look close to a normal distribution (as my model is asymptotically normal), while the others have weird shapes that deviate from normality.
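Incidentally, I suspect the run-to-run wobble at the 0.01 level is at least partly plain Monte Carlo error: a probability p estimated from n_eff effective draws has standard error sqrt(p(1-p)/n_eff). A quick illustration with assumed numbers (not my actual model):

```python
import numpy as np

def mc_se(p, n_eff):
    """Monte Carlo standard error of a probability estimated from n_eff draws."""
    return np.sqrt(p * (1 - p) / n_eff)

# With p ~ 0.05 and only ~1000 effective draws (autocorrelation can easily
# shrink 10000 raw draws this much), the SE is already ~0.007, so seed-to-seed
# differences at the 0.01 level are exactly what one would expect.
se = mc_se(0.05, 1000)
```

So the instability I see may say more about effective sample size than about NUTS itself.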
As far as I know, NUTS is gradient based, sharing some similarities with gradient descent in optimization. There is another class of Newton and quasi-Newton optimization methods that use both the gradient vector and the Hessian matrix. Using the Hessian seems to eliminate the need to choose a proper step size, a choice that is difficult and error-prone.
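To illustrate what I mean about the step size: on a quadratic objective a Newton step needs no step size and lands on the optimum in one iteration, while gradient descent must pick a learning rate (toy example, unrelated to my model):

```python
import numpy as np

# Minimize f(x) = 0.5 * x' A x - b' x  (gradient: A x - b, Hessian: A).
A = np.array([[3.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, -2.0])
x = np.zeros(2)

grad = A @ x - b
newton_x = x - np.linalg.solve(A, grad)  # one Newton step, no step size needed
gd_x = x - 0.1 * grad                    # gradient descent must choose a rate

exact = np.linalg.solve(A, b)            # true minimizer, for comparison
```

The Newton step hits `exact` in one shot; the gradient-descent iterate only moves partway and its progress depends entirely on the chosen rate.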
So I wonder: are there any Hessian-based Monte Carlo sampling methods that do not need to tune or adapt the step size, perhaps only needing to adapt the covariance of the multivariate Gaussian noise? Maybe for relatively regular densities and a modest number of dimensions like mine, using the Hessian for sampling could improve inference accuracy at acceptable computational cost?
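For what it's worth, the closest thing I've come across is Hessian-preconditioned (simplified manifold) MALA, where the negative inverse Hessian plays the role of the proposal covariance. A toy sketch on a Gaussian target, just to show the proposal shape; note a scalar step scale `eps` is still present, so this doesn't fully remove tuning (all numbers are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy target: zero-mean Gaussian with covariance S (log density up to a const).
S = np.array([[2.0, 0.8], [0.8, 1.0]])
P = np.linalg.inv(S)                 # precision = -Hessian of the log density
logp = lambda x: -0.5 * x @ P @ x
grad = lambda x: -P @ x

M = np.linalg.inv(P)                 # preconditioner: (-Hessian)^{-1}
L = np.linalg.cholesky(M)
eps = 0.9                            # a scalar step scale is still needed

def log_q(xp, x):
    """Log proposal density q(xp | x) = N(xp; x + (eps^2/2) M grad, eps^2 M)."""
    r = xp - (x + 0.5 * eps**2 * M @ grad(x))
    return -0.5 * r @ P @ r / eps**2  # since (eps^2 M)^{-1} = P / eps^2

x = np.zeros(2)
draws = []
for _ in range(5000):
    xp = x + 0.5 * eps**2 * M @ grad(x) + eps * L @ rng.standard_normal(2)
    log_alpha = logp(xp) - logp(x) + log_q(x, xp) - log_q(xp, x)
    if np.log(rng.uniform()) < log_alpha:   # Metropolis-Hastings correction
        x = xp
    draws.append(x)
draws = np.asarray(draws)
```

The Hessian removes the need for a covariance/mass-matrix adaptation here, but as far as I can tell a step-size-like scalar survives in every method of this family, which is part of my question.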
Special thanks to the Stan developers!