Facebook just put out a paper on “Newtonian Monte Carlo” posting it here if anyone has an opinion on it. From the abstract:
We present Newtonian Monte Carlo (NMC), a method to improve Markov Chain Monte Carlo (MCMC) convergence by analyzing the first and second order gradients of the target density to determine a suitable proposal density at each point. Existing first order gradient-based methods suffer from the problem of determining an appropriate step size. Too small a step size and it will take a large number of steps to converge, while a very large step size will cause it to overshoot the high density region. NMC is similar to the Newton-Raphson update in optimization where the second order gradient is used to automatically scale the step size in each dimension. However, our objective is not to find a maxima but instead to find a parameterized density that can best match the local curvature of the target density. This parameterized density is then used as a single-site Metropolis-Hastings proposal.
As a further improvement on first order methods, we show that random variables with constrained supports don’t need to be transformed before taking a gradient step. NMC directly matches constrained random variables to a proposal density with the same support thus keeping the curvature of the target density intact.
We demonstrate the efficiency of NMC on a number of different domains. For statistical models where the prior is conjugate to the likelihood, our method recovers the posterior quite trivially in one step. However, we also show results on fairly large non-conjugate models, where NMC performs better than adaptive first order methods such as NUTS or other inexact scalable inference methods such as Stochastic Variational Inference or bootstrapping.
I’m bad at math, but it looks like they need the analytical derivatives for each proposal distribution but can still just use autodiff for the models log prob calculation?