Purpose of log_sum_weight


#1

In the file base_nuts.hpp I see a lot of occurrences of the expression

log_sum_weight = math::log_sum_exp(log_sum_weight, H0 - h);

It is used in the recursive construction of the binary trees and is initialized to 0.

double log_sum_weight = 0; // log(exp(H0 - H0))

I think this has to do with Neal’s window method, and that we are trying to calculate the acceptance probability of the subtree taken as a “window”.
However, exp(H0 - h) = pi(h)/pi(H_0), and as far as I can tell adding this term along the trajectory gives log(exp(h_0 - h_1) + exp(h_1 - h_2) + … + exp(h_{k-1} - h_k)),
where h_0, …, h_k are the states in the trajectory, i.e. the leaves of the subtree.
I don’t see how this gives log(exp(-h_0) + … + exp(-h_k)), which is what we need to calculate the acceptance probability of the subtree. What am I missing?


#2

This is one for @betanalpha — he’s been traveling a lot, but maybe including an @ with his name will get his attention. Usually it helps to put something more informative in the title, like “base_nuts.hpp” or “MCMC sampler” or something.


#3

The current dynamic HMC algorithm is described in detail in https://arxiv.org/abs/1701.02434, especially the appendix. In particular, Stan does not apply a Metropolis acceptance procedure; instead it samples from all of the states in a completed trajectory with probability proportional to w_n = exp(-H(q_n, p_n)).
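To make the weights concrete, here is a minimal sketch of what the accumulation quoted in the question computes. It is not Stan's implementation: `log_sum_exp` below is a hand-written stand-in for `math::log_sum_exp`, and `accumulate_log_sum_weight` and `selection_probabilities` are hypothetical helper names. Starting from 0 = log(exp(H0 - H0)) and folding in H0 - h_n for each state yields log(sum_n exp(H0 - h_n)) = H0 + log(sum_n exp(-h_n)). The constant H0 cancels once the weights are normalized, so the shift only helps numerical stability.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Stand-in for a two-argument log_sum_exp: log(exp(a) + exp(b)),
// computed by factoring out the larger argument to avoid overflow.
double log_sum_exp(double a, double b) {
  const double m = std::max(a, b);
  return m + std::log(std::exp(a - m) + std::exp(b - m));
}

// Accumulates log(sum_n exp(H0 - h_n)) over a trajectory whose first
// state has energy H0 (energies[0] == H0), using the same update as
// the line quoted in the question.
double accumulate_log_sum_weight(const std::vector<double>& energies) {
  assert(!energies.empty());
  const double H0 = energies.front();
  double log_sum_weight = 0;  // log(exp(H0 - H0))
  for (std::size_t n = 1; n < energies.size(); ++n) {
    log_sum_weight = log_sum_exp(log_sum_weight, H0 - energies[n]);
  }
  return log_sum_weight;
}

// Probability of selecting each state, proportional to exp(-h_n).
// The H0 shift cancels in the ratio exp(H0 - h_n) / sum_m exp(H0 - h_m).
std::vector<double> selection_probabilities(
    const std::vector<double>& energies) {
  assert(!energies.empty());
  const double H0 = energies.front();
  const double log_sum_weight = accumulate_log_sum_weight(energies);
  std::vector<double> probs;
  probs.reserve(energies.size());
  for (double h : energies) {
    probs.push_back(std::exp(H0 - h - log_sum_weight));
  }
  return probs;
}
```

Each term added to the accumulator is H0 - h_n with the same fixed H0, not a difference between consecutive states, which is why the sum does not telescope.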

The log_sum_weight aggregator is instead used for adaptation. We want the step size to be large enough so that we can simulate a trajectory without too many steps but small enough that the trajectory is reasonably accurate and the above weights don’t diverge. The optimal step size, however, will vary from iteration to iteration.

In https://arxiv.org/abs/1411.6669 we showed that for the naive static, Metropolized HMC algorithm, where you propose only the last state of a trajectory, there is a universal range of optimal average Metropolis acceptance probabilities, which allows us to tune the step size dynamically during adaptation. This analysis doesn’t immediately carry over to the dynamic HMC case, so we have to make the heuristic jump that it will work well enough given a reasonable proxy for the acceptance probability.

log_sum_weight is the log of the average acceptance probability over the states in the completed trajectory, if we were to hypothetically treat each state as a Metropolis proposal. This is the proxy we use for adaptation.
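As a rough illustration of such a proxy, the sketch below averages the hypothetical Metropolis acceptance probability min(1, exp(H0 - h_n)) over the states of a trajectory. This is an assumption about the form of the proxy, not a copy of the code in base_nuts.hpp; `mean_metropolis_accept_prob` is a hypothetical name, and the energies are passed in as a plain vector for simplicity.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: mean over states of min(1, exp(H0 - h_n)), i.e.
// the acceptance probability each state would have if it were proposed
// from the initial state (energy H0) in a Metropolis step.
double mean_metropolis_accept_prob(double H0,
                                   const std::vector<double>& energies) {
  assert(!energies.empty());
  double sum = 0;
  for (double h : energies) {
    sum += std::min(1.0, std::exp(H0 - h));
  }
  return sum / static_cast<double>(energies.size());
}
```

A step size that is too large produces big energy errors h_n - H0, which pushes this average toward 0; a very small step size keeps it near 1 at the cost of more steps per trajectory.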