Non-centering for mean field variational inference


#1

I have a naive question about algorithms and centering. I understand why non-centering is useful for HMC, but I am curious whether the same rationale necessarily holds for mean-field ADVI, which as far as I understand simply ignores correlations among parameters. It seems obvious that the ELBO would be lower for the centered model than for an equivalent non-centered model, but does that imply that the maximization of the ELBO suffers?


#2

The same rationale helps. With non-centering, the posterior of the parameters is closer to independent normal, and if the posterior is close to independent normal, mean-field ADVI works just fine.

See Figure 5 and discussion in “Yes, but Did It Work?: Evaluating Variational Inference” https://arxiv.org/abs/1802.02538
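
To illustrate the point about non-centering, here is a minimal NumPy sketch. It is not taken from the paper: the funnel-shaped prior (`log_tau ~ Normal(0, 1)`, `theta | tau ~ Normal(0, tau)`), the variable names, and the helper `scale_dependence` are all assumptions made for illustration, and it looks at the prior rather than a real posterior (with weakly informative data the posterior tends to keep a similar shape). It shows that the scale of the centered `theta` depends strongly on `log_tau`, while the non-centered `theta_raw` does not.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical funnel-shaped hierarchical prior (assumed for illustration):
#   log_tau ~ Normal(0, 1),  theta | tau ~ Normal(0, tau)
log_tau = rng.normal(0.0, 1.0, size=n)

# Non-centered parameter: a standard normal drawn independently of log_tau.
theta_raw = rng.normal(0.0, 1.0, size=n)

# Centered parameter obtained through the deterministic transform
# theta = tau * theta_raw.
theta = np.exp(log_tau) * theta_raw


def scale_dependence(a, b):
    """Correlation between a and log|b|: a crude measure of how strongly
    the scale of b depends on a."""
    return np.corrcoef(a, np.log(np.abs(b)))[0, 1]


dep_centered = scale_dependence(log_tau, theta)
dep_noncentered = scale_dependence(log_tau, theta_raw)

print(f"centered:     {dep_centered:.2f}")
print(f"non-centered: {dep_noncentered:.2f}")
```

The centered value comes out clearly positive and the non-centered value near zero. A factorized normal approximation has no way to represent the first kind of dependence, which is why the non-centered geometry suits mean-field ADVI better.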


#3

Thanks for the reference & comments. I will search arXiv next time before posting here 😅 (an idea for a Discourse plugin, perhaps).

In the lower-left panel of Figure 5, it seems that the centered and non-centered variants combined cover more of the parameter space covered by NUTS than either one alone. If I understand correctly, this is irrelevant, because what matters is that the non-centered variant is closer to the true posterior?


#4

That figure is a bit overcrowded, but yes, what matters is which one is closer to the true posterior, and with PSIS we can further correct the approximation when estimating various expectations.
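
As a rough sketch of what correcting expectations means here, the snippet below reweights draws from an approximation by importance ratios. Note that it uses plain self-normalized importance sampling and omits the Pareto smoothing step that PSIS adds on top. The 1-D target `Normal(0, 1)`, the approximation `Normal(0.3, 0.9)`, and all names are hypothetical choices for illustration, not anything from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Hypothetical 1-D setup (assumed for illustration):
#   target p = Normal(0, 1), approximation q = Normal(0.3, 0.9)
q_mean, q_sd = 0.3, 0.9


def log_p(x):
    # Unnormalized log density of the target.
    return -0.5 * x**2


def log_q(x):
    # Log density of the approximation, up to an additive constant.
    return -0.5 * ((x - q_mean) / q_sd) ** 2 - np.log(q_sd)


# Draws from the approximation, standing in for ADVI output.
draws = rng.normal(q_mean, q_sd, size=n)

# Self-normalized importance weights, computed stably on the log scale.
# PSIS would additionally smooth the largest weights; that is omitted here.
log_w = log_p(draws) - log_q(draws)
log_w -= log_w.max()
w = np.exp(log_w)
w /= w.sum()

raw_mean = draws.mean()                       # biased toward q's mean
is_mean = np.sum(w * draws)                   # reweighted estimate of E_p[x]
is_var = np.sum(w * (draws - is_mean) ** 2)   # reweighted estimate of Var_p[x]

print(f"raw mean: {raw_mean:.3f}")
print(f"reweighted mean: {is_mean:.3f}, reweighted variance: {is_var:.3f}")
```

The reweighting only helps when the approximation is reasonably close to the target. If it is too far off, a few weights dominate the sum, which is the situation the Pareto smoothing and its diagnostic are meant to detect.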