Correlated 2D Gaussian breaks ADVI

anon75146577 · September 28, 2017, 8:03pm

I’m not sure why people are surprised by this. For mean-field Gaussian, your approximating family is a product of independent Gaussians on the two axes, which, for example, can’t approximate a narrow Gaussian concentrated around the line y=x.
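To put a number on that, here is a minimal sketch that assumes a zero-mean target with unit marginal variances and correlation `rho = 0.99` (my choice for illustration, not the model from the original question). For a Gaussian target, the factorized Gaussian minimizing KL(q || p) has variances equal to the reciprocal of the diagonal of the target's precision matrix, so no optimizer is needed. The function name `mean_field_variances` is made up for this example, and this is the exact optimum rather than what ADVI's stochastic optimizer would return on any given run.

```python
import numpy as np

def mean_field_variances(cov):
    """Variances of the factorized Gaussian q that minimizes KL(q || p)
    when the target is p = N(0, cov)."""
    precision = np.linalg.inv(cov)
    # The optimal factor for each axis has variance 1 / precision[i, i].
    return 1.0 / np.diag(precision)

rho = 0.99  # assumed correlation, chosen to mimic a ridge along y = x
cov = np.array([[1.0, rho],
                [rho, 1.0]])

mf_var = mean_field_variances(cov)
print("true marginal variances:", np.diag(cov))
print("mean-field variances:   ", mf_var)
```

In this setup the mean-field variances come out as 1 - rho^2, about 0.02 on each axis, against true marginal variances of 1. The approximation sits at the right centre but is a small round blob instead of a long ridge.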

For the full-rank one, I’d expect it to be in the correct place, but the covariance matrix to be too “concentrated”. This is because the KL divergence is an asymmetric measure of “distance” between two probability distributions, and in the direction used for VI, KL(q‖p) with q the approximation, it penalises approximations that are too diffuse far more fiercely than approximations that are too concentrated. This leads to a systematic underestimation of variance with VB methods.
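The asymmetry is easy to see in one dimension. The sketch below is a toy comparison, not an ADVI run: it assumes a standard normal target and evaluates the closed-form KL(q || p) for a zero-mean Gaussian q that is either half as wide or twice as wide as the target. The helper name `kl_gauss` is hypothetical.

```python
import math

def kl_gauss(sigma_q, sigma_p=1.0):
    """KL(N(0, sigma_q^2) || N(0, sigma_p^2)) in nats."""
    ratio = (sigma_q / sigma_p) ** 2
    return 0.5 * (ratio - 1.0 - math.log(ratio))

kl_narrow = kl_gauss(0.5)  # q is too concentrated by a factor of 2
kl_wide = kl_gauss(2.0)    # q is too diffuse by a factor of 2

print(f"too narrow: {kl_narrow:.3f} nats")
print(f"too wide:   {kl_wide:.3f} nats")
```

Being off by the same factor costs roughly 0.32 nats on the narrow side and 0.81 nats on the wide side, so when the family can't match the target exactly, the optimizer prefers to err on the narrow side.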

tl;dr: VB doesn’t really work, but might get you a central point quickly. Sometimes.
