I was looking more into ADVI (and taking a shot at breaking it up into more sensible pieces…) as part of the output refactor. It struck me that I can’t tell from the ADVI paper why they rely on stochastic gradient ascent (and as an aside, why we don’t just implement stochastic gradient ascent as a …

Why does ADVI use stochastic gradient ascent not LBFGS

sakrejda June 29, 2018, 6:12pm 3

Thanks that’s very helpful. I must’ve missed this in the algorithm just b/c it calls the same gradient calc as everything else. I’ll have to spend more time with it.

Topic		Replies	Views
How does ADVI select mini-batches for stochastic gradient ascent? Algorithms	3	547	April 7, 2021
ADVI / Stochastic quasi-Newton methods Algorithms variational-bayes	5	1332	September 22, 2017
Does ADVI stepsize choice have the same problem with its stopping rule? Algorithms	5	600	December 21, 2020
ADVI / Rats example / Adagrad Algorithms variational-bayes	1	1097	September 17, 2017
Stan Variational Inference Deprecation and Documentation Developers fitting-issues	2	161	April 25, 2025

Why does ADVI use stochastic gradient ascent not LBFGS

Related topics