Variational Bayes runtime and memory usage

lauderdale · January 11, 2018, 11:29am

I am using vb() to fit a large model. The model fits just fine; however, rstan takes a very long time between when the console reports that it has drawn a sample from the approximate posterior and printed “COMPLETED” and when it actually finishes. During that time, which has been as long as an hour, there is intense CPU activity and very large spikes in memory usage (I noticed because my computer seized up for a bit when the memory pressure became severe). What is Stan doing at this point? Why does it suddenly require far more memory than at any previous point in the vb() function? I am drawing only a tiny posterior sample of a small number of parameters, so that cannot be what is using more than the 32 GB of memory that I have in my computer.

Gianluca · January 12, 2018, 2:52am

I’m not sure if this is related, but I’ve noticed the same thing when fitting a large model using NUTS in pystan. Is it perhaps due to Stan computing some summary statistics on the posterior samples?

bgoodri · January 13, 2018, 1:21am

It reads off the disk, but I don’t think the I/O should be that noticeable.

JulianK · January 14, 2018, 7:44am

I have encountered this before also.

After a bit of experimentation I concluded (perhaps incorrectly) that this occurs when you have large numbers of parameters.

The ADVI algorithm has computational complexity O(N*p) (I think), where N is the number of data points and p is the number of parameters, whilst the generation of samples has complexity O(p^3) due to the Cholesky decomposition required.

If you have a large number of parameters the runtime for the second operation (what you describe) can be substantial.
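
For reference, these are the standard operation counts behind that comparison, assuming a Gaussian approximation over p parameters with mean \mu and either a dense covariance \Sigma or a vector of standard deviations \sigma. S (the number of draws) and \eta (a standard normal vector) are symbols introduced here for illustration. Whether Stan performs a factorization at this stage is not established in the thread.

```latex
\begin{align*}
\text{Cholesky factorization } \Sigma = L L^\top
  &:\quad \tfrac{1}{3}p^3 + O(p^2) \text{ flops, done once} \\
\text{dense draws } \theta^{(s)} = \mu + L\,\eta^{(s)}
  &:\quad O(S\,p^2) \\
\text{mean-field draws } \theta^{(s)} = \mu + \sigma \odot \eta^{(s)}
  &:\quad O(S\,p)
\end{align*}
\qquad \eta^{(s)} \sim \mathcal{N}(0, I_p),\quad s = 1, \dots, S
```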

Julian

Bob_Carpenter · January 18, 2018, 6:55am

That makes sense. Isn’t it drawing from a multivariate normal approximate posterior? That should be cheap if it’s a diagonal matrix (mean field), but will involve more expensive matrix multiplies if it’s dense.
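
The difference between the two cases can be sketched in a few lines of plain Python. This is only an illustration of drawing from a multivariate normal, not Stan's implementation: the function names `draw_meanfield`, `draw_dense`, and `dense_factor_bytes` are hypothetical, and the dense case assumes a lower-triangular factor L of the covariance is already available.

```python
import random

def draw_meanfield(mu, sigma, rng):
    """One draw from N(mu, diag(sigma^2)): O(p) work and O(p) storage."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]

def draw_dense(mu, chol, rng):
    """One draw from N(mu, L L^T), given lower-triangular L as a list of rows.

    The matrix-vector product costs O(p^2) per draw.
    """
    p = len(mu)
    eta = [rng.gauss(0.0, 1.0) for _ in range(p)]
    return [mu[i] + sum(chol[i][j] * eta[j] for j in range(i + 1))
            for i in range(p)]

def dense_factor_bytes(p):
    """Bytes needed to store a full p x p matrix of 8-byte doubles."""
    return 8 * p * p

rng = random.Random(1)
mu = [0.0, 1.0, 2.0]
sigma = [1.0, 0.5, 2.0]
chol = [[1.0, 0.0, 0.0],
        [0.3, 0.5, 0.0],
        [0.1, 0.2, 2.0]]

print(draw_meanfield(mu, sigma, rng))
print(draw_dense(mu, chol, rng))
print(dense_factor_bytes(100_000) / 1e9, "GB")  # 80.0 GB
```

The last line shows why the dense case matters for memory as well as time: a single dense matrix over 100,000 parameters would already exceed 32 GB, whereas the mean-field case only stores two vectors of length p.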

lauderdale · January 18, 2018, 10:26am

That makes sense, but if that is true, the message to the console is misleading. The lengthy wait comes after the console prints “COMPLETED”. So either that message is being printed at the wrong time, or it is not the draws from the approximate posterior that are taking a long time.

bgoodri · January 18, 2018, 3:46pm

Yeah, “COMPLETED” in this context refers to the optimization, not the draws.

JulianK · January 18, 2018, 7:50pm

Is it worth checking that the draws aren’t doing a Cholesky decomposition or something else crazy for the mean field approximation?

I don’t know the code but maybe there’s a bug here.

avehtari · January 19, 2018, 12:59pm

I checked the code. In meanfield there is no accidental Cholesky. All the draws are generated and written to output between the messages “Drawing a sample of size … from the approximate posterior” and “COMPLETED”, and this was reported to be fast. I couldn’t yet figure out what happens after “COMPLETED” has been printed.

avehtari · January 19, 2018, 1:02pm

In advi.hpp logger.info("COMPLETED."); is after the draws.

Krzysztof_Sakrejda · January 19, 2018, 5:25pm

This sounds like a bug, but it’s not going to get anywhere without a reproducible example. Do you have one you can share?

avehtari · January 19, 2018, 6:36pm

@lauderdale Do you have a generated quantities block? Can you show your code?

bgoodri · January 19, 2018, 6:40pm

Or if you don’t want to show the code to the whole internet, have Doug ping me.

lauderdale · January 23, 2018, 10:09am

There is no generated quantities block. I will try to get a reproducible example up here soon, but at the moment the model is having convergence problems with vb().
