Extract Variational Posterior estimates

Good morning,

How do I extract only the variational posterior estimates obtained after running variational inference?

library(rstan)
m <- stan_model(model_code = 'parameters {real y;} model {y ~ normal(0,1);}')
f <- sampling(m)
fit_samples <- extract(f)
fit_samples$y[1]

In the code chunk above, I extract the samples obtained with an MCMC using the variational proposal. My question is how to access the mean and the covariance of this independent proposal, i.e., the outputs of the variational inference algorithm run in the first place.

Thanks a lot for this
Belhal

The vb function does not return the covariance matrix. You have to work with the draws produced by extract.
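
A minimal sketch of working with the draws, assuming they have been collected into a matrix with one row per draw and one column per parameter. The helper name draw_moments is hypothetical, and the simulated draws below only stand in for what extract would return after vb.

```r
# Hypothetical helper: empirical mean vector and covariance matrix of a
# matrix of draws (rows = draws, columns = parameters).
draw_moments <- function(draws) {
  draws <- as.matrix(draws)
  list(mean = colMeans(draws), cov = stats::cov(draws))
}

# Stand-in draws for the toy model y ~ normal(0, 1). With rstan, one might
# instead build the matrix from a vb fit, for example:
#   fit   <- vb(m)                              # m is a compiled stanmodel
#   draws <- cbind(y = extract(fit, pars = "y")$y)
set.seed(1)
draws <- cbind(y = rnorm(1000, mean = 0, sd = 1))

moments <- draw_moments(draws)
moments$mean  # estimate of the mean of the approximation
moments$cov   # estimate of its (co)variance
```

These are Monte Carlo estimates, so their accuracy depends on the number of draws requested from vb.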

Thank you.
What about the variational posterior mean estimate?

Also, you are talking about the function vb, but I am using the function sampling, where the variational inference is done internally before sampling from the variational proposal. Does the sampling function output the variational posterior mean and variance?

Thanks a lot

The sampling function does not do variational inference at all. I think you are confusing Stan with PyMC3. The best way to estimate the posterior mean vector is with the means of the transformed draws from the normal distribution at the optimum.

Got it. Yes, I will use the vb function then.

What you suggest would actually give me the mean and the variance of the true posterior distribution whereas what I want is the mean and variance output by the variational inference (obtained by optimizing the ELBO).

In the simple case above, I agree that both coincide, but if you imagine a complicated true posterior (not Gaussian), then I could not estimate the variational posterior mean by the mean of the MCMC draws…

It should be easy to extract the parameters obtained by variational inference and used as a proposal to draw samples in the MCMC, don’t you think?

Thanks

It is not easy to extract them because they are not returned by the C++. It would not be that difficult to have the C++ return them, but no one is working on ADVI (although some are working on ADVI diagnostics and PSIS). And I don’t know of anyone who would be willing to refactor the ADVI code in order to get out the means and (co)-variances of the variational parameters because those are not useful for approximating posterior distributions.

I agree. Thank you.

Last question: I am not sure about the output of the vb method. Can you confirm that vb returns samples drawn from the posterior approximation (the Gaussian candidate used in the KL optimization)?

If yes, then you were right from the beginning: I can calculate rough estimates of the mean and variance from these draws.

Thanks a lot for your patience Ben.

ADVI does ELBO maximization in the unconstrained space, drawing from a multivariate normal (diagonal in the case of meanfield) distribution in the unconstrained space, and transforming to the parameter space defined in the parameters block (in addition to anything in transformed parameters or generated quantities). What comes out is the latter.
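
The draw-then-transform step can be sketched in base R. This is only an illustration, not Stan's implementation: the variational means mu and standard deviations omega are made-up values, and the two parameters (a positive sigma and a theta in (0, 1)) are hypothetical.

```r
set.seed(42)
n_draws <- 1000

# Hypothetical variational parameters on the unconstrained scale.
mu    <- c(log_sigma = 0.5, logit_theta = -1.0)  # means
omega <- c(log_sigma = 0.3, logit_theta = 0.8)   # sds (meanfield: diagonal covariance)

# Draw from the diagonal normal in the unconstrained space.
z <- sapply(names(mu), function(k) rnorm(n_draws, mu[[k]], omega[[k]]))

# Transform each draw to the constrained parameter space:
# exp() for a lower bound of zero, inverse logit for the unit interval.
draws <- cbind(sigma = exp(z[, "log_sigma"]),
               theta = plogis(z[, "logit_theta"]))

head(draws)  # what the user sees: draws on the constrained scale
```

Only the matrix draws corresponds to what is reported back to the user; mu and omega stay on the unconstrained scale.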

The whole ADVI code situation is a mess. We’re on the verge of just removing it from Stan altogether because nobody has time to clean it up.

I don’t think it’s that bad. I did read the code and @yuling was able to make some changes. We now have a proper diagnostic for ADVI, which is currently implemented for CmdStan in @yuling’s branch. @yuling had a Stan development branch installation problem unrelated to ADVI and asked for help, but I think it hasn’t been solved yet (any update on that, @yuling?). As soon as that is solved, @yuling will make a pull request and we might have the diagnostic in before StanCon. Already with the current ADVI, 28% of 234 test models work well, and the diagnostic has given us additional insight to improve ADVI. ADVI will never give NUTS-level accuracy on 100% of models, but there will be many cases where it will be useful.

If the ADVI code needs further cleaning, we can work on it while both @yuling and @anon79882417 are at Aalto in August.

I was talking about the code, not the fitting. That does need a lot of cleanup in every aspect from factoring the code to documentation to testing to providing appropriate output formats that include all the relevant output.

Our MCMC code is much better organized, better tested (at the code level, not the statistics level), and better integrated with the interfaces.

I’m leaving the fitting issues and definition of what works well up to you et al.!

I don’t know of any outstanding installation problems for Stan. @yuling—are you still having installation issues?

I agree the ADVI code is not well organized, and in particular not completely compatible with the MCMC code. In my psis branch, the variational mean and sd are returned because they are passed on to PSIS. It can at least be called from CmdStan.

It is a little ambiguous because variational inference is conducted in the unconstrained space, so conceptually there are two quantities that behave like a posterior mean (and likewise for the sd): 1) the variational normal mean transformed back to the actual parameter space, which is biased; 2) the sample mean of the actual continuous parameters, which is unbiased (with respect to the variational distribution) but may suffer from high MC error. From the user’s side, the latter quantity is less important, as it can be calculated from the extracted draws.
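
A small base-R illustration of the difference between the two quantities, assuming a single positive parameter whose unconstrained value is normal with made-up mean mu and standard deviation sd_u. In this case the constrained parameter is lognormal, so the exact mean under the variational distribution is known in closed form.

```r
set.seed(123)
mu   <- 0   # hypothetical variational mean on the unconstrained (log) scale
sd_u <- 1   # hypothetical variational sd on the unconstrained scale

z <- rnorm(1e5, mu, sd_u)  # draws in the unconstrained space

# 1) Transform the variational mean back to the parameter space.
est_transformed_mean <- exp(mu)

# 2) Sample mean of the transformed draws.
est_sample_mean <- mean(exp(z))

# Exact mean of the constrained parameter under the variational
# distribution (mean of a lognormal).
exact_mean <- exp(mu + sd_u^2 / 2)

c(transformed_mean = est_transformed_mean,
  sample_mean      = est_sample_mean,
  exact            = exact_mean)
```

Here the transformed mean sits at exp(mu) regardless of sd_u, while the sample mean approaches the exact value as the number of draws grows.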

Does this mean Stan will stop supporting VI, or that Stan will keep supporting VI but the code needs to be refactored? Is there any plan for the VI part?

I have been using VI (meanfield normal) in Stan recently. I found it very easy to use, and the results look OK. Of course, it’s much faster than MCMC for my simple hierarchical model with thousands of parameters.