CmdStan 2.18


Can CmdStan 2.18 be used safely? Does it have stable variational inference? When will rstan 2.18 be released?


Yes, no, soon.

Thanks. Do you have plans for a stable variational inference? I have read a lot on ADVI, but so far only Python folks have implemented it.

Stable is a weird word here. We have an implementation of ADVI that can be called (and have had for a long time). We just don’t really think it does a good job approximating the posterior in most cases. We wrote a paper about some tests we did.

I personally do not recommend using ADVI (regardless of where it’s implemented), and I don’t really think PyMC3’s strategy of using it to initialize chains is a good one.

Well, by stable I mean working. I have a Bayesian neural network and am trying to run Neal’s code, the Stan equivalent, and variational inference on the same data. Neal’s code takes quite a while to run, and so does Stan. I thought variational inference would give similar results. However, I wasn’t able to run it from rstan 2.17: it simply broke after a few iterations or at startup. I thought this had been improved in CmdStan 2.18.

I don’t think anything changed with ADVI between 2.17 and 2.18.

Do they still do that? I thought they had stopped using it.

In @yuling’s test of 230+ models it worked for 28% of models, but specifically it works for models with far fewer parameters than neural networks have, and it’s very likely to fail on multimodal posteriors. Our criteria for what counts as working variational inference are much stricter than what is usually used in machine learning, so ML papers and blog posts may say VI works in a case where we would say that it doesn’t. See the paper @anon75146577 mentioned for more info.

OK. The Python NN code didn’t bring me any luck. My suspicion is that it is good for small problems but not for the relatively large one (500 inputs) I have. I was hoping that Stan would provide a decent (expected) prediction. I will still try to run Stan’s variational inference, but as you said, I should be prepared for it not to give the expected results.

What do you mean by stable?

It’s implemented in Stan. Alp Kucukelbir, the developer of ADVI, implemented it with Dustin Tran’s help and the help of the rest of our integration team.

I’d be curious how the PyMC3 implementation or others differ.

We’re working on the adaptation to make the optimization part of it more stable. Then there’ll be the long hard road of figuring out which transforms and parameterizations work well under the multivariate normal unconstrained parameterization we use.
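
For anyone following along, here is a rough sketch of what "a normal approximation on the unconstrained scale, fit by stochastic optimization" means in practice. It is not Stan’s implementation. The toy target (a Gamma(4, 2) density on a positive parameter, log-transformed to the unconstrained scale), the function names, the step-size schedule, and the number of draws are all assumptions made up for illustration. It only shows the reparameterization-gradient idea in one dimension.

```python
import math
import random


def grad_log_density(theta, a=4.0, b=2.0):
    # Hypothetical toy target: sigma ~ Gamma(a, b), with theta = log(sigma).
    # Including the log-Jacobian of the transform,
    # log p(theta) = a * theta - b * exp(theta) + const,
    # so the gradient with respect to theta is a - b * exp(theta).
    return a - b * math.exp(theta)


def fit_meanfield_gaussian(n_iters=3000, n_draws=20, step=0.05, seed=1):
    # Fit q(theta) = Normal(mu, exp(omega)^2) by stochastic gradient ascent
    # on a Monte Carlo estimate of the ELBO gradient.
    rng = random.Random(seed)
    mu, omega = 0.0, 0.0
    for t in range(1, n_iters + 1):
        g_mu, g_omega = 0.0, 0.0
        for _ in range(n_draws):
            eps = rng.gauss(0.0, 1.0)
            theta = mu + math.exp(omega) * eps  # reparameterized draw from q
            g = grad_log_density(theta)
            g_mu += g
            g_omega += g * eps * math.exp(omega)
        g_mu /= n_draws
        # The +1 is the gradient of the Gaussian entropy with respect to omega.
        g_omega = g_omega / n_draws + 1.0
        lr = step / math.sqrt(t)  # simple decaying step size (an assumption)
        mu += lr * g_mu
        omega += lr * g_omega
    return mu, math.exp(omega)


if __name__ == "__main__":
    mu, sd = fit_meanfield_gaussian()
    print(f"mu = {mu:.3f}, sd = {sd:.3f}")
```

For this particular target the ELBO can be maximized by hand, giving mu = log(a/b) - 1/(2a) and sd = 1/sqrt(a), so the stochastic estimate can be checked against a known answer. The step size was hand-picked for this one-dimensional toy, and getting that choice right automatically is the adaptation problem mentioned above.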