This topic is a follow-up to the discussion in today’s Stan weekly meeting. What I am interested in is the general shape of the credible interval (CI) of posterior samples in the unconstrained space. Is it reasonable to assume that, in most cases, we can find a hyper-ellipse in the unconstrained space that covers the majority of the posterior samples and is itself mostly filled by them?
Let’s take dimension = 2 as an example. In the following three plots, black circles represent “posterior samples” and the red ellipse is estimated from the sample covariance. The “posterior samples” in those plots are artificial and the plots are just for illustration. According to the plots, the first two examples satisfy the assumption, while the third example obviously violates it.
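One rough way to check the assumption numerically is to compare the nominal level of the covariance ellipse with the fraction of samples that actually fall inside it. The sketch below is only an illustration: the function name `ellipse_coverage` and both artificial sample sets are assumptions made here, not the data behind the plots. It is restricted to 2-D, where the chi-square quantile has a closed form.

```python
import numpy as np

def ellipse_coverage(samples, level=0.95):
    """Fraction of 2-D samples inside the sample-covariance ellipse at `level`."""
    samples = np.asarray(samples, dtype=float)
    centered = samples - samples.mean(axis=0)
    cov = np.cov(samples, rowvar=False)
    # Squared Mahalanobis distance of every sample to the sample mean.
    d2 = np.sum(centered * np.linalg.solve(cov, centered.T).T, axis=1)
    # Chi-square quantile with 2 degrees of freedom: -2 * log(1 - level).
    threshold = -2.0 * np.log(1.0 - level)
    return float(np.mean(d2 <= threshold))

rng = np.random.default_rng(0)

# Artificial "posterior samples" (made up for illustration).
gauss = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=5000)
x = rng.normal(size=5000)
curved = np.column_stack([x, x**2 + 0.3 * rng.normal(size=5000)])

for name, pts in [("gaussian", gauss), ("curved", curved)]:
    for level in (0.5, 0.95):
        print(name, level, round(ellipse_coverage(pts, level), 3))
```

For the Gaussian samples the empirical coverage should track the nominal level at every level. For the curved samples it does not, which is one symptom of the ellipse failing to describe the shape. Note that this only checks how many samples the ellipse covers, not how much of the ellipse is empty.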
I am wondering what types of models might have a boomerang-shaped CI in the unconstrained space. What might lead to a violation of the assumption? Is it common in Bayesian modeling?
Perhaps a more interesting question would be: how often does a variational Bayes method using a Gaussian approximation fail in practice? Of course, a model that can be well approximated by a Gaussian also has a 95% CI of posterior samples with an elliptical shape. So the assumption I am interested in should be more general than the condition for getting reliable results from a variational Bayes method like ADVI.
Any thoughts, ideas or interesting examples are welcome. Thanks!
Hi Lu, I’m afraid I can’t help much with your actual question, but might I suggest you call this final example shape a boomerang rather than a banana, because boomerangs are more fun and are longer than bananas - just like that shape?!
Thanks, I am more interested in boomerangs. But since I am working on a fast adaptation algorithm, I think it is always good to know about tricky models. I know the 8-schools example, and I have some examples with multimodal posteriors.
Sorry for the long delay. In case it’s still interesting, one fairly straightforward class of boomerangs involves models that are close to multivariate normal on the constrained scale (especially if they are a long and skinny multivariate normal), but that are declared with a constraint that requires a highly curved unconstraining transformation. For example
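As a minimal sketch of this class of models (an illustration with made-up numbers, not necessarily the specific example intended above), assume two parameters that are each declared with a lower bound of zero, so that the unconstraining transform is the log. The constrained-scale "posterior" is a long, skinny bivariate normal that reaches close to the boundary. The helper `curvature_score` is a hypothetical diagnostic introduced here: the correlation between the sum of the coordinates and the centered, squared difference, which is zero in expectation for any Gaussian cloud.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical constrained-scale posterior: long, skinny bivariate normal
# centred at (1, 1) with strong negative correlation. Numbers are made up.
sigma, rho = 0.3, -0.995
cov = sigma**2 * np.array([[1.0, rho], [rho, 1.0]])
draws = rng.multivariate_normal([1.0, 1.0], cov, size=20000)

# Assumed constraint: both parameters have a lower bound of zero,
# so keep only the draws that satisfy it.
constrained = draws[(draws > 0).all(axis=1)]

# Unconstraining transform for a lower bound of zero is the log.
unconstrained = np.log(constrained)

def curvature_score(xy):
    """Correlation of (a + b) with the centered, squared (a - b).

    Third central moments of a Gaussian vanish, so this is near zero for
    Gaussian samples and far from zero for a bent, boomerang-like cloud.
    """
    s = xy[:, 0] + xy[:, 1]
    d = xy[:, 0] - xy[:, 1]
    return float(np.corrcoef(s, (d - d.mean()) ** 2)[0, 1])

print("constrained  :", curvature_score(constrained))
print("unconstrained:", curvature_score(unconstrained))
```

On the constrained scale the samples hug the line a + b = 2, so the score stays near zero. After the log transform that line becomes the curve u + v = -2 log cosh((u - v) / 2), and the score turns strongly negative, which is the boomerang.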
I think this is a simple but beautiful example. It shows that the posterior distribution of parameters defined through a nonlinear transformation of a Gaussian can be far from Gaussian. Such transformations are common in Bayesian modeling, so I guess there will be more examples. Thank you so much!
@Lu.Zhang where can I read more about this? One idea that I had some time ago was that functional quantiles might be useful for approximating the CI in variational methods. But I need to develop the idea a bit further.