Estimating normalizing constant of posterior

Hi all,
For some ongoing research on variational inference, I’m interested in getting an estimate of the entropy of the posterior,

\mathcal H(p) = - \mathbb E \log p(z \mid x) = - \mathbb E \log p(z, x) + \log p(x).

The first term on the R.H.S. can be estimated via MCMC. For the normalizing constant, a bit of reading led me to methods such as bridge sampling. There is an R package that conveniently builds on top of rstan (bridgesampling: An R Package for Estimating Normalizing Constants | Journal of Statistical Software) and offers different bridge sampling methods. Another path, which I've dug into less, is SMC methods.
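To make the decomposition concrete, here is a small self-contained sketch (in Python rather than R, purely for illustration) on a conjugate Normal-Normal toy model where both the posterior entropy and log p(x) are available in closed form; the model parameters here are my own arbitrary choices:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Toy conjugate model: z ~ N(0, tau^2), x | z ~ N(z, sigma^2)
tau, sigma, x = 1.0, 0.5, 0.7

# Posterior z | x is N(m, s2)
s2 = 1.0 / (1.0 / tau**2 + 1.0 / sigma**2)
m = s2 * x / sigma**2

# Exact log marginal and exact posterior entropy
log_px = norm.logpdf(x, 0.0, np.sqrt(tau**2 + sigma**2))
H_exact = 0.5 * np.log(2 * np.pi * np.e * s2)

# Posterior draws (these would come from Stan in a real problem)
z = rng.normal(m, np.sqrt(s2), size=200_000)

# H(p) = -E[log p(z, x)] + log p(x)
log_joint = norm.logpdf(z, 0.0, tau) + norm.logpdf(x, z, sigma)
H_mc = -log_joint.mean() + log_px

print(H_mc, H_exact)
```

The Monte Carlo estimate of the first term plus the exact log p(x) recovers the Gaussian entropy; in a real problem the draws come from Stan, and log p(x) is exactly the quantity bridge sampling or SMC has to supply.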

I’m using cmdstanr, and unfortunately, converting the fitted model to an rstan object seems to break the bridge sampler. I believe this is related to this issue (Error when using on saved models · Issue #7 · quentingronau/bridgesampling · GitHub).

I wanted to ask in general if people had experience estimating normalizing constants after doing MCMC with Stan. How reliable is bridge sampling and is there a straightforward way to apply it on top of Stan?

Thank you for your help!



@alphillips did some of this in his PhD, but my broad conclusion was that I preferred SMC to bridge sampling. Does that help?

For the relatively simple models I explored in On the normalized power prior - PubMed, bridge sampling performed quite well.

Both bridge sampling and path sampling (GitHub - yao-yl/path-tempering: continuous tempering by path sampling) should be able to compute the normalizing constant. The latter would work with cmdstanr.
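For intuition on what path sampling computes, here is a minimal sketch (Python, not the path-tempering code linked above) of thermodynamic integration along a geometric path between a normalized base distribution q and the unnormalized joint, on a conjugate toy model where every tempered distribution happens to be Gaussian and can be sampled exactly; in a real model each temperature would require MCMC. All parameters are illustrative assumptions:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)

# Toy conjugate model: z ~ N(0, tau^2), x | z ~ N(z, sigma^2)
tau, sigma, x = 1.0, 0.5, 0.7
log_px_exact = norm.logpdf(x, 0.0, np.sqrt(tau**2 + sigma**2))

def log_joint(z):
    return norm.logpdf(z, 0.0, tau) + norm.logpdf(x, z, sigma)

# Geometric path p_t(z) propto q(z)^(1-t) p(z, x)^t between a
# normalized base q = N(a, b^2) and the unnormalized joint; then
# log p(x) = integral over t in [0, 1] of E_t[log p(z, x) - log q(z)]
a, b = 0.0, 1.0
prec_joint = 1.0 / tau**2 + 1.0 / sigma**2  # posterior precision in z
ts = np.linspace(0.0, 1.0, 41)
integrand = []
for t in ts:
    # For this toy model every tempered distribution is Gaussian, so
    # we can sample it exactly; a real model needs MCMC per temperature
    prec_t = (1 - t) / b**2 + t * prec_joint
    mean_t = ((1 - t) * a / b**2 + t * x / sigma**2) / prec_t
    z = rng.normal(mean_t, np.sqrt(1.0 / prec_t), 20_000)
    integrand.append(np.mean(log_joint(z) - norm.logpdf(z, a, b)))

# Trapezoidal rule over the temperature grid
integrand = np.array(integrand)
log_px_path = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(ts))
print(log_px_path, log_px_exact)
```

The estimate agrees with the closed-form log p(x) here; the practical cost in a real model is one MCMC run per grid point (or continuous tempering, as in the package above).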

One bonus for VI is that all of the methods you mentioned (bridge sampling, path sampling, and SMC) require some base distribution. The bridgesampling package uses a version of the Laplace approximation for this purpose. If you have already built a VI approximation, it could be used as a better base distribution, and thereby yield a more accurate estimate of p(y).
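As a concrete illustration of that point, here is a sketch of the Meng and Wong (1996) iterative bridge sampling estimator in Python on the same kind of conjugate toy model, with a hand-picked Gaussian standing in for the base distribution (in practice this would be the VI fit, and the posterior draws would come from Stan); all numbers are illustrative:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

# Toy conjugate model: z ~ N(0, tau^2), x | z ~ N(z, sigma^2)
tau, sigma, x = 1.0, 0.5, 0.7
post_var = 1.0 / (1.0 / tau**2 + 1.0 / sigma**2)
post_mean = post_var * x / sigma**2
log_px_exact = norm.logpdf(x, 0.0, np.sqrt(tau**2 + sigma**2))

def log_joint(z):
    # Unnormalized posterior: log p(z, x)
    return norm.logpdf(z, 0.0, tau) + norm.logpdf(x, z, sigma)

# N1 posterior draws (from Stan in practice) and N2 draws from the
# base distribution q; here q is a deliberately crude Gaussian,
# standing in for a VI approximation
N1 = N2 = 50_000
z1 = rng.normal(post_mean, np.sqrt(post_var), N1)
q_mean, q_sd = 0.3, 0.8
z2 = rng.normal(q_mean, q_sd, N2)

# Importance ratios l = p(z, x) / q(z) at both sample sets
l1 = np.exp(log_joint(z1) - norm.logpdf(z1, q_mean, q_sd))
l2 = np.exp(log_joint(z2) - norm.logpdf(z2, q_mean, q_sd))

# Iterative estimator with the optimal bridge function
s1, s2 = N1 / (N1 + N2), N2 / (N1 + N2)
r = 1.0  # running estimate of p(x)
for _ in range(100):
    num = np.mean(l2 / (s1 * l2 + s2 * r))
    den = np.mean(1.0 / (s1 * l1 + s2 * r))
    r = num / den

print(np.log(r), log_px_exact)
```

The closer q is to the posterior, the lower the variance of this estimator, which is exactly why a fitted VI approximation is an attractive base distribution.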


This is a great point. It shouldn’t be too difficult to modify that code.

To give some additional context: I’m studying the entropy loss when using VI (think multivariate generalization of variance shrinkage). Much of the analysis assumes the target is Gaussian. I would like to empirically check some results on non-Gaussian targets (e.g. horseshoe models) and ideally get an estimate of the target’s entropy.

I’m concerned about relying on a Gaussian proposal to estimate the entropy, since the whole point is to study the non-Gaussian case. I suppose there is no way around using a good proposal.

I agree this is a very interesting idea.