Approximating posterior expectations by likelihood expectations


This is more of a general statistical question, but addressing it may give some insight into Stan models, so I appeal to the statisticians in the forum. I realize that the term “likelihood expectations” that I used in the title is a misnomer since the likelihood is generally not even a pdf, but hopefully you get the idea (maybe a better term would be likelihood importance sampling). Basically, I wonder when the expression below will be a good approximation to posterior expectations:

E[\theta \mid y]\approx\frac{1}{ \sum_{i=1}^{N} p(y \mid \theta^{(i)})}\sum_{i=1}^{N} \theta^{(i)}p(y \mid \theta^{(i)})
\textrm{where }\theta^{(i)} \textrm{ are draws from the prior } p(\theta)\textrm{ and }p(y \mid \theta^{(i)})\textrm{ is the likelihood }

In other words, this weights prior draws by the likelihood function. I tested this in a very simple model with one parameter, and the estimates I get using the expression above are virtually the same as the true posterior estimates (both the mean and SD).
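For concreteness, here is a minimal sketch of that kind of check. It is not necessarily the model used in the test above: it assumes a conjugate normal model (prior theta ~ Normal(0, 1), observations y ~ Normal(theta, 1)) so that the exact posterior mean and SD are available for comparison. The data, seed, and number of draws are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 5 observations from Normal(theta, 1), simulated with theta = 0.5
y = rng.normal(0.5, 1.0, size=5)

# Draws from the assumed prior, theta ~ Normal(0, 1)
N = 100_000
theta = rng.normal(0.0, 1.0, size=N)

# Log-likelihood of the full data set at each prior draw (additive constants dropped,
# since they cancel when the weights are normalized)
log_lik = -0.5 * ((y[None, :] - theta[:, None]) ** 2).sum(axis=1)

# Normalized likelihood weights; subtracting the max avoids underflow in exp
w = np.exp(log_lik - log_lik.max())
w /= w.sum()

# Likelihood-weighted estimates of the posterior mean and SD
est_mean = np.sum(w * theta)
est_sd = np.sqrt(np.sum(w * (theta - est_mean) ** 2))

# Exact conjugate posterior for this model: precision = 1 (prior) + n (data)
post_var = 1.0 / (1.0 + len(y))
post_mean = post_var * y.sum()
post_sd = np.sqrt(post_var)

print(f"weighted: mean={est_mean:.4f}, sd={est_sd:.4f}")
print(f"exact:    mean={post_mean:.4f}, sd={post_sd:.4f}")
```

Working with log-likelihoods and normalizing after subtracting the maximum is only a numerical convenience; it computes the same ratio as the expression above.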

Any thoughts on when this approximation may fall apart? Thanks.

For finite N, this approximation will fall apart whenever there are insufficient prior draws within the region of “important” posterior probability to reliably characterize the posterior. Note that as the dimensionality of the parameter space increases, the proportion of prior draws that are in regions of high posterior probability quickly gets very small, even if the posterior is not much narrower than the prior.
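The dimensionality effect can be seen in a small simulation. The sketch below is only an illustration under assumed settings: a standard normal prior in d dimensions and a single hypothetical observation y = 0 with unit noise in each coordinate, so the posterior SD per coordinate is about 0.71 versus a prior SD of 1. It reports the effective sample size of the normalized likelihood weights, 1 / sum(w^2), as a fraction of N. The function name `ess_fraction` is made up for this example.

```python
import numpy as np

def ess_fraction(d, n_draws, rng):
    """Effective sample size of likelihood weights, as a fraction of n_draws.

    Assumed toy model: theta ~ Normal(0, I_d) prior, one observation
    y = 0 with y ~ Normal(theta, I_d).
    """
    theta = rng.normal(0.0, 1.0, size=(n_draws, d))  # draws from the prior
    log_lik = -0.5 * (theta ** 2).sum(axis=1)        # log p(y = 0 | theta), constants dropped
    w = np.exp(log_lik - log_lik.max())              # stabilize before normalizing
    w /= w.sum()
    return 1.0 / np.sum(w ** 2) / n_draws

rng = np.random.default_rng(1)
N = 20_000
frac = {d: ess_fraction(d, N, rng) for d in (1, 10, 50, 100)}

for d, f in frac.items():
    print(f"d={d:>3}: ESS/N = {f:.6f}")
```

In this toy setup the weights are well behaved in one dimension, but the effective sample size collapses as d grows even though each marginal posterior is only modestly narrower than its prior, because a draw must land in the high-likelihood region in every coordinate at once.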