I’m using importance sampling, via the loo package, to correct for an approximation of the likelihood. Using log ratios, I get (hopefully) improved Monte Carlo estimators, using
log_ratios <- fit$draws("log_ratios")
psis_fit <- psis(log_ratios, r_eff = relative_eff(log_ratios))
mean <- E_loo(parm_draw, psis_fit, log_ratios = log_ratios, type = "mean")$value
In addition to the mean, I compute sample estimators of the variance and the quantiles. In my particular example, I trust the approximate likelihood and expect the original Monte Carlo estimators and the IS estimators to be in good agreement. This is true in all cases, except for the variance.
Whatever is going on, the column title should be changed. This is obviously not the posterior variance of the parameters given the 90% posterior intervals.
Also, we should always be reporting on the sd scale, not the var scale.
The good news is that the IS didn’t really change anything as far as I can see, which makes sense given the very low k-hat estimates.
Also, here’s are internal functions used to calculate this (w are the IS weights, not on log scale). Nothing jumps out at me as obviously wrong, but I’m a bit pressed for time at the moment so I could be overlooking a bug here (or in the code that wraps these functions):
I couldn’t quite follow that code, but given it’s dividing by r_eff, it’s not variance. From the code, it looks like an importance sampled estimate of squared standard error.
P.S. Calculating variance with x - mean(x) can be unstable due to catastrophic cancellation if the value of the variable is much larger than its standard deviation. Having said that, I think Stan’s calculations suffer from the same problem and it’s only a problem with edge cases.
I agree that’s what it looks like. It’s been years now so I actually forget the original intent, but hopefully @avehtari will remember and can tell us if it’s just a naming issue or if we’re not computing what we originally intended.
I hadn’t thought about that before but yeah that makes sense!
Oops. It’s no computing what we originally intended. It’s not just the r_eff part (which by default has no effect as by default it’s 1), but the sum(w^2 * r) part is clearly from the variance for the mean part. .wvar should return
@charlesm93 I think you said you wanted to use this in a workshop and I’m not sure when we’ll submit this to CRAN (hopefully soon), but the fix is now merged and available by installing the master branch from GitHub: remotes::install_github("stan-dev/loo"). So if we don’t get it on CRAN in time for your class you can have them install from GitHub with that one line of code.