Calculating elpd manually for observations with a high Pareto k diagnostic

This is a follow-up to a previous post: Using loo_moment_matching() with an rjags object

Rather than using loo_moment_match() as a remedy for high Pareto k diagnostics, I want to manually calculate the expected log predictive density (elpd) for the problematic observations, because the models are fit in JAGS rather than Stan. I will then combine the manually calculated elpd estimates with the PSIS-LOO estimates to stack some models.

My question is, do I

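  1. compute the log-density \log p(y_i \vert \theta^{(s)}) for each posterior draw \theta^{(s)} from the fit without y_i and take the mean, \frac{1}{S}\sum_{s=1}^S \log p(y_i \vert \theta^{(s)}), or
  2. compute the density p(y_i \vert \theta^{(s)}) for each draw, take its mean \frac{1}{S}\sum_{s=1}^S p(y_i \vert \theta^{(s)}) as an estimate of p(y_i \vert y_{-i}), and then take the log?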

I ask because ‘elpd’ sounds like the mean of the log-predictive density, but I have read elsewhere that elpd is instead the log of the mean of the predictive density (which sounds like log-expected predictive density ‘lepd’).

Does this vignette help? "Holdout validation and K-fold cross-validation of Stan programs with the loo package" (loo package documentation)

It does! It looks like the answer is (2) in my original post; i.e. calculate the expected predictive density and then take its log. Many thanks!
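
For anyone landing here later, here is a minimal sketch of option (2) for a single held-out observation. It is written in Python purely for illustration, and it assumes you already have the values \log p(y_i \vert \theta^{(s)}) evaluated at posterior draws from a fit that excludes y_i. The function names and the numbers are hypothetical; nothing here comes from the loo package or JAGS. The log of the mean is computed on the log scale (the log-sum-exp trick) so that very small densities do not underflow.

```python
import math


def log_mean_exp(log_values):
    """Return log(mean(exp(log_values))), computed stably on the log scale."""
    m = max(log_values)
    if m == -math.inf:
        # Every density is zero, so the log of the mean is -inf.
        return -math.inf
    # Subtract the maximum before exponentiating to avoid underflow.
    total = sum(math.exp(v - m) for v in log_values)
    return m + math.log(total / len(log_values))


def elpd_heldout(log_lik_draws):
    """elpd contribution of one held-out observation y_i (option 2).

    log_lik_draws holds log p(y_i | theta^(s)) for s = 1..S. Assumption:
    the theta^(s) are posterior draws from a fit that leaves out y_i.
    """
    return log_mean_exp(log_lik_draws)


# Hypothetical log-likelihood values, for illustration only.
log_lik_draws = [-1.2, -0.8, -2.5, -1.0]

elpd_i = elpd_heldout(log_lik_draws)                      # option (2)
mean_of_logs = sum(log_lik_draws) / len(log_lik_draws)    # option (1)
```

By Jensen's inequality, the mean of the logs from option (1) can never exceed the log of the mean from option (2), so the two are not interchangeable.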
