Question regarding a paragraph about high-level LOO vs. WAIC intuition from an unpublished manuscript (Vehtari and Gelman, 2014)

Hello out-of-sample predictive accuracy experts,

A long time ago, I came across this line in an unpublished manuscript from Vehtari and Gelman (traces attesting to the existence of said manuscript can now only be found on CiteSeer) that highlighted a difference between the WAIC and LOO prediction tasks at a high level:

In practice, when there is a difference between WAIC and LOO as here with large data scaling factors, the modeler should decide whether the goal is predictions for new schools or for these same eight schools in a hypothetical replicated experiment. In the first case the modeler should use LOO and in the second case the modeler should use WAIC.

When I read this years ago, during my initial exposure to Bayesian predictive accuracy metrics, I found it an intuitive line that helped me process what was going on. However, revisiting it now, I'm having trouble zeroing in on which equations and expressions in Gelman et al., 2014 and Vehtari et al., 2017 underlie the above intuition. In Gelman et al., 2014, I see that the first three terms of the WAIC Taylor expansion match those of LOO, and I'm not sure how the differing fourth terms alter the practical purposes of each metric. I'd appreciate it if someone could point me to the relevant equations in those or other publications. Thanks!

Edit: supposing I should tag @avehtari for this.


That paragraph was a mistake due to some misunderstanding, and you can ignore it. The issue is more complicated than that. It would be great if we could update our old papers, in the same way we can make new software releases.


Thank you for the clarification. In terms of a high-level intuitive comparison between PSIS-LOO and WAIC, is there an updated perspective you would take? Or, would it be accurate to broadly state that WAIC is simply a less accurate metric for out-of-sample predictive accuracy than LOO?

See CV-FAQ How are LOO and WAIC related?

Since you write just "LOO", it's good to note that LOO can be computed with different approaches; see CV-FAQ The computational method used to compute leave-one-out predictive distributions.
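To make the distinction concrete, here is a toy sketch (not from the thread, and deliberately omitting the Pareto smoothing of PSIS) of how WAIC and a plain importance-sampling LOO estimate can both be computed from the same pointwise log-likelihood matrix. The normal model, the posterior draws, and all variable names are illustrative assumptions, not anyone's actual analysis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy setup: N observations, S posterior draws of a normal mean.
# ll[s, i] = log p(y_i | theta_s), the pointwise log-likelihood matrix.
N, S = 8, 4000
y = rng.normal(0.0, 1.0, size=N)
theta = rng.normal(0.0, 0.5, size=S)
ll = -0.5 * np.log(2 * np.pi) - 0.5 * (y[None, :] - theta[:, None]) ** 2

def logmeanexp(x, axis=0):
    """Numerically stable log of the mean of exp(x) along an axis."""
    m = x.max(axis=axis)
    return m + np.log(np.exp(x - m).mean(axis=axis))

def elpd_waic(log_lik):
    """WAIC: pointwise lppd minus the posterior-variance penalty p_waic."""
    lppd = logmeanexp(log_lik, axis=0)
    p_waic = log_lik.var(axis=0, ddof=1)
    return (lppd - p_waic).sum()

def elpd_loo_is(log_lik):
    """Plain importance-sampling LOO with weights 1/p(y_i | theta_s).
    PSIS-LOO additionally smooths these weights to stabilize the estimate."""
    return -logmeanexp(-log_lik, axis=0).sum()

print("elpd_waic:", elpd_waic(ll))
print("elpd_loo (plain IS):", elpd_loo_is(ll))
```

Both estimates penalize the within-sample lppd, but they get there differently: WAIC subtracts a variance term, while IS-LOO reweights draws to approximate each leave-one-out predictive density, which is what PSIS then makes robust.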



Is this also true for the following excerpt from page 180 of BDA3?

Cross-validation: For this example it is impossible to cross-validate the no-pooling model as it would require the impossible task of obtaining a prediction from a held-out school given the other seven. This illustrates one main difference to information criteria, which assume new prediction for these same schools and thus work also in no-pooling model.


Yes. WAIC fails silently: it gives an answer that looks useful, but further investigation revealed that its interpretation is not clear. I can edit the online version to correct this. Sorry for the confusion.
