LOO/WAIC Model Difference Credibility Intervals

PsychPhD · September 4, 2017, 4:07am

Hi,

I was wondering if it was possible to derive/estimate credibility intervals for the LOO/WAIC ELPD difference between models?

Symmetric frequentist 1.96*SE confidence intervals don’t quite seem appropriate when comparing Bayesian models!

Cheers,
Andrew

avehtari · September 4, 2017, 7:41am

Yes.

Given a Gaussian model of the differences, a weakly informative prior, and n > 20, the sufficient statistics for the Bayesian posterior of the expected difference are practically the same as the frequentist standard error. Even if the distribution of the differences is not Gaussian, in most cases with large n the distribution of the expected difference is close to Gaussian (CLT). Thus you can interpret the SE given by the loo package as a measure of Bayesian posterior uncertainty.
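
As a rough illustration of that interpretation, here is a minimal Python sketch. It assumes the SE of the difference is computed as sqrt(n) times the sample standard deviation of the pointwise elpd differences, and that a Normal approximation to the posterior is acceptable. The function name `elpd_diff_summary` and the simulated pointwise values are hypothetical stand-ins, not output from the loo package.

```python
import numpy as np

def elpd_diff_summary(elpd_a, elpd_b, z=1.96):
    """Return the summed elpd difference, its SE, and a Normal-approximation interval.

    elpd_a, elpd_b: pointwise elpd_i values for two models on the same n observations.
    """
    d = np.asarray(elpd_a, dtype=float) - np.asarray(elpd_b, dtype=float)
    n = d.size
    diff = d.sum()                    # estimated elpd difference (sum over observations)
    se = np.sqrt(n) * d.std(ddof=1)   # assumed form: sqrt(n) * sample sd of pointwise diffs
    return diff, se, (diff - z * se, diff + z * se)

# Simulated pointwise values, purely for illustration
rng = np.random.default_rng(1)
pointwise_a = rng.normal(-1.0, 0.5, size=200)
pointwise_b = pointwise_a - rng.normal(0.05, 0.3, size=200)

diff, se, interval = elpd_diff_summary(pointwise_a, pointwise_b)
print(f"elpd_diff = {diff:.2f}, SE = {se:.2f}, "
      f"approx. 95% interval = ({interval[0]:.2f}, {interval[1]:.2f})")
```

Under the assumptions above, the interval can be read as an approximate 95% posterior interval for the expected difference, with the calibration caveats discussed in this thread.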

It would be possible to use other models, too, and I’ve also used a non-parametric Dirichlet distribution model (aka the Bayesian bootstrap), but in many cases the resulting differences are not that big compared to other sources of error. Whatever the interpretation is, there is a complication: the pointwise elpd_i’s are not independent, and it’s difficult to model that dependency, so the uncertainty estimates are not perfectly calibrated. See also my comments in the thread “Interpreting elpd_diff - loo package”.
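
For readers unfamiliar with the Bayesian bootstrap, the following is a minimal sketch of the general technique applied to pointwise differences. It is not the code used for the results mentioned above. It assumes flat Dirichlet(1, ..., 1) weights over the observations and treats the pointwise differences as exchangeable, so it does not address the dependency issue. The function name `bayesian_bootstrap_sum` and the simulated data are hypothetical.

```python
import numpy as np

def bayesian_bootstrap_sum(d, draws=4000, seed=0):
    """Draw from the Bayesian-bootstrap posterior of n * E[d].

    d: pointwise elpd differences (length n).
    Each draw reweights the observations with Dirichlet(1, ..., 1) weights.
    """
    d = np.asarray(d, dtype=float)
    n = d.size
    rng = np.random.default_rng(seed)
    w = rng.dirichlet(np.ones(n), size=draws)  # shape (draws, n), each row sums to 1
    return n * (w @ d)                         # weighted mean scaled to the sum

# Simulated, skewed pointwise differences, purely for illustration
rng = np.random.default_rng(2)
d = rng.standard_t(df=4, size=100) * 0.3 + 0.05

samples = bayesian_bootstrap_sum(d)
lower, upper = np.percentile(samples, [2.5, 97.5])
print(f"elpd_diff = {d.sum():.2f}, 95% interval = ({lower:.2f}, {upper:.2f})")
```

Unlike the Gaussian approximation, the interval from these draws does not have to be symmetric around the point estimate.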

It seems possible to improve on the simplest Gaussian model of the differences, but with small to moderate n the problem of accurately modeling the tails of the difference distribution remains (and with large enough n, the Gaussian model works well enough).

Note that we could get a well-calibrated estimate of the distribution of the differences if we knew the true model (which is also the assumption made in frequentist hypothesis testing). We use cross-validation specifically in those cases where we don’t know the true model, suspect that the models we have might be quite far from the true model, and value robustness to model misspecification.