OK thanks! I’ve never really written any code at all (other than specifying models for off-the-shelf R packages), but I will try my best.
What I was asking above (and sorry for being unclear again!) is how bad would it be to compare the two models using non-pointwise WAIC (or DIC); i.e., the value that comes with the model summary:
Log-likelihood at expected values: -7114.01
Deviance: 14228.02
DIC: 14823.37
Effective number of parameters (pD): 297.68
WAIC (SE): 14828.32 (170.9)
pWAIC: 292.09
I understand it’s far less accurate than the pointwise version, but is it “good enough” for comparing two models that should be very different?
That value is the sum of the pointwise values. There is no non-pointwise version of DIC/WAIC (in theory there could be, but it would fail so often that it’s best to forget the idea).
Update: this is on its way with the new loo and rstanarm packages we’re releasing next week (both already on GitHub). There will be an example at help("loo", package = "rstanarm") when it comes out.
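For what it’s worth, here is a minimal sketch of what the pointwise comparison looks like with loo and rstanarm. All names (fit1, fit2, y, x1, x2, dat) are hypothetical, and in current loo versions the comparison function is loo_compare():

library(rstanarm)
library(loo)

# Hypothetical models: same data, with and without one predictor
fit1 <- stan_glm(y ~ x1 + x2, data = dat)
fit2 <- stan_glm(y ~ x1, data = dat)

# Pointwise PSIS-LOO estimates (preferred over plain WAIC)
loo1 <- loo(fit1)
loo2 <- loo(fit2)

# The SE of the elpd difference is computed from the pointwise values,
# which the summed WAIC/DIC printed in the model summary cannot provide
loo_compare(loo1, loo2)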
I know this is a very old post, but I am stuck on this very problem.
Apologies if this is formulated in a really unwieldy way. I have two (related) questions:
Does this effectively mean that LOO-CV is not well suited for comparing two models that differ minimally (e.g., with vs. without a main-effect predictor) but that have multiple observations within clusters?
Condition is a factor with two levels: Baseline and Distraction
The theory predicts that participants in the Distraction condition will react slower.
In order to investigate this, can I just “use [m1] and look at the marginal posterior of the effect”, e.g. in a psycholinguistics journal? Or do I “need” to do a LOO-CV comparison?
You can also start a new thread and refer to an old post.
It is well suited if you are interested in the difference in predictive performance and leave-one-out approximates your prediction task well. You may need a different cross-validation structure if you are, for example, interested in predicting jointly for future data arriving in clusters.
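If you do need the clustered prediction task, one option is K-fold cross-validation with folds defined by cluster. A rough sketch, assuming an rstanarm fit fit1 and a participant grouping variable in dat (names and K are illustrative):

library(loo)

# Keep all of a participant's observations in the same fold, so each
# held-out fold contains whole clusters (approximating prediction for
# new participants rather than for new trials of known participants)
folds <- kfold_split_grouped(K = 10, x = dat$participant)

# rstanarm refits the model K times, leaving out one fold at a time
kf <- kfold(fit1, folds = folds)
kf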
I don’t have any information about what psycholinguistics journals expect. If you are interested in the magnitude of the effect, and not in how well either model predicts the reaction time, then looking at the posterior is sensible. As you have just one unknown condition parameter, looking at the marginal can be informative, as the posterior dependency with the participant effects is likely to be small.
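As a concrete illustration of looking at that marginal posterior, assuming a hypothetical rstanarm fit fit1 of rt ~ condition + (1 | participant), where the coefficient would be named conditionDistraction:

# 95% posterior interval for the condition effect
posterior_interval(fit1, pars = "conditionDistraction", prob = 0.95)

# Or work with the draws directly, e.g., the posterior probability
# that Distraction slows reaction times
draws <- as.matrix(fit1, pars = "conditionDistraction")
mean(draws > 0)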