Where did loo difference estimates go?

Hi @paul.buerkner and friends,

With the brms version 2.8.0 update, is it no longer possible to get a loo difference estimate for two models? I’ve been playing around with loo(), add_criterion() and loo_compare() and the closest I’m getting is elpd_diff. Same deal for waic.

I miss my loo difference estimates.

Wistfully,
Solomon

@avehtari didn’t like them anymore ;-)

To bring them back, use

print(loo_compare(...), simplify = FALSE)

You can also see that the entire matrix is actually returned if you look at str(loo_compare(...)).
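
For instance, here’s a minimal sketch (assuming fit1 and fit2 are brmsfit objects that already have the loo criterion added):

comp <- loo_compare(fit1, fit2, criterion = "loo")
str(comp)                      # the full comparison matrix is stored here
print(comp, simplify = FALSE)  # prints every column, not just elpd_diff and se_diff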

I guess you are looking for something else, as elpd_diff is the loo difference.

Thank you for chiming in, @avehtari. I don’t think I follow. The elpd_diff appears to be the difference between the two elpd_loo estimates. In my case, that’s -116.1 - -111.8 = -4.3. That difference estimate is clearly in a different metric than what one might get from the looic estimates, in my case 232.1 - 223.5 = 8.6. So if you’re calling the elpd_diff the “loo difference”, then yes, I’m looking for something else, namely the difference score in the metric of the loo estimates, which appears to be the looic.

Thanks for the direction, @paul.buerkner. When I use the print(loo_compare(...), simplify = FALSE) method, I get a wealth of output, including the looic estimates and their standard errors. I can manually compute a difference score for the estimates, but I don’t see a clear way to get the standard error for that difference. What am I missing?

Multiply elpd_diff by -2 and the corresponding standard error by 2.
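
For example, a quick sketch (assuming ic holds the matrix returned by loo_compare() for your two models):

ic <- loo_compare(fit1, fit2, criterion = "loo")
looic_diff    <- ic[, "elpd_diff"] * -2  # difference on the looic scale
looic_diff_se <- ic[, "se_diff"]   *  2  # standard error of that difference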


That’s easy enough. Thanks, @paul.buerkner.

I understand these changes in the brms output are in accordance with @avehtari’s recommendations, which seems like a good thing. But from a user perspective, the change seems abrupt. Are they spelled out somewhere in the brms documentation?

The changes primarily happened in loo, not in brms.

Is there a place in the loo documentation that the brms documentation might direct brms users to in order to help them understand the changes?

Not sure. Maybe @jonah knows.

Okay. Thank you for the networking.

I’ll just add that the problem with the previous output was that there were values which were exactly the same except for the multiplier -2. So the output had more to read, but no actual additional information. As we wanted to add more useful information, we decided to drop the columns multiplied by -2. At the same time, there have been attempts to make the brms, rstanarm, and rstan loo outputs consistent. Sorry for the lack of communication. I had been waiting for the new rstanarm version before advertising the new loo, but the new brms release came first. I don’t personally like multiplying log densities by -2, as it is arbitrary in the Bayesian context (it used to make some sense for Gaussian models and maximum likelihood).

That all makes sense. Thank you for the clarification.

If no one minds, I’d like to share a link to this exchange on Twitter. It might help other confused users.

Sure, go ahead. This is a public forum, and it’s great if you find something useful to share. loo 2.1.0 was recently released, and the NEWS file mentions several changes, but unfortunately it doesn’t explicitly mention that something was removed from the default output.


Thanks, Aki, for providing the explanation! Sorry, @Solomon, for my brief and barely helpful answers today; I sent them in the middle of some talks. Perhaps we could add Aki’s explanation somewhere in the documentation so that users can understand the history of our changes.

I had to release brms earlier than planned because it was failing under R-devel on CRAN, and they required me to do a new release.

No problem, @paul.buerkner. We all have day jobs.

I think linking to one of @avehtari’s talks is a great idea. I’ve linked to his GPSS2017 workshop presentation, On Bayesian model selection and model averaging, before. Would the Model assessment, selection and averaging talk from November 2018 be a better choice?


Much better, as it’s more up to date.


Now that I’ve had some time to play with the new brms/loo workflow, I thought it might be helpful to others to see what it can look like in practice. Here I’ll compare fit1 and fit2 from @avehtari and @jonah’s vignette, Bayesian Stacking and Pseudo-BMA weights using the loo package, which are themselves based on models from McElreath’s Statistical Rethinking, Chapter 6.

First, load the packages and format the data.

library(tidyverse)
library(rethinking)  # provides the milk data
library(brms)
data(milk)

milk <- 
  milk %>%
  drop_na() %>%  # keep complete cases only (17 rows), as in the vignette
  mutate(neocortex = neocortex.perc / 100)

Fit the two models.

fit1 <- 
  brm(data = milk, family = gaussian,
      kcal.per.g ~ 1,
      seed = 1)

fit2 <- 
  brm(data = milk, family = gaussian,
      kcal.per.g ~ 1 + neocortex,
      seed = 1)

Here we’ll compute the LOO and WAIC for both models and then save the results within the brmsfit objects.

fit1 <- add_criterion(fit1, c("waic", "loo"))
fit2 <- add_criterion(fit2, c("waic", "loo"))

You can access the results like this.

fit1$loo
Computed from 4000 by 17 log-likelihood matrix

         Estimate  SE
elpd_loo      4.4 1.9
p_loo         1.3 0.3
looic        -8.8 3.7
------
Monte Carlo SE of elpd_loo is 0.0.

All Pareto k estimates are good (k < 0.5).
See help('pareto-k-diagnostic') for details.
fit2$waic
Computed from 4000 by 17 log-likelihood matrix

          Estimate  SE
elpd_waic      3.6 1.6
p_waic         1.9 0.3
waic          -7.2 3.2

To compare models by their information criteria, use the loo_compare() function. Since it’ll come in handy in a bit, we’ll first save the results as ic.

ic <- loo_compare(fit1, fit2, criterion = "waic")

Here’s the default output.

print(ic)
     elpd_diff se_diff
fit1  0.0       0.0   
fit2 -0.7       0.6  

However, there’s more information available.

print(ic, simplify = FALSE)
     elpd_diff se_diff elpd_waic se_elpd_waic p_waic se_p_waic waic se_waic
fit1  0.0       0.0     4.4       1.9          1.3    0.3      -8.7  3.7   
fit2 -0.7       0.6     3.6       1.6          1.9    0.3      -7.3  3.2 

Even with all that output, you’ll notice that the difference score and its standard error are in the elpd metric, rather than the conventional information criteria metric. Here’s a quick way to do the conversion.

cbind(waic_diff = ic[, 1] * -2,
      se        = ic[, 2] *  2)
     waic_diff       se
fit1  0.000000 0.000000
fit2  1.416582 1.153017