Clarifying interpretation of loo_compare output

In the following thread, https://discourse.mc-stan.org/t/interpreting-output-from-compare-of-loo/3380/4, Aki Vehtari gives some simple advice on how to interpret the output of the loo_compare() function.

But, predictably for me, I have found a way to doubt myself and make what should be simple complicated.

I have two nested models. The full model is Model A and the simpler model, obtained by removing one predictor from Model A, is Model B.

When I enter them in the following order using this command…

loo_compare(ModelA, ModelB)

…the output I get looks like this.

        elpd_diff se_diff
Model B       0.0     0.0
Model A     -10.1     5.4

Now Aki Vehtari said in his answer that “The difference will be positive if the expected predictive accuracy for the second model is higher”. I assume by extension this means that the difference will be negative if the predictive accuracy of the first model is higher.

But I don’t get which are the first and second models. You’ll notice in my example output that the more complex model is on the bottom row, despite the fact that it was entered into the function first.

So I guess my question is: is the “first model” the first one entered into the function, or the first as reported in the output (i.e. the top row)?

A second question is: how does the loo_compare() function decide which model goes in the top or bottom row? I have found that sometimes the order of the rows corresponds to the order in which the models were entered into the function, but sometimes the order is reversed.

If Aki’s rule is based on the order in which the models are entered into the function, then everything is simple and we can basically ignore the row names in the output (i.e. elpd_diff/se_diff > 2 means the second model is better and elpd_diff/se_diff < -2 means the first model is better). However, I am not sure whether this is the rule.

A little confused.

Model B is, relatively speaking, better. However, the difference is not even 2 SE (5.4 \times 2 = 10.8, which is larger than 10.1), so there’s not much difference here. E.g., if we assume z_{95\%} = 1.96, then,

-10.1 + c(-1, 1) * 1.96 * 5.4
[1] -20.684   0.484

i.e., it crosses 0.


The rows are sorted so that the best performing model is at the top. The same sort order is used if there are more than two models. The best performing model is then used as the common comparison point.
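As an illustration (not from the original thread), here is a minimal sketch of this sorting behavior. The fake pointwise log-likelihood matrices below are invented purely for demonstration; with real models you would get these from the fitted model objects instead.

```r
library(loo)
set.seed(1)

# Fake pointwise log-likelihood matrices (posterior draws x observations),
# invented purely to illustrate how loo_compare() orders its rows.
# "Model B" is constructed to have higher log-likelihood on average.
ll_a <- matrix(rnorm(1000 * 50, mean = -1.5), nrow = 1000)  # "Model A"
ll_b <- matrix(rnorm(1000 * 50, mean = -1.0), nrow = 1000)  # "Model B"

loo_a <- loo(ll_a, r_eff = rep(1, 50))
loo_b <- loo(ll_b, r_eff = rep(1, 50))

# Even though Model A is entered first, Model B has the higher elpd,
# so it should appear in the top row with elpd_diff = 0 and se_diff = 0.
loo_compare(loo_a, loo_b)
```

The row order depends only on the estimated elpd values, not on the argument order, which is why the order you enter the models sometimes matches the output and sometimes does not.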

As the post you referred to is more than two years old, I also recommend checking the CV-FAQ, which has more information on interpreting se_diff.


In addition to what @torkar and @avehtari said, there’s more info about loo_compare() (and how the models are ordered) in the Details section of the loo_compare() documentation:

If anything is unclear, definitely let us know, because we’re always looking for ways to improve our documentation. Thanks!

Thank you @torkar. I understand that the difference itself is not large enough to warrant a lot of attention. I am more interested in the direction of the effect. So Model B is better because it is in the top row of the output? Is that right?

Yes. Model B performed best, so it is the model that all other models are compared against, as Aki wrote; hence the zeros in the first row.
