In this thread, https://discourse.mc-stan.org/t/interpreting-output-from-compare-of-loo/3380/4, Aki Vehtari gives some simple advice on how to apply the loo_compare() function.
But, predictably for me, I have found a way to doubt myself and make what should be simple complicated.
I have two nested models. The full model is Model A and the simpler model, obtained by removing one predictor from Model A, is Model B.
When I enter them in the following order using this command…
loo_compare(ModelA, ModelB)
…the output I get looks like this.
        elpd_diff se_diff
Model B   0.0       0.0
Model A -10.1       5.4
Now Aki Vehtari said in his answer that “The difference will be positive if the expected predictive accuracy for the second model is higher”. I assume by extension this means that the difference will be negative if the predictive accuracy of the first model is higher.
But I don’t get which are the first and second models. You’ll notice in my example output that the more complex model is on the bottom row, despite the fact that it was entered into the function first.
So I guess my question is: is the first model the first as entered into the function, or the first as reported in the output (i.e. the top row)?
A second question is: how does the loo_compare() function decide which model goes on the top or bottom row? I have found that sometimes the order of rows corresponds to the order in which the models were entered into the function, but sometimes the order is reversed.
If Aki’s rule is based on the order in which the models are entered into the function, then everything is simple and we can basically ignore the row names in the output (i.e. an elpd_diff/se_diff > 2 means the second model is better and an elpd_diff/se_diff < -2 means the first model is better). However, I am not sure if this is the rule.
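To make that rule concrete with my own numbers, here is a minimal base R sketch that only computes the ratio from the output above. The variable names are mine, the threshold of 2 is just the rule of thumb as I understand it, and the code does not call loo_compare() or assume anything about how it orders the rows.

```r
# Values copied by hand from the "Model A" row of the output above
elpd_diff <- -10.1
se_diff <- 5.4

# Ratio used in the rule of thumb described above
ratio <- elpd_diff / se_diff
print(round(ratio, 2))

# Check which side of the +/-2 threshold (my assumption) the ratio falls on
if (ratio > 2) {
  verdict <- "above 2"
} else if (ratio < -2) {
  verdict <- "below -2"
} else {
  verdict <- "between -2 and 2"
}
print(verdict)
```

For my output the ratio is about -1.87, so it does not cross either threshold, but I would still like to know which model the sign refers to.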
A little confused.