That information has been updated in a paper:
- Tuomas Sivula, Måns Magnusson, and Aki Vehtari (2020). Uncertainty in Bayesian leave-one-out cross-validation based model comparison. arXiv preprint arXiv:2008.10296
which is also listed as a reference in CV-FAQ 15: How to interpret Standard error (SE) of elpd difference (elpd_diff)
You may also benefit from the discussion in another thread.
You can ignore this; it’s just for those who want to reproduce the experiments in the Bayesian stacking paper.
If you made this a named list, you could name your models, and the names would show up in the weight output.
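For example, with the loo package in R, a named list might look like the sketch below (the model labels and the loo1, loo2, loo3 objects are placeholders for your own fits):

```r
library(loo)

# loo1, loo2, loo3 are assumed to be psis_loo objects, e.g. from loo(fit1), etc.
# The list names are just example labels; use names that describe your models
loo_list <- list(
  "baseline"   = loo1,
  "covariates" = loo2,
  "hierarch"   = loo3
)

# The names given in the list are printed next to the weights
loo_model_weights(loo_list, method = "stacking")
```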
When there are many models, the weights are easier to use. If you have two nested models, there is a monotonic mapping between the weights and probabilities (more about this coming soonish).
Models 3 and 6 are best, but you can get better predictions by averaging predictions from models 3, 5, and 6.
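One way to do that averaging is sketched below, under the assumptions that fit3, fit5, and fit6 are rstanarm or brms fits, loo3, loo5, and loo6 are their loo objects, and all models have the same number of posterior draws; the idea is to mix posterior predictive draws in proportion to the stacking weights:

```r
library(loo)

# Stacking weights for the three models
w <- loo_model_weights(list(model3 = loo3, model5 = loo5, model6 = loo6),
                       method = "stacking")

# Posterior predictive draws from each model (S x N matrices)
yreps <- list(posterior_predict(fit3),
              posterior_predict(fit5),
              posterior_predict(fit6))

# Mix the draws: each of the S draws comes from model k with probability w[k]
S <- nrow(yreps[[1]])
pick <- sample(seq_along(yreps), size = S, replace = TRUE, prob = as.numeric(w))
yrep_avg <- do.call(rbind, lapply(seq_along(yreps),
                                  function(k) yreps[[k]][pick == k, , drop = FALSE]))
```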
When models are similar, LOO-BB weights are diluted among the models with similar predictive performance. I would guess that models 1, 2, 4, 7, and 8 are somehow similar to each other or to models 3, 5, or 6. When models are similar, stacking weights pick the best among the very similar predictions, but they average over different predictive distributions if none of the predictive distributions is the true data-generating distribution. So in this case models 3, 5, and 6 are making different kinds of predictions, and it can be useful to check how they differ. You may also consider Bayesian hierarchical stacking.
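If by LOO-BB weights we mean the pseudo-BMA+ weights computed with the Bayesian bootstrap, you can compare the two behaviors directly on the same loo objects (a sketch, reusing the hypothetical loo_list from above):

```r
# Stacking weights: optimize the combined predictive distribution,
# so near-duplicate models tend to get weight through one representative
loo_model_weights(loo_list, method = "stacking")

# Pseudo-BMA+ weights (LOO with Bayesian bootstrap):
# models with similar predictive performance share, i.e. dilute, the weight
loo_model_weights(loo_list, method = "pseudobma", BB = TRUE)
```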
Not without additional information about the models. If they are nested, choose the most complex one.