In this case study, the LOO-adjusted R^2 uses the Bayesian bootstrap (in addition to the LOO-CV-based predictions \hat{y}_{\text{loo},n}). In contrast, this answer and brms seem to omit the Bayesian bootstrap.
The case study from above suggests (but does not explicitly say) that the Bayesian bootstrap accounts for the fact that the true data-generating distribution (for y) is unknown. But couldn't one argue that the true data-generating distribution is reflected by the observed data? If yes, wouldn't the Bayesian bootstrap then introduce additional sampling uncertainty (sampling in the frequentist sense of repeating the data-observation process)? And if that is correct, wouldn't it make sense to omit the Bayesian bootstrap (as in the answer linked above and in brms)?
So my question is: Is it incorrect to use the Bayesian bootstrap for the LOO-adjusted R^2, or is the Bayesian bootstrap optional for the LOO-adjusted R^2?
Thanks for your reply. Do I understand your pointer to the paper correctly that the Bayesian bootstrap accounts for the uncertainty which arises when regarding the LOO-adjusted R^2 as a frequentist estimator? In that sense, would it be correct to say that the Bayesian bootstrap is optional for the LOO-adjusted R^2 (depending on whether or not this frequentist uncertainty should be taken into account)?
No. The Bayesian bootstrap, as the name says, is a Bayesian approach. You may have gotten confused because the paper analyses the frequency properties of a Bayesian approach.
The Bayesian bootstrap is one way to estimate the lack of knowledge about the future data distribution. If we assume that the future data distribution is the same as the one the observed data was generated from, then we can use the observed data as a proxy, but as it is just a finite number of observations, there is uncertainty about the actual distribution. The Bayesian bootstrap is equal to having a Dirichlet distribution model for the future data. It's a kind of silly simple model, but it makes minimal assumptions and happens to get the first moments right with easy-to-implement computation. It's optional in the sense that there are other algorithms for the same task…
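For concreteness, here is a minimal R sketch of what this looks like (not the case study's or loo's exact code; the names `y`, `yhat_loo`, and `bb_loo_R2` are my own for illustration). Dirichlet(1, …, 1) weights are drawn for the observations, and the LOO-adjusted R^2 is recomputed under each weight vector:

```r
# Minimal sketch: Bayesian bootstrap over the LOO-adjusted R^2.
# Assumes y (observations) and yhat_loo (LOO-CV predictions
# \hat{y}_{loo,n}, one per observation) are already available.
bb_loo_R2 <- function(y, yhat_loo, n_draws = 4000) {
  n <- length(y)
  e_loo <- y - yhat_loo  # LOO residuals
  sapply(seq_len(n_draws), function(s) {
    # Dirichlet(1, ..., 1) weights via normalized Exp(1) draws
    w <- rexp(n)
    w <- w / sum(w)
    # weighted variances of residuals and data under the weights w
    var_e <- sum(w * (e_loo - sum(w * e_loo))^2)
    var_y <- sum(w * (y - sum(w * y))^2)
    1 - var_e / var_y
  })
}
```

Each returned value corresponds to one plausible future data distribution, so the spread of the draws reflects exactly the finite-sample uncertainty about that distribution.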
Ok, but then this answer and brms (same links as above) do not take into account the uncertainty you are talking about? (I guess we are talking about the same kind of uncertainty; I might just not have expressed it correctly.)
Now I see what I had overlooked in this case study: for the LOO-adjusted R^2, there are not S different values of \hat{y}_{\text{loo}, n}, unlike for the residual-based (and also the model-based) Bayesian R^2, where \hat{y}^{s}_{n} depends on the posterior draw s \in \{1, \dots, S\} (with S denoting the number of posterior draws). Thus, it's clear that the LOO-adjusted R^2 needs a different way to take the finite-sample uncertainty into account. In the case study, I somehow read \hat{y}^{s}_{\text{loo}, n} instead of \hat{y}_{\text{loo}, n}, which made me wonder why the finite-sample uncertainty was accounted for twice. Sorry for wasting your time on this.
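For anyone reading along, the contrast I mean is roughly the following (using the case study's notation from memory, with V_{n=1}^N denoting the sample variance over n; the exact definitions are in the case study itself). The residual-based Bayesian R^2 gives one value per posterior draw s,

R^2_s = \frac{V_{n=1}^{N} \hat{y}^{s}_{n}}{V_{n=1}^{N} \hat{y}^{s}_{n} + V_{n=1}^{N} e^{s}_{n}}, \qquad e^{s}_{n} = y_n - \hat{y}^{s}_{n},

so the posterior draws themselves yield a distribution of R^2 values, whereas the LOO-adjusted R^2 is a single point value,

R^2_{\text{loo}} = 1 - \frac{V_{n=1}^{N} \left( y_n - \hat{y}_{\text{loo}, n} \right)}{V_{n=1}^{N} y_n},

whose uncertainty therefore has to come from elsewhere, e.g. the Bayesian bootstrap.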