R2 vs bayes_R2

nlpacestan · January 23, 2020, 8:56pm

I have estimated stan_lm models using rstanarm.

One of the estimates is labelled R2.

There is also the bayes_R2.stanreg function.

Both the single R2 estimate (with mean, sd, percentiles, etc) and the bayes_R2 vector of estimates come from the posterior draws.

I don’t understand the distinction/purposes of the two R2’s.

I haven’t found similar questions in the forum.

Nathan

bbbales2 · January 23, 2020, 9:31pm

I don’t know the difference, but check here: http://hbiostat.org/papers/rms/accuracy/bayes/gel18r2.pdf

bgoodri · January 24, 2020, 12:42am

The bayes_R2 function can be called on any generalized linear model (stan_glm), even those with group-specific parameters (stan_glmer). For the particular function stan_lm, the R^2 is a primitive parameter whose posterior distribution is being estimated, whereas bayes_R2 is essentially a generated quantity rather than a primitive parameter. Either way, they are referring to the same concept and should have the same distribution theoretically.
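To make the "generated quantity" idea concrete, here is a minimal base R sketch of computing an R^2 value per posterior draw. It is not rstanarm's code: the draws are simulated stand-ins (all names are hypothetical), and I am assuming the variance-of-fit over variance-of-fit-plus-residual-variance definition from the paper linked above, with sigma^2 as the residual variance.

```r
set.seed(123)
n_obs   <- 200
n_draws <- 1000

# Hypothetical design matrix: intercept plus one predictor
x <- cbind(1, rnorm(n_obs))

# Hypothetical posterior draws (stand-ins for what as.matrix(fit) would give)
beta_draws  <- cbind(rnorm(n_draws, 1.0, 0.05),
                     rnorm(n_draws, 0.8, 0.05))
sigma_draws <- abs(rnorm(n_draws, 1.0, 0.03))

# Linear predictor for every draw: an n_draws x n_obs matrix
mu <- beta_draws %*% t(x)

# One R^2 per draw: variance of the fitted values across observations,
# divided by that variance plus the residual variance (assumed sigma^2)
var_fit  <- apply(mu, 1, var)
r2_draws <- var_fit / (var_fit + sigma_draws^2)

# Summarise like any other posterior quantity
quantile(r2_draws, c(0.05, 0.5, 0.95))
```

The point is only that r2_draws is computed after the fact from the other draws, whereas in stan_lm the R^2 is itself one of the sampled parameters.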

mcol · January 24, 2020, 12:53am

If you find bayes_R2 useful and want to report it in your work, I suggest you try loo_R2 too, which tends to suffer less from potential overfitting. See https://avehtari.github.io/bayes_R2/bayes_R2.html for more info.

nlpacestan · January 24, 2020, 11:34pm

Thanks for the several comments. I have looked at the Am Stat paper with supplement.

My stan_lm model is log(continuous response) ~ 33 predictors (15 covariates and factors); the posterior sample size is 32000.

The “primitive” parameter R2 is 0.4 with 90% CI (0.4, 0.4), sd 0.0, mcse 0.0 and Rhat 1.001.

I attempted to use bayes_R2 and loo_R2 on the fitted object.

I am working in a container with 120 Gb of memory on a Linux server running R 3.6.1 and rstanarm 2.19.2.

Both the bayes_R2 and the loo_R2 calls returned:

Error: cannot allocate vector of size 263.3 Gb.

I will use the R2 reported with the fit.

Is the bayes_R2 error expected with the posterior sample size of my fit?

Nathan

mcol · January 24, 2020, 11:49pm

Could you try running bayes_R2 again and typing traceback() immediately after the error? I’d like to see exactly where this allocation is being attempted.

nlpacestan · January 25, 2020, 12:11am

StandardizedOME.stan_lm.5c.R2 <- bayes_R2(StandardizedOME.stan_lm.5c)
Error: cannot allocate vector of size 263.3 Gb

traceback()
7: linear_predictor.matrix(beta, x, data$offset)
6: linear_predictor(beta, x, data$offset)
5: pp_eta(object, data = dat, draws = draws)
4: posterior_linpred.stanreg(object, transform = TRUE, re.form = re.form)
3: posterior_linpred(object, transform = TRUE, re.form = re.form)
2: bayes_R2.stanreg(StandardizedOME.stan_lm.5c)
1: bayes_R2(StandardizedOME.stan_lm.5c)

nlpacestan · January 25, 2020, 12:16am

Here it is for loo_R2

StandardizedOE.stan_lm.5c <- loo_R2(StandardizedOME.stan_lm.5c)
Error: cannot allocate vector of size 263.3 Gb

traceback()
5: vapply(seq_len(args$N), FUN = function(i) {
as.vector(fun(data_i = args$data[i, , drop = FALSE], draws = args$draws))
}, FUN.VALUE = numeric(length = args$S))
4: log_lik.stanreg(object)
3: log_lik(object)
2: loo_R2.stanreg(StandardizedOME.stan_lm.5c)
1: loo_R2(StandardizedOME.stan_lm.5c)
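Both tracebacks stop at a step that builds something with one row per posterior draw and one column per observation, so the identical 263.3 Gb figure looks consistent with a dense draws-by-observations matrix. A back-of-the-envelope check in base R, assuming 8-byte doubles and that R's "Gb" means 1024^3 bytes (the number of observations is not stated in the thread, so the implied value below is an inference only):

```r
# Figures taken from the posts above
n_draws  <- 32000   # posterior sample size
alloc_gb <- 263.3   # size reported in the error message

# Assumptions: dense matrix of doubles, "Gb" = 1024^3 bytes
bytes_per_double <- 8
n_elements    <- alloc_gb * 1024^3 / bytes_per_double
n_obs_implied <- n_elements / n_draws
round(n_obs_implied)

# Helper (hypothetical name): memory in Gb for a draws x observations matrix
matrix_gb <- function(n_draws, n_obs) {
  n_draws * n_obs * bytes_per_double / 1024^3
}

# Same data with a smaller posterior sample, for comparison
matrix_gb(4000, n_obs_implied)
```

Under these assumptions the data would have roughly 1.1 million rows, and even 4000 draws would need about 33 Gb for that one matrix, so the error would be unsurprising at this posterior sample size.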

bgoodri · January 25, 2020, 12:44am

The R2 reported with the fit is all you need. The bayes_R2 is no better an estimate, and loo_R2 tends to differ only with small datasets. You can get a second decimal place by doing print(fit, digits = 2).
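For anyone who still wants a per-draw R^2 vector on a large dataset, one possible way around the memory problem is to use the identity that the variance of x_i' b across observations equals b' cov(X) b, so the draws-by-observations matrix is never formed. The sketch below is base R on simulated stand-ins (hypothetical names), is not what rstanarm does internally, and again assumes sigma^2 as the residual variance.

```r
set.seed(1)
n_obs   <- 500
n_pred  <- 3
n_draws <- 200

# Hypothetical predictors (no intercept column: a constant shift
# does not change the variance of the fitted values)
x <- matrix(rnorm(n_obs * n_pred), n_obs, n_pred)

# Hypothetical posterior draws of slopes and residual sd
beta_draws  <- matrix(rnorm(n_draws * n_pred, sd = 0.5), n_draws, n_pred)
sigma_draws <- abs(rnorm(n_draws, 1, 0.05))

# Direct route: needs the full n_draws x n_obs matrix
var_fit_direct <- apply(beta_draws %*% t(x), 1, var)

# Quadratic-form route: only n_pred x n_pred and n_draws x n_pred objects
cov_x        <- cov(x)
var_fit_quad <- rowSums((beta_draws %*% cov_x) * beta_draws)

# The two routes agree up to floating-point error
all.equal(var_fit_direct, var_fit_quad)

r2_draws <- var_fit_quad / (var_fit_quad + sigma_draws^2)
```

With 33 predictors and 32000 draws the largest object on the quadratic-form route is the draws-by-predictors matrix, which is a few megabytes.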
