Bayesian R-squared posterior variance

Joaquin_Martinez-Minaya · April 7, 2021, 4:05pm

Hi!

I am working with Bayesian models using brms, and I am very interested in understanding the Bayesian R-squared proposed by Gelman et al. (2019). A really good explanation is given in "Bayesian R2 and LOO-R2".

In my attempt to understand how this Bayesian R-squared works, I have run several simulations and fits with different levels of accuracy. In particular, I use the same response variable for all the models, but with different covariates. Here is the code:

library(brms)
library(dplyr)
library(ggplot2)

### data.frame1
set.seed(10)
n <- 100
x1 <- rnorm(n, 30, 1)

set.seed(100)
y <- 2*x1 + rnorm(n, sd = 0.1)
data1 <- data.frame(x = x1, y = y)
fit1 <- brm(y ~ x, data = data1,
            iter = 1000,
            cores = 2,
            chains = 2)
plot(data1$x, data1$y)

fit <- list(fit1)
# Standard deviations of the noise added to the covariate
standar_devs <- c(seq(0.1, 0.7, 0.2), 1, 10, 100)
for (j in seq_along(standar_devs)) {
  set.seed(j * 100)
  x <- x1 + rnorm(n, sd = standar_devs[j])
  data2 <- data.frame(x = x, y = y)
  fit[[j + 1]] <- brm(y ~ x, data = data2,
                      iter = 1000,
                      cores = 2,
                      chains = 2)
}

### Bayes_R2: one column of posterior draws per fitted model
r_squared_test4 <- fit %>%
  sapply(function(x) brms::bayes_R2(x, summary = FALSE)) %>%
  as.data.frame() %>%
  tidyr::pivot_longer(cols = dplyr::everything(),
                      names_to = "dataset", values_to = "R_squared")

r_squared_test4$dataset <- as.factor(r_squared_test4$dataset)

p_test4 <- r_squared_test4 %>%
  ggplot( aes(x = R_squared, fill = dataset)) +
  geom_histogram( color="#e9ecef", alpha=0.6, position = 'identity', bins = 100) +
  theme_bw() +
  labs(fill="") +
  xlim(c(0,1.01))
p_test4

Here is the corresponding plot:

In the plot (attached), we can see that the greater the R-squared, the smaller the variability of its posterior distribution. I was wondering whether this always holds, or am I missing something?

Many thanks for your time,

Best,
Joaquín.
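
To put a number on the spread seen in the histogram, the posterior draws of R-squared can also be computed by hand and summarised by their mean and standard deviation. The following is a minimal sketch, assuming the residual-based definition from Gelman et al. (2019), in which each draw is var(fit) / (var(fit) + var(residuals)). The helper `bayes_r2_draws` and the toy inputs are hypothetical and not part of brms.

```r
# Hypothetical helper (not part of brms): residual-based Bayesian R^2,
# returning one value per posterior draw.
#   y     : numeric vector of observed responses (length n)
#   ypred : matrix of expected-value predictions, draws in rows,
#           observations in columns
bayes_r2_draws <- function(y, ypred) {
  stopifnot(is.matrix(ypred), ncol(ypred) == length(y))
  resid <- sweep(ypred, 2, y)        # ypred - y; the sign does not affect the variance
  var_fit <- apply(ypred, 1, var)    # variance of the fitted values, per draw
  var_res <- apply(resid, 1, var)    # variance of the residuals, per draw
  var_fit / (var_fit + var_res)
}

# Toy check with made-up "draws": predictions that reproduce y exactly
# leave no residual variance, predictions that are constant explain nothing.
set.seed(1)
y_toy <- rnorm(20)
r2_exact <- bayes_r2_draws(y_toy, matrix(y_toy, nrow = 5, ncol = 20, byrow = TRUE))
r2_const <- bayes_r2_draws(y_toy, matrix(0.5, nrow = 5, ncol = 20))

# With the fits above (assumes brms is loaded and fit1 exists):
# r2 <- bayes_r2_draws(data1$y, brms::posterior_epred(fit1))
# c(mean = mean(r2), sd = sd(r2))
```

Comparing `sd(r2)` across the fitted models gives a direct measure of the narrowing that the histogram shows.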

avehtari · April 7, 2021, 6:08pm

Probably quite often. For example, in your simulation you'll get a high R^2 only if the residual sigma is small, and then the coefficients are also determined with small variance. You could try examples with the number of covariates close to the number of observations; then you could sometimes get a high R^2 because, by chance, the observations would lie on a lower-dimensional plane (but then you should compute LOO-R^2 anyway).
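
Both points can be explored without MCMC in a rough sketch like the one below. It rests on several assumptions that are not in the thread: a Gaussian linear model with the noninformative prior p(beta, sigma^2) proportional to 1/sigma^2 (not the brms default), which allows exact posterior draws; a hypothetical helper `r2_draws_flat_prior`; and a classical PRESS-based leave-one-out R^2 for OLS as a cheap stand-in for a proper LOO-R^2 (it is not `brms::loo_R2()`).

```r
# Sketch only: exact posterior draws of the residual-based Bayesian R^2 for a
# Gaussian linear model under the prior p(beta, sigma^2) proportional to 1/sigma^2.
r2_draws_flat_prior <- function(X, y, ndraws = 2000) {
  n <- nrow(X)
  p <- ncol(X)
  beta_hat <- lm.fit(X, y)$coefficients
  s2 <- sum((y - X %*% beta_hat)^2) / (n - p)
  # sigma^2 | y is scaled inverse chi-square with n - p degrees of freedom
  sigma2 <- (n - p) * s2 / rchisq(ndraws, df = n - p)
  # beta | sigma^2, y is N(beta_hat, sigma^2 * (X'X)^-1), via the Cholesky factor
  R <- chol(crossprod(X))
  z <- matrix(rnorm(p * ndraws), nrow = p)
  beta <- beta_hat + backsolve(R, z) * rep(sqrt(sigma2), each = p)
  ypred <- t(X %*% beta)                 # draws in rows, observations in columns
  var_fit <- apply(ypred, 1, var)
  var_res <- apply(sweep(ypred, 2, y), 1, var)
  var_fit / (var_fit + var_res)
}

set.seed(1)
n <- 100
x1 <- rnorm(n, 30, 1)
y <- 2 * x1 + rnorm(n, sd = 0.1)

# (a) One covariate observed with little or a lot of added noise, as in the question
r2_low  <- r2_draws_flat_prior(cbind(1, x1 + rnorm(n, sd = 0.1)), y)
r2_high <- r2_draws_flat_prior(cbind(1, x1 + rnorm(n, sd = 1)), y)

# (b) Intercept plus 89 pure-noise covariates, response unrelated to all of them
X_noise <- cbind(1, matrix(rnorm(n * 89), nrow = n))
y_noise <- rnorm(n)
r2_noise <- r2_draws_flat_prior(X_noise, y_noise)

# PRESS-based leave-one-out R^2 for the OLS fit in (b)
h <- rowSums(qr.Q(qr(X_noise))^2)        # leverages (diagonal of the hat matrix)
e <- y_noise - X_noise %*% lm.fit(X_noise, y_noise)$coefficients
loo_r2_noise <- 1 - sum((e / (1 - h))^2) / sum((y_noise - mean(y_noise))^2)

summary_tab <- rbind(
  low_noise_x  = c(mean = mean(r2_low),   sd = sd(r2_low)),
  high_noise_x = c(mean = mean(r2_high),  sd = sd(r2_high)),
  noise_covs   = c(mean = mean(r2_noise), sd = sd(r2_noise))
)
print(round(summary_tab, 4))
print(loo_r2_noise)
```

In case (a) one would expect the higher R^2 to come with the narrower posterior. In case (b) the in-sample R^2 can look substantial even though the covariates carry no information, while the leave-one-out value should expose the overfitting.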
