R-squared from rstan (and not rstanarm)

Hi,

Which is the quickest/easiest way to find the R-squared from rstan (and not rstanarm)

Thanks

1 Like

Hi Elef!

There are different flavors of Bayesian-R^2. You might want to check out this. The residual based R^2 is also described here. From these sources you get the function, that works with rstanarm.

For the residal version:

bayes_R2_res <- function(fit) {
  y <- rstanarm::get_y(fit)
  ypred <- rstanarm::posterior_linpred(fit, transform = TRUE)
  if (family(fit)$family == "binomial" && NCOL(y) == 2) {
    trials <- rowSums(y)
    y <- y[, 1]
    ypred <- ypred %*% diag(trials)
  }
  e <- -1 * sweep(ypred, 2, y)
  var_ypred <- apply(ypred, 1, var)
  var_e <- apply(e, 1, var)
  var_ypred / (var_ypred + var_e)
}

And for the modeled (approximate) variance version:

bayes_R2 <- function(fit) {
  mupred <- rstanarm::posterior_linpred(fit, transform = TRUE)
  var_mupred <- apply(mupred, 1, var)
  if (family(fit)$family == "binomial" && NCOL(y) == 1) {
      sigma2 <- apply(mupred*(1-mupred), 1, mean)
  } else {
      sigma2 <- as.matrix(fit, pars = c("sigma"))^2
  }
  var_mupred / (var_mupred + sigma2)
}

You can work from there. It’s easiest if your Stan model has the linear predictor mu in either the transformed parameters or the generated quantities block, so you can easily extract this. Then do something like (for the second case - modeled variance):

bayes_R2 <- function(mu, sigma) {
  var_mu <- apply(mu, 1, var)
   sigma2 <- sigma^2
  var_mu / (var_mu + sigma2)
}

Note that this is for a linear regression model.

3 Likes

Hi Mat_Mantei,

Thanks for the response and the resources. Unfortunately, I can’t use the first two suggested functions because the object I have is of class stanfit therefore I can’t use the functions from rstanarm. Since I am new with Bayesian analysis and Stan maybe is better to post the simple example I am working with.

library(rstan)
library(rethinking)
library(dplyr)

data(Howell1)
dat1 <- Howell1
d2 <- dat1[ dat1$age >= 18 , ]

plot( d2$height ~ d2$weight )
lm(d2$height ~ d2$weight) %>% summary

model_stan2 <- '
data{
 int <lower=1> N;
 vector[N] height;
 vector[N] weight;
}

parameters{
 real alpha;
 real beta;
 real<lower=0> sigma;
}

transformed parameters{
 vector[N] height_hat;
 height_hat = alpha + beta * weight; 
}

model{
 vector[N] mu;

// Priors
 alpha ~ normal(178, 20); 
 beta ~ lognormal(0,1); 
 sigma ~ exponential(1);  

 mu = alpha + beta*(weight);

// Likelihood
 height ~ normal( mu, sigma);
}
'

stan_samples2 <- stan(model_code = model_stan2, model_name='Model height & weight',
	data = list(
		N= nrow(d2),
		height=d2$height, 
		weight=d2$weight
		)
	)

And then by using your last function

bayes_R2 <- function(mu, sigma) {
  var_mu <- apply(mu, 1, var)
   sigma2 <- sigma^2
  var_mu / (var_mu + sigma2)
}

bayes_R2(ext_pred2$height_hat , ext_pred2$sigma   ) %>% mean

I got a figure which looks reasonable