Hi,
Which is the quickest/easiest way to find the R-squared from rstan (and not rstanarm)
Thanks
Hi,
Which is the quickest/easiest way to find the R-squared from rstan (and not rstanarm)
Thanks
Hi Elef!
There are different flavors of Bayesian-R^2. You might want to check out this. The residual based R^2 is also described here. From these sources you get the function, that works with rstanarm
.
For the residal version:
bayes_R2_res <- function(fit) {
y <- rstanarm::get_y(fit)
ypred <- rstanarm::posterior_linpred(fit, transform = TRUE)
if (family(fit)$family == "binomial" && NCOL(y) == 2) {
trials <- rowSums(y)
y <- y[, 1]
ypred <- ypred %*% diag(trials)
}
e <- -1 * sweep(ypred, 2, y)
var_ypred <- apply(ypred, 1, var)
var_e <- apply(e, 1, var)
var_ypred / (var_ypred + var_e)
}
And for the modeled (approximate) variance version:
bayes_R2 <- function(fit) {
mupred <- rstanarm::posterior_linpred(fit, transform = TRUE)
var_mupred <- apply(mupred, 1, var)
if (family(fit)$family == "binomial" && NCOL(y) == 1) {
sigma2 <- apply(mupred*(1-mupred), 1, mean)
} else {
sigma2 <- as.matrix(fit, pars = c("sigma"))^2
}
var_mupred / (var_mupred + sigma2)
}
You can work from there. It’s easiest if your Stan model has the linear predictor mu
in either the transformed parameters or the generated quantities block, so you can easily extract this. Then do something like (for the second case - modeled variance):
bayes_R2 <- function(mu, sigma) {
var_mu <- apply(mu, 1, var)
sigma2 <- sigma^2
var_mu / (var_mu + sigma2)
}
Note that this is for a linear regression model.
Hi Mat_Mantei,
Thanks for the response and the resources. Unfortunately, I can’t use the first two suggested functions because the object I have is of class stanfit therefore I can’t use the functions from rstanarm. Since I am new with Bayesian analysis and Stan maybe is better to post the simple example I am working with.
library(rstan)
library(rethinking)
library(dplyr)
data(Howell1)
dat1 <- Howell1
d2 <- dat1[ dat1$age >= 18 , ]
plot( d2$height ~ d2$weight )
lm(d2$height ~ d2$weight) %>% summary
model_stan2 <- '
data{
int <lower=1> N;
vector[N] height;
vector[N] weight;
}
parameters{
real alpha;
real beta;
real<lower=0> sigma;
}
transformed parameters{
vector[N] height_hat;
height_hat = alpha + beta * weight;
}
model{
vector[N] mu;
// Priors
alpha ~ normal(178, 20);
beta ~ lognormal(0,1);
sigma ~ exponential(1);
mu = alpha + beta*(weight);
// Likelihood
height ~ normal( mu, sigma);
}
'
stan_samples2 <- stan(model_code = model_stan2, model_name='Model height & weight',
data = list(
N= nrow(d2),
height=d2$height,
weight=d2$weight
)
)
And then by using your last function
bayes_R2 <- function(mu, sigma) {
var_mu <- apply(mu, 1, var)
sigma2 <- sigma^2
var_mu / (var_mu + sigma2)
}
bayes_R2(ext_pred2$height_hat , ext_pred2$sigma ) %>% mean
I got a figure which looks reasonable