Hi,
Which is the quickest/easiest way to find the R-squared from rstan (and not rstanarm)
Thanks
Hi Elef!
There are different flavors of Bayesian R^2. You might want to check out this. The residual-based R^2 is also described here. From these sources you get functions that work with rstanarm.
For the residual version:
bayes_R2_res <- function(fit) {
  y <- rstanarm::get_y(fit)
  ypred <- rstanarm::posterior_linpred(fit, transform = TRUE)
  if (family(fit)$family == "binomial" && NCOL(y) == 2) {
    # aggregated binomial: convert success probabilities to expected counts
    trials <- rowSums(y)
    y <- y[, 1]
    ypred <- ypred %*% diag(trials)
  }
  e <- -1 * sweep(ypred, 2, y)       # residuals y - ypred, per draw
  var_ypred <- apply(ypred, 1, var)  # variance of the predictions, per draw
  var_e <- apply(e, 1, var)          # variance of the residuals, per draw
  var_ypred / (var_ypred + var_e)    # one R^2 value per posterior draw
}
And for the modeled (approximate) variance version:
bayes_R2 <- function(fit) {
  y <- rstanarm::get_y(fit)  # needed for the binomial check below
  mupred <- rstanarm::posterior_linpred(fit, transform = TRUE)
  var_mupred <- apply(mupred, 1, var)
  if (family(fit)$family == "binomial" && NCOL(y) == 1) {
    # Bernoulli outcome: the modeled variance is mu * (1 - mu)
    sigma2 <- apply(mupred * (1 - mupred), 1, mean)
  } else {
    sigma2 <- as.matrix(fit, pars = c("sigma"))^2
  }
  var_mupred / (var_mupred + sigma2)  # one R^2 value per posterior draw
}
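A minimal usage sketch (not from the original post; it assumes an rstanarm fit to the built-in mtcars data):
fit <- rstanarm::stan_glm(mpg ~ wt, data = mtcars)
median(bayes_R2_res(fit))  # posterior median of the residual-based R^2
median(bayes_R2(fit))      # posterior median of the modeled-variance R^2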
You can work from there. It’s easiest if your Stan model defines the linear predictor mu
in either the transformed parameters or the generated quantities block, so you can extract it easily. Then do something like this (for the second case, the modeled variance):
bayes_R2 <- function(mu, sigma) {
  # mu: draws-by-observations matrix; sigma: vector of posterior draws
  var_mu <- apply(mu, 1, var)
  sigma2 <- sigma^2
  var_mu / (var_mu + sigma2)
}
Note that this is for a linear regression model.
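For example, with a stanfit object whose Stan program stores the linear predictor as mu (a sketch, assuming those parameter names):
post <- rstan::extract(fit)  # fit is a stanfit object
r2_draws <- bayes_R2(post$mu, post$sigma)
median(r2_draws)             # posterior median of R^2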
Hi Mat_Mantei,
Thanks for the response and the resources. Unfortunately, I can’t use the first two suggested functions because the object I have is of class stanfit, so I can’t use the functions from rstanarm. Since I am new to Bayesian analysis and Stan, maybe it is better to post the simple example I am working with.
library(rstan)
library(rethinking)  # for the Howell1 data
library(dplyr)
data(Howell1)
dat1 <- Howell1
d2 <- dat1[dat1$age >= 18, ]  # adults only
plot(d2$height ~ d2$weight)
lm(d2$height ~ d2$weight) %>% summary  # classical fit for comparison
model_stan2 <- '
data {
  int<lower=1> N;
  vector[N] height;
  vector[N] weight;
}
parameters {
  real alpha;
  real beta;
  real<lower=0> sigma;
}
transformed parameters {
  vector[N] height_hat;  // linear predictor, kept here so it can be extracted
  height_hat = alpha + beta * weight;
}
model {
  vector[N] mu;
  // Priors
  alpha ~ normal(178, 20);
  beta ~ lognormal(0, 1);
  sigma ~ exponential(1);
  mu = alpha + beta * weight;
  // Likelihood
  height ~ normal(mu, sigma);
}
'
stan_samples2 <- stan(
  model_code = model_stan2,
  model_name = 'Model height & weight',
  data = list(
    N = nrow(d2),
    height = d2$height,
    weight = d2$weight
  )
)
And then, after extracting the draws, by using your last function:
ext_pred2 <- rstan::extract(stan_samples2)
bayes_R2 <- function(mu, sigma) {
  var_mu <- apply(mu, 1, var)
  sigma2 <- sigma^2
  var_mu / (var_mu + sigma2)
}
bayes_R2(ext_pred2$height_hat, ext_pred2$sigma) %>% mean
I got a value that looks reasonable.
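(As a rough sanity check, one could compare against the classical R^2 from the lm fit above; the two should be in the same ballpark for this simple model:)
summary(lm(height ~ weight, data = d2))$r.squared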
I’m wondering if there is a general solution to this problem for larger Stan models. I think the following might work, but I just wanted to post it in case it’s useful (or perhaps not a good approach!) for others trying to calculate R-squared from a Stan model.
In the model below I added a generated quantities block with two additional quantities. One is the posterior predictions, and the other is the expected value of the posterior predictions (i.e., like posterior_epred). I calculated the latter with a user-defined function.
model_stan2 <- '
functions {
  real linear_predictor(real alpha, real beta, real x) {
    real out;
    out = alpha + beta * x;
    return out;
  }
}
data {
  int<lower=1> N;
  vector[N] height;
  vector[N] weight;
}
parameters {
  real alpha;
  real beta;
  real<lower=0> sigma;
}
transformed parameters {
  vector[N] height_hat;
  height_hat = alpha + beta * weight;
}
model {
  vector[N] mu;
  // Priors
  alpha ~ normal(178, 20);
  beta ~ lognormal(0, 1);
  sigma ~ exponential(1);
  mu = alpha + beta * weight;
  // Likelihood
  height ~ normal(mu, sigma);
}
generated quantities {
  vector[N] y_pred_no_variance;  // expected value of the posterior predictions
  vector[N] y_pred_w_variance;   // posterior predictions with sampling variation
  for (i in 1:N) {
    y_pred_no_variance[i] = linear_predictor(alpha, beta, weight[i]);
    y_pred_w_variance[i] = normal_rng(alpha + beta * weight[i], sigma);
  }
}
'
I then just followed code similar to what was cited above.
y_pred_solved <- rstan::extract(stan_samples2)$y_pred_no_variance
residuals_y <- rstan::extract(stan_samples2)$y_pred_w_variance - y_pred_solved
var_residuals <- apply(residuals_y, 1, var)   # residual variance, per draw
var_pred_out <- apply(y_pred_solved, 1, var)  # predictor variance, per draw
hist(var_pred_out / (var_pred_out + var_residuals))  # posterior of R^2
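Beyond the histogram, one might also summarize the posterior of R^2 numerically, e.g.:
r2_draws <- var_pred_out / (var_pred_out + var_residuals)
quantile(r2_draws, c(0.05, 0.5, 0.95))  # posterior median and 90% interval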
This was so helpful, it really helps with large models! Just one question: what is the use of y_pred_w_variance? Sorry if it is a simple question, but I’m new to Stan.
When we do posterior predictive inference, there are two forms of uncertainty, which you can see in the formula
$$p(y^\text{new} \mid y^\text{obs}) = \int_{\mathbb{R}^N} p(y^\text{new} \mid \theta) \cdot p(\theta \mid y^\text{obs}) \, \textrm{d}\theta.$$
If we want to draw a new $y^\text{new}$, we need to grab a parameter draw $\theta^{(n)} \sim p(\theta \mid y^\text{obs})$, the sampling variation of which is our estimation uncertainty. Then, given $\theta^{(n)}$, we take a draw from the sampling distribution, $y^\text{new} \sim p(y \mid \theta^{(n)})$.
The first line, y_pred_no_variance, gives you only the estimation uncertainty in the linear predictor. The second line, y_pred_w_variance, adds in the sampling variation. If you want to get the posterior for $y^\text{new}$ in the posterior predictive distribution $p(y^\text{new} \mid y^\text{obs})$, then you need both kinds of uncertainty. This will give you the posterior mean and the correct posterior variance. In this particular case of a linear regression, we know that the posterior mean of $y^\text{new}$ is the posterior mean of $\alpha + \beta \cdot x^\text{new}$, so we only need the first kind of uncertainty if all we care about is the posterior mean.
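To make this concrete in R (a sketch, assuming the draws from the model above have been extracted; x_new is a hypothetical new weight):
post <- rstan::extract(stan_samples2)
x_new <- 50
mu_new <- post$alpha + post$beta * x_new            # estimation uncertainty only
y_new <- rnorm(length(mu_new), mu_new, post$sigma)  # adds sampling variation
c(var(mu_new), var(y_new))                          # y_new has the larger variance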