Manually computing R^2 and receiving negative output

Hello, I am using brms.predict on a brms model to get a set of predicted y values using the model weights. I then want to see how much variation in the original y labels can be explained by the predicted y labels. However, I can’t use the bayes_R2 function because the output of brms.predict is a vector of estimates, not a brmsfit object. Therefore, I am calculating the coefficient of determination manually in the following way:

SSresidual <- sum((originalY - predictedY)^2, na.rm = TRUE)
SStotal <- sum((originalY - mean(originalY, na.rm = TRUE))^2, na.rm = TRUE)
my_R2 <- 1 - (SSresidual / SStotal)

My next step is to use a bootstrapping approach to see how accurate my prediction is compared to noise. I compute a series of R^2 values in the same way as above, but with randomly permuted Y labels as predictedY.
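A minimal sketch of this permutation step, to make the procedure concrete. The simulated originalY, the helper name classical_r2, and the number of permutations n_perm are assumptions for illustration and are not part of the original script.

```r
# Simulated stand-in for the real outcome vector (assumption for illustration)
set.seed(1)
originalY <- rnorm(200, mean = 10, sd = 2)

# Classical R^2, using the same formula as above (hypothetical helper name)
classical_r2 <- function(y, pred) {
  ss_res <- sum((y - pred)^2, na.rm = TRUE)
  ss_tot <- sum((y - mean(y, na.rm = TRUE))^2, na.rm = TRUE)
  1 - ss_res / ss_tot
}

# Null distribution: each "prediction" is a random permutation of the labels
n_perm <- 1000
null_r2 <- replicate(n_perm, classical_r2(originalY, sample(originalY)))

summary(null_r2)
```

The observed R^2 from the real predictions can then be compared against the values in null_r2, for example by checking which fraction of them it exceeds.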

The problem is, when I randomly permute the original Y labels to form a ‘prediction’ and manually calculate R^2 as shown above, I often get negative R^2 values. I am wondering whether the above approach is sensible, and if so why my approach yields negative R^2 and what to do about it. Thanks, I appreciate any advice.

Why don’t you use the function on your model fit object? bayes_R2(your_model)

My understanding of the bayes_R2 function is that it outputs the proportion of variation of originalY that is explained by the model predictors in the brmsfit object – however, what I want to compute is the proportion of variation in originalY that is explained by predictedY, which is not part of a brmsfit object. So I’ve computed R^2 manually, but I’m not sure why the output is sometimes negative.

I see, sorry for my misunderstanding.

I see. You are using the formula $R^2 = 1 - \frac{SS_{res}}{SS_{total}}$, which is the classical way to calculate R^2 for a regression problem. However, brms uses the Bayesian $R^2 = \frac{var(y^{prd})}{var(y^{prd}) + var(res)}$, which in R is:

var(predictedY) / ( var(predictedY)+ var(originalY - predictedY))
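To see how the two formulas behave on a pure-noise "prediction", here is a small sketch under stated assumptions: originalY is simulated, predictedY is just a random permutation of it, and the variance-ratio formula is applied to a single vector of point predictions rather than per posterior draw as bayes_R2 does on a brmsfit object.

```r
set.seed(42)
originalY  <- rnorm(500)          # simulated outcome (assumption)
predictedY <- sample(originalY)   # permuted labels as a noise "prediction"

res <- originalY - predictedY

# Classical R^2: goes negative whenever the predictions fit worse
# than simply predicting mean(originalY) for every observation
classical_R2 <- 1 - sum(res^2) / sum((originalY - mean(originalY))^2)

# Variance-ratio form from the line above: a ratio of non-negative
# variances, so it always lies between 0 and 1
variance_R2 <- var(predictedY) / (var(predictedY) + var(res))

c(classical = classical_R2, variance_ratio = variance_R2)
```

For permuted labels, the residual sum of squares is on average twice the total sum of squares, so the classical value centres on -1, while the variance-ratio value sits around 1/3. A positive variance-ratio value for noise is therefore expected, and it is the comparison against the permutation distribution, not the sign, that carries the information.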

Reference: R-squared for Bayesian Regression Models


Thank you for your quick response and reference - I have re-run the script using the Bayesian R^2 equation and I no longer get negative R^2 values. Much appreciated!
