Linear model assumptions check

Elef · April 28, 2020, 11:48am

Hi,

In the frequentist framework, anyone who fits a linear model has to check the model assumptions, and this can be done easily in R with the plot function,
e.g.

plot( lm(y ~ x))

From that plot you can get a flavour of heteroscedasticity, for example. As simple as that.

Is there a simple way to perform a similar check with rstan, equivalent to plot()?

Thanks

bbbales2 · April 28, 2020, 8:03pm

I don’t think there’s anything automatic with rstan. There probably is something in rstanarm and brms, though.

I think you’d compute the residual in a generated quantities block. Assuming that is called residual, then you could pull that out in R with:

residual <- as.matrix(fit_simple, pars = "residual")

That would give you a matrix of draws of the residuals (one row per posterior draw, one column per observation), which you could plot however you like.
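As a rough sketch of what "plot that however" could look like: the code below uses base R only and builds a stand-in for the draws matrix, since it has no fitted Stan model to pull from. The simulated data, the fake draws of the intercept and slope, and all variable names other than residual are assumptions for illustration. With a real fit you would replace the simulated matrices with the output of as.matrix().

```r
# Stand-in for as.matrix(fit_simple, pars = "residual"):
# rows are posterior draws, columns are observations.
set.seed(1)
n_obs   <- 50
n_draws <- 200
x <- runif(n_obs, 0, 10)
y <- 1 + 2 * x + rnorm(n_obs, sd = 0.5 + 0.3 * x)  # noise grows with x

# Fake "posterior draws" of intercept and slope (assumption: in practice
# these come from the Stan fit, not from rnorm()).
alpha <- rnorm(n_draws, mean = 1, sd = 0.2)
beta  <- rnorm(n_draws, mean = 2, sd = 0.05)

# fitted_draws[s, i] = alpha[s] + beta[s] * x[i]
fitted_draws <- outer(alpha, rep(1, n_obs)) + outer(beta, x)
residual <- matrix(y, nrow = n_draws, ncol = n_obs, byrow = TRUE) - fitted_draws

# Summarise over draws: posterior mean and 95% interval per observation
fitted_mean   <- colMeans(fitted_draws)
residual_mean <- colMeans(residual)
residual_q    <- apply(residual, 2, quantile, probs = c(0.025, 0.975))

# Residuals vs fitted, with the interval for each residual in grey
plot(fitted_mean, residual_mean,
     ylim = range(residual_q),
     xlab = "Fitted (posterior mean)", ylab = "Residual (posterior mean)")
segments(fitted_mean, residual_q[1, ], fitted_mean, residual_q[2, ], col = "grey")
abline(h = 0, lty = 2)
```

A fan shape in this plot, with residuals spreading out as the fitted values grow, is the same hint of heteroscedasticity you would look for in the first panel of plot(lm(y ~ x)).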

bayesplot does some of this. Check the bottom example here: https://mc-stan.org/bayesplot/

Gelman had a post on assumptions of linear regression here you might find interesting: https://statmodeling.stat.columbia.edu/2013/08/04/19470/ .

abartonicek · April 28, 2020, 10:01pm

With brms, you can easily get the fitted/predicted values with fitted() and the residuals with residuals(). After that, it’s easy to make the predicted vs residuals plot using either the base R plot() function that you suggested, or the qplot() function from ggplot2 (part of the tidyverse).

The only thing you have to be wary of is that the objects returned by fitted() and residuals() on a brmsfit object are matrices with multiple columns: Estimate (the posterior mean, i.e. the point estimate), Est.Error (the posterior standard deviation), Q2.5 and Q97.5 (the 2.5% and 97.5% quantiles). For the fitted vs residuals plot, you only need the point estimates, i.e. the first column.

Therefore, the full code for the fitted vs residuals plot would look something like this:

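library(ggplot2)  # provides qplot()

# First column ("Estimate") holds the posterior mean for each observation
point_preds <- fitted(your_model)[, 1]
point_errs <- residuals(your_model)[, 1]

qplot(point_preds, point_errs)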

abartonicek · April 28, 2020, 10:23pm

In addition, as far as I understand, posterior predictive checks (PPCs) in the Bayesian framework fill a similar role to checking assumptions in the frequentist framework (or maybe it would be better to say that PPCs augment the checking of assumptions). For example, visualizing whether data simulated from the posterior predictive distribution match the distribution of your observed data is a rough check that your model isn’t horribly misspecified (e.g. by predicting a normally distributed response when the data actually follow a heavily skewed negative binomial). So in a way, PPCs are like checking assumptions. An imperfect PPC doesn’t always mean your model breaks down, though. Often it suggests ways of improving or expanding the model (which checking assumptions can do as well).

There are a lot of options for different kinds of posterior predictive checks with the pp_check() function (see https://cran.r-project.org/web/packages/bayesplot/vignettes/graphical-ppcs.html).
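To make the idea concrete, here is a hand-rolled density overlay in base R, roughly the kind of picture a graphical PPC draws. It is only a sketch under stated assumptions: the data are simulated, the "posterior draws" of mu and sigma are faked with rnorm() instead of coming from a fitted model, and the names y, yrep, mu and sigma are chosen for illustration.

```r
set.seed(2)
n_obs   <- 100
n_draws <- 50

# Observed data: skewed, non-negative counts
y <- rnbinom(n_obs, size = 1, mu = 5)

# Assumption: a normal model was fitted to y. These fake draws of its mean
# and sd stand in for draws you would extract from a real fit.
mu    <- rnorm(n_draws, mean(y), sd(y) / sqrt(n_obs))
sigma <- abs(rnorm(n_draws, sd(y), 0.2))

# yrep[s, ] is one data set simulated from the model using draw s
yrep <- t(sapply(seq_len(n_draws),
                 function(s) rnorm(n_obs, mu[s], sigma[s])))

# Overlay: replicated data sets in grey, observed data in black
plot(density(y), lwd = 2, xlim = range(y, yrep),
     main = "Observed (black) vs replicated (grey)")
for (s in seq_len(n_draws)) lines(density(yrep[s, ]), col = "grey")
lines(density(y), lwd = 2)

# A simple numeric check: share of simulated values below zero,
# which the observed counts can never be
mean(yrep < 0)
```

Here the grey curves are symmetric and put visible mass below zero, while the observed data are skewed and non-negative. That is the kind of mismatch that would point towards a count likelihood rather than a normal one.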
