Visualizing predictive uncertainty

In RaOS, figure 11.1 shows a nice way to plot inferential uncertainty of a model, as I understand it.

I’m searching for a similarly easy way to plot predictive uncertainty, both in the given range of the predictor (momiq in this case) and extrapolated (for a mom with an IQ above the 140 in the data).

You can reproduce fig. 11.1 with these lines from

#' ## Displaying uncertainty in the fitted regression

#' #### A single continuous predictor 
theme_set(bayesplot::theme_default(base_family = "sans"))

#' #### Load children's test scores data
kidiq <- read.csv(root("KidIQ/data","kidiq.csv"))

sims_2 <- as.matrix(fit_2)
n_sims_2 <- nrow(sims_2)
subset <- sample(n_sims_2, 10)
plot(kidiq$mom_iq, kidiq$kid_score,
     xlab="Mother IQ score", ylab="Child test score")
for (i in subset){
  abline(sims_2[i,1], sims_2[i,2], col="gray")
abline(coef(fit_2)[1], coef(fit_2)[2], col="black")

There’s also a ggplot2 version following that in the linked code.

I figure there’s probably an easy, idiomatic approach, but I haven’t found it yet.

Operating System: Linux, Windows
Interface Version: rstanarm

I knew this would be an embarrassingly easy question. I’m now reading chapter 9–especially section 9.2.

1 Like

Feel free to post what you came up with for others that come after you :)

Mike, sure.

I see two pieces to the puzzle: sampling the desired distribution, and then producing the visualization.

The first piece: If you have a copy of RaOS, see section 9.2.

  • posterior_linpred(fit_2) or posterior_epred(fit_2) give the linear predictor with uncertainty–IOW, the inferential estimate.

  • posterior_predict(fit_2) produces a sample showing the full predictive uncertainty.

All three function calls take a newdata = argument, if you want to try different predictors.

The second piece starts out easy, too. Here’s a graph of the predictive uncertainty for the fit_2 model:

 ppc_intervals(y = kidiq$kid_score, yrep = posterior_predict(fit_2), x = kidiq$mom_iq, prob = 0.5, prob_outer = 0.9) 

There is a third piece: visualizing the model with new data, either inside or outside the original domain. ppc_intervals() wants x, y, and yrep to be commensurate (the lengths of the two vectors must both equal the number of columns of yrep.

What I haven’t found yet is a function similar to ppc_intervals() that, say, has shows the prediction for new values of x. Any suggestions?