Posterior predictive plots for regression model without pooling

I would like to try out a few posterior predictive plots discussed in the paper “Visualization in Bayesian workflow” by Gabry et al. However, I’m not so sure how the simulated data for Model 1 (regression without pooling) were generated for those plots. Any R code to share? Thanks!

I think Jonah is making this into a case study.

Which plots are you looking for? The data is real (PM2.5 across ~3k cities). Are you talking about the prior simulation?

I would like to know how the data were generated and plotted for Figures 6a, 7a, 8a and 9a in the paper. Thanks!

I think this is everything you need (except obviously the data etc)

Edit to note: The posterior predictive simulations were done using the posterior_predict function.

library(ggplot2)
library(dplyr)
library(rstanarm)
library(bayesplot)
theme_set(bayesplot::theme_default())
library(loo)

mod1 <- stan_glm(log_pm25 ~ log_sat_2014, data=GMs@data)
loo1 <- loo(mod1)
yrep1 <- posterior_predict(mod1, draws = 400)
color_scheme_set("blue")
(hist1 <- ppc_hist(y, yrep1[1:5, ]))
ggsave(filename = "../plots/ppc_hist1.png")

ppc_dens_overlay(y, yrep1[1:100, ]) + 
  coord_cartesian(ylim = c(0, 0.7), xlim = c(0, 6)) +
  legend_none()
ggsave(filename = "../plots/ppc_dens1.png")


#stat:skew
skew <- psych::skew

color_scheme_set("blue")
ppc_stat(y, yrep1, stat = "skew", binwidth = 0.01) + 
  xlim(0, .6) + 
  legend_none()
ggsave(filename = "../plots/ppc_skew1.png")

superregion <-
  factor(GMs@data$super_region_name2 , 
         levels = c("High income", levels(superregion)[-2]))

color_scheme_set("blue")
ppc_stat_grouped(y, yrep1, group = superregion, 
                 stat = "median", binwidth = 0.01) + 
  facet_text(size = rel(0.75)) + 
  xaxis_text(FALSE) + 
  legend_none()
ggsave(filename = "../plots/ppc_med_grouped1.png")

# LOO-PIT
psis1 <- psislw(-log_lik(mod1))
pit1 <- rstantools::loo_pit(object = posterior_predict(mod1), y = y, lw = psis1$lw_smooth)
unifs <- matrix(runif(length(pit1) * 100), nrow = 100)
color_scheme_set("blue")
ppc_dens_overlay(pit1, unifs) + 
  legend_none() + 
  coord_cartesian(ylim = c(0, 2), xlim = c(0.1, 0.9))
ggsave(filename = "../plots/ppc_loo_pit_overlay1.png")
1 Like

Thanks a lot! I really appreciate it!

Just saw this old topic. In case anyone sees this going forward the code from the Visualization in Bayesian Workflow paper is all now available on GitHub at

Thanks for the note, Jonah! I’ll take a look at the code soon…

Thanks,

Gang