Z-score posterior predictive check plot

Our likelihood is y ~ normal(y_mean, y_sigma);

We want a graphical posterior predictive check with test quantity T(y, \theta) = \text{density}(z(y, \theta)) where z(y, \theta) = \frac{y - \texttt{y_mean}}{\texttt{y_sigma}}. Since the test quantity involves \theta, the test quantity needs S simulated density lines (see BDA3 p.147-148).

Given the model, we know T(y^\text{rep}, \theta) = \text{density}(z(y^\text{rep}, \theta)) \sim N(0,1). So we only need one density line.

So I use ppc_dens_overlay() backwards because of dimensions, but I will need to adjust colors and legend:

y_mat = matrix(rep(y,each=sims),nrow=sims)
# create z-scores, grab the first 20 simulations for display:
T_y_theta = ((y_mat - fit.stan$y_mean)/fit.stan$y_sigma)[1:20, ]
T_yrep_theta = rnorm(n = length(y), mean = 0, sd = 1)
ppc_dens_overlay(y    = T_yrep_theta, 
                 yrep = T_y_theta)

I could instead do this instead directly in ggplot, but wanted to ask if there is a nice bayesplot way to do this ?

Thank you !

1 Like

That’s an interesting situation. I hadn’t anticipated that but it makes sense. In terms of editing the colors and legend labels, I think you can do what you need using the internal scale_color_ppc_dist() helper function in bayesplot that we haven’t yet exposed to users (so you have to prefix it with a triple colon like in the code below). Does this work?

ppc_dens_overlay(y = T_yrep_theta, yrep = T_y_theta) + 
  bayesplot:::scale_color_ppc_dist(values = c("red", "black"), labels = c("y", "yrep"))

You can change values to whatever colors you want and labels to whatever names you want. The order of the names in labels can be changed too, which you probably need in your case because you’re using the function backwards.

1 Like

Yes, works great !

Thanks for humoring this somewhat-weird usage. :)

1 Like

Awesome, glad that works!