I’ve read some of the similar posts on computing the posterior predictive p-value (here), but I wanted to know whether there’s a way to plot this value in a grouped-data setting, for example using ppc_stat_grouped or pp_check with a group variable.
Welcome to the Stan forums. Here’s an example of how you can do this with grouping. I’m using the median, but you can use whatever statistic you want.
library(bayesplot)
library(ggplot2)
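# example data that ships with bayesplot
y <- example_y_data()
yrep <- example_yrep_draws()
# alternating two-level grouping, just for illustration;
# length.out also works when length(y) is odd
group <- rep(c("A", "B"), length.out = length(y))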
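# observed statistic per group
stat_y <- tapply(y, group, median)
# statistic per group for each posterior draw (rows = draws, columns = groups)
stat_yrep <- t(apply(yrep, 1, function(row) tapply(row, group, median)))
# per-group p-value: proportion of draws whose statistic exceeds the observed one
p_vals <- colMeans(stat_yrep > matrix(stat_y,
                                      nrow = nrow(stat_yrep),
                                      ncol = length(stat_y),
                                      byrow = TRUE))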
# labels to pass to geom_text(); the group column matches each label to its facet
# you can change these x,y coordinates to whatever you want;
# using Inf here plus hjust,vjust below puts the labels in the top-right corner
label_df <- data.frame(
  group = names(p_vals),
  x = Inf,
  y = Inf,
  label = paste("p =", round(p_vals, 3))
)
ppc_stat_grouped(y, yrep, group = group, stat = "median") +
  geom_text(data = label_df,
            aes(x, y, label = label),
            hjust = 1.1, vjust = 1.1)
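That makes a faceted plot with one panel per group and each group’s p-value printed in the top-right corner of its panel.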
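If you want to swap in a different statistic without rewriting the tapply/matrix steps, something like the following might help. It’s only a rough sketch in base R: grouped_ppp is a hypothetical helper name (not a bayesplot function), and the data are simulated purely so the snippet runs on its own. It assumes yrep has one row per draw and one column per observation, as in the example above.

```r
# Hypothetical helper (not part of bayesplot): per-group posterior
# predictive p-values, i.e. the share of draws whose statistic
# exceeds the observed statistic within each group.
grouped_ppp <- function(y, yrep, group, stat = median) {
  stopifnot(length(y) == ncol(yrep), length(y) == length(group))
  idx <- split(seq_along(y), group)  # observation indices for each group
  vapply(idx, function(i) {
    t_y <- stat(y[i])                                  # observed statistic
    t_yrep <- apply(yrep[, i, drop = FALSE], 1, stat)  # one value per draw
    mean(t_yrep > t_y)
  }, numeric(1))
}

# Simulated data, for illustration only
set.seed(1)
y <- rnorm(100)
yrep <- matrix(rnorm(200 * 100), nrow = 200)  # 200 draws x 100 observations
group <- rep(c("A", "B"), length.out = length(y))

p_vals <- grouped_ppp(y, yrep, group)               # median by default
p_vals_sd <- grouped_ppp(y, yrep, group, stat = sd) # any function of a numeric vector
```

The result is a named vector with one entry per group, so it can go straight into the label data frame in place of p_vals.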