I’m fitting a negative binomial model with rstanarm::stan_glmer(), and I’m trying to figure out a useful way to display the posterior predictive checks (PPCs). The challenge is that the data are highly skewed (lots of zeros and ones) with a very long tail, so, for example, the output of bayesplot::ppc_dens_overlay() looks like this:
That’s not super useful. I also looked at the PPCs for discrete data, which are useful, but I think they solve a different problem.
Essentially, my question is: for an example like this, what is a meaningful way to plot the posterior predictive samples so that they can actually be interpreted? My only two thoughts are 1) split the plot into a couple of panels covering different x-axis ranges, so that each panel has comparable y-axis values, or 2) log-transform the values, but that feels wrong for reasons I can’t quite pin down.
Any thoughts welcome!
Here’s a working example, if that’s useful:
# Simulate highly skewed counts: mostly zeros and ones, with a long right tail
data <- rnbinom(25000, size = 0.099, mu = 0.34)
df <- data.frame(
  y = data,
  fixed = sample(as.factor(c(1:25)), 25000, replace = TRUE),
  random1 = sample(as.factor(c(1:25)), 25000, replace = TRUE),
  random2 = sample(as.factor(c(1:55)), 25000, replace = TRUE)
)
# Negative binomial model with one fixed factor and two random intercepts
model <- rstanarm::stan_glmer(
  y ~ fixed + (1 | random1) + (1 | random2),
  data = df,
  family = rstanarm::neg_binomial_2(link = "log")
)
# Overlay the observed density on 500 posterior predictive draws
bayesplot::ppc_dens_overlay(df$y, rstanarm::posterior_predict(model, draws = 500))
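For concreteness, here is a rough sketch of the two ideas I mentioned (restricting the x-axis range and log-transforming), plus two discrete-friendly views I could imagine trying. To keep it quick to run, it does not use the fitted model: y and yrep are simulated stand-ins with the same shape that posterior_predict() returns (draws by observations), and prop_zero is a helper name I made up for illustration. I’m not sure any of these is the right answer.

```r
library(bayesplot)
library(ggplot2)

set.seed(1)

# Stand-in data with the same marginal distribution as the example above.
y <- rnbinom(2000, size = 0.099, mu = 0.34)

# Assumption: stand-in for rstanarm::posterior_predict(model, draws = 50),
# i.e. a draws x observations matrix, simulated directly so no fit is needed.
yrep <- matrix(
  rnbinom(50 * length(y), size = 0.099, mu = 0.34),
  nrow = 50
)

# Idea 1: zoom in on the bulk of the data. coord_cartesian() only changes
# the visible range; the tail is still used when estimating the densities.
p_zoom <- ppc_dens_overlay(y, yrep) +
  coord_cartesian(xlim = c(0, 10))

# Idea 2: transform both y and yrep the same way. log1p() is log(1 + x),
# so the zeros stay finite.
p_log <- ppc_dens_overlay(log1p(y), log1p(yrep)) +
  xlab("log(1 + y)")

# Alternative: empirical CDFs, treated as step functions for count data.
p_ecdf <- ppc_ecdf_overlay(y, yrep, discrete = TRUE) +
  coord_cartesian(xlim = c(0, 10))

# Alternative: compare a summary that matters here, the share of zeros.
# prop_zero is a hypothetical helper, not part of bayesplot.
prop_zero <- function(x) mean(x == 0)
p_zero <- ppc_stat(y, yrep, stat = prop_zero)
```

With the real model, y would be df$y and yrep would be rstanarm::posterior_predict(model, draws = 500); printing any of the p_* objects draws the plot.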