Hello, I would like to better understand the difference between posterior_predict and posterior_linpred, so I was trying to reproduce the results of the first from the second. Of course, I am not able to.
In my case the model is a negative binomial spline model with offsets. If I compare the medians of posterior_linpred with transform = TRUE against those of posterior_predict (with the same offset vector), I find that, on average, the predictions from _predict are about 0.93 times those from _linpred.
posterior_linpred() gives you posterior draws of \eta.
posterior_linpred(transform=TRUE), or equivalently posterior_epred(), gives you posterior draws of \lambda, that is, it does the inverse link transformation for you.
posterior_predict() gives you draws from {\rm NegBinom}(\lambda, \phi), that is, it uses a NegBinom random number generator and includes the dispersion information in \phi.
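To make the chain of quantities concrete, here is a minimal self-contained sketch in base R. The numbers are invented stand-ins for posterior draws of a single observation (no rstanarm fit is involved), chosen only to illustrate the roles of \eta, \lambda, and the predictive draws:

```r
set.seed(1)

# Made-up stand-ins for 4000 posterior draws for one observation
eta    <- rnorm(4000, mean = 3, sd = 0.1)        # linear predictor (log scale), like posterior_linpred()
lambda <- exp(eta)                               # inverse link, like posterior_linpred(transform = TRUE)
phi    <- 2                                      # dispersion parameter

# Predictive draws add NegBinom noise around each lambda, like posterior_predict()
yrep <- rnbinom(4000, mu = lambda, size = phi)

# The predictive draws are centered on lambda but much more dispersed
c(mean(lambda), mean(yrep))
c(sd(lambda), sd(yrep))
```

The extra dispersion in yrep relative to lambda is exactly the \phi contribution that posterior_linpred can never show you.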
What do you get if you compare the means instead of the medians?
Thank you Jonah! That's what I thought, but even when applying the model I can't reproduce the results. Here I extract the median prediction via linpred, via linpred passed through a negative binomial (qnbinom), and via predict:
Hmm, one thing I notice is that in the last two lines it looks like you have mean for posterior_linpred but still median for posterior_predict. Is it closer if you use mean for both?
Also, here’s a quick check I did using a simple example model:
# stan_glm.nb is same as stan_glm with family=neg_binomial_2
fit <- stan_glm.nb(Days ~ Sex/(Age + Eth*Lrn), data = MASS::quine)
# posterior_epred is same as posterior_linpred(transform=T)
plot(colMeans(posterior_epred(fit)), colMeans(posterior_predict(fit)))
abline(a = 0, b = 1, col = "red")
which produces a plot that looks like I would expect:
Indeed, using the mean in both _linpred and _predict gives the same results (after rounding)!
So why do the medians not agree, while the first two methods (median linpred and qnbinom) do?
I think it’s easiest to see why the mean and median behave differently if we focus on a single observation. For example, for the model I posted above, here’s a plot of the posterior distribution of \lambda_i vs the predictive distribution of y^{\rm rep}_i for i=1, i.e., the first column from posterior_epred() vs the first column from posterior_predict(). Because of the shapes of the distributions (the latter including the impact of the dispersion term), they have the same mean but different medians:
fit <- stan_glm.nb(Days ~ Sex/(Age + Eth*Lrn), data = MASS::quine)
lambda_1 <- posterior_epred(fit)[, 1]
yrep_1 <- posterior_predict(fit)[, 1]
make_hist <- function(x) {
  hist(x, main = deparse(substitute(x)))
  abline(v = mean(x), col = "red")
  abline(v = median(x), col = "blue")
  legend("right",
         legend = c(paste("mean =", round(mean(x), 1)),
                    paste("median =", round(median(x), 1))),
         col = c("red", "blue"),
         lty = 1)
}
make_hist(lambda_1)
make_hist(yrep_1)
In both cases the mean is about 18 while the medians differ.
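The same mean-versus-median behavior can be reproduced without fitting anything, using base R only. The gamma draws below are just a made-up stand-in for a posterior on \lambda: by the law of total expectation, E[y^{\rm rep}] = E[E[y^{\rm rep} \mid \lambda]] = E[\lambda], so the means match, but the extra NegBinom dispersion skews the predictive distribution to the right and pulls its median down:

```r
set.seed(42)

# Made-up stand-in for posterior draws of lambda for one observation
lambda <- rgamma(4000, shape = 20, rate = 1)   # mean 20
phi    <- 2                                    # dispersion
yrep   <- rnbinom(4000, mu = lambda, size = phi)

# Means agree (up to Monte Carlo error); medians do not:
c(mean(lambda), mean(yrep))
c(median(lambda), median(yrep))
```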
Does that help explain the reason for what you’re seeing?