posterior_predict() prediction scale/transformation?


#1

For reasons that aren’t relevant here, I am comparing probability-scale predictions on the training data from glm() and stan_glm() logistic regression models. To obtain probability-scale predictions from a stan_glm model, I thought I needed to transform from log-odds to probability via the inverse logit.

However, it seems that no transformation is needed for probability-scale predictions: averaging the draws from posterior_predict() with no transform function already gives the correct probabilities.

I assume I’m missing something obvious, but it would be nice to know why the transformation is not necessary. Or, if that conclusion is wrong, it would be nice to know what is really going on.

Thanks!

Here’s an example of how I’m proceeding:

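library(rstanarm)

# Simulated binary outcome
dat <- data.frame(y = rbinom(500, 1, 0.75))
glm.pred <- predict(glm(y ~ 1, data = dat), type = 'response')

stan.mod <- stan_glm(y ~ 1, data = dat)
# Column means of the predictive draws, without and with the inverse logit
stan.pred.0 <- colMeans(posterior_predict(stan.mod))
stan.pred.1 <- colMeans(posterior_predict(stan.mod, fun = stats::plogis))

# Compare the three sets of predictions side by side
preds <- cbind(glm.pred, stan.pred.0, stan.pred.1)
head(preds)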

#2

The family argument in stan_glm defaults to gaussian, so you are fitting a linear “probability” “model”. If you then apply the standard logistic CDF to its predictions, it sort of yields something close to the right answer.
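As a side check on why the glm() column in the example still looked like a probability, here is a minimal base-R sketch (the seed and the names fit.default and fit.logit are made up for illustration). glm() also defaults to family = gaussian, and in an intercept-only model both the Gaussian fit and the logistic fit reproduce the sample mean of y on the response scale.

```r
set.seed(123)  # arbitrary seed, only for reproducibility
dat <- data.frame(y = rbinom(500, 1, 0.75))

fit.default <- glm(y ~ 1, data = dat)                     # family not given
fit.logit   <- glm(y ~ 1, data = dat, family = binomial)  # logistic regression

family(fit.default)$family  # "gaussian"
family(fit.logit)$family    # "binomial"

# With only an intercept, both fits return mean(y) on the response scale
p.default <- unname(predict(fit.default, type = "response")[1])
p.logit   <- unname(predict(fit.logit, type = "response")[1])
c(sample.mean = mean(dat$y), gaussian = p.default, logit = p.logit)
```

The agreement is specific to the intercept-only case; with predictors, the two families would generally give different fitted values.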

What you should be doing is

stan.mod <- stan_glm(y~1, data=dat, family = binomial)
stan.mu <- posterior_linpred(stan.mod, transform = TRUE)

which generates the posterior distribution of the conditional mean in a logit model. Note that this is not the posterior predictive distribution of the (future) outcomes. That is what posterior_predict generates: a matrix of 0s and 1s that are predictions for the observable outcomes.
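The distinction can be mimicked in base R without fitting anything. In this sketch the posterior draws of the intercept are faked with rnorm() (the values 1.1 and 0.1, the sizes, and all variable names are assumptions for illustration). It only shows the shapes and scales involved and is not how rstanarm computes these quantities internally.

```r
set.seed(1)      # arbitrary seed
n.draws <- 4000  # hypothetical number of posterior draws
n.obs   <- 5     # hypothetical number of observations

# Stand-in for posterior draws of the intercept on the log-odds scale
alpha <- rnorm(n.draws, mean = 1.1, sd = 0.1)

# Analogue of posterior_linpred(..., transform = TRUE):
# draws of the conditional mean, each strictly between 0 and 1
mu <- matrix(plogis(alpha), nrow = n.draws, ncol = n.obs)

# Analogue of posterior_predict(): one simulated 0/1 outcome
# per draw and observation
y.rep <- matrix(rbinom(n.draws * n.obs, size = 1, prob = mu),
                nrow = n.draws, ncol = n.obs)

range(mu)                             # stays inside (0, 1)
sort(unique(as.vector(y.rep)))        # only 0s and 1s
rbind(colMeans(mu), colMeans(y.rep))  # agree up to simulation noise
```

Averaging either matrix over the draws gives an estimate on the probability scale, so no inverse logit is needed on top of the 0/1 outcomes.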


#3

Ah, sorry, I should have looked more closely at my example. I am indeed using family = binomial. And, thank you for clarifying the difference between posterior_linpred and posterior_predict.

I didn’t realize that posterior_predict() generated predictions on the scale of the response, so I was unknowingly transforming 1s and 0s with the inverse logit.
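For anyone who runs into the same thing, here is a quick base-R check of what that accidental transformation does. The vector draws is only a hypothetical stand-in for one column of posterior_predict() output, and the seed and probability are arbitrary.

```r
set.seed(42)                                  # arbitrary seed
draws <- rbinom(4000, size = 1, prob = 0.75)  # stand-in for 0/1 predictive draws

plogis(0)  # 0.5
plogis(1)  # about 0.731

mean(draws)          # already on the probability scale
mean(plogis(draws))  # squeezed into the interval [0.5, 0.731]

# The transformed mean is a fixed linear function of the untransformed one
expected <- plogis(0) + mean(draws) * (plogis(1) - plogis(0))
all.equal(mean(plogis(draws)), expected)
```

So the inverse logit of 0/1 draws can only ever average to something between 0.5 and about 0.731, whatever the true probability is.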

Thanks!