Posterior_predict() prediction scale/transformation?

joeHoover · January 22, 2018, 11:16pm

For reasons that aren't relevant here, I am comparing probability-scale predictions on the training data from glm() and stan_glm() logistic regression models. To obtain probability-scale predictions from a stan_glm() model, I thought I needed to transform from log-odds to probability via the inverse logit.

However, it seems that no transformation is needed for probability-scale predictions, as posterior_predict() with no transform function appears to generate the correct probabilities.

I assume I’m missing something obvious, but it would be nice to know why the transformation is not necessary. Or, if that conclusion is wrong, it would be nice to know what is really going on.

Thanks!

Here’s an example of how I’m proceeding:

library(rstanarm)

dat <- data.frame(y = rbinom(500, 1, .75))
glm.pred <- predict(glm(y~1, data=dat), type='response')

stan.mod <- stan_glm(y~1, data=dat)
stan.pred.0 <- colMeans(posterior_predict(stan.mod))
stan.pred.1 <- colMeans(posterior_predict(stan.mod, fun=stats::plogis))

preds <- cbind(glm.pred, stan.pred.0, stan.pred.1)
head(preds)

bgoodri · January 22, 2018, 11:41pm

The family argument in stan_glm() defaults to gaussian, so you are fitting a linear “probability” “model”. If you then apply the standard logistic CDF to its predictions, it sort of yields something close to the right answer.
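The effect of the default is easy to see with base R's glm(), which also falls back to family = gaussian when the argument is omitted. The sketch below is only an illustration with simulated data and does not call rstanarm; the names fit_default and fit_binomial are made up for the example.

```r
# Base-R sketch: glm() also defaults to family = gaussian.
set.seed(1)
dat <- data.frame(y = rbinom(500, 1, 0.75))

fit_default  <- glm(y ~ 1, data = dat)                     # family omitted
fit_binomial <- glm(y ~ 1, data = dat, family = binomial)  # logistic regression

family(fit_default)$family   # "gaussian"
family(fit_binomial)$family  # "binomial"

# The gaussian intercept is already the sample proportion of 1s ...
coef(fit_default)
mean(dat$y)

# ... while the binomial intercept is on the log-odds scale
# and needs the inverse logit to get back to a probability.
coef(fit_binomial)
plogis(coef(fit_binomial))
```

With an intercept-only model both fits describe the same proportion, which is why the omission is easy to miss.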

What you should be doing is

stan.mod <- stan_glm(y~1, data=dat, family = binomial)
stan.mu <- posterior_linpred(stan.mod, transform = TRUE)

which generates the posterior distribution of the conditional mean in a logit model. Note that this is not the posterior predictive distribution of the (future) outcomes, which is what is generated by posterior_predict and yields a matrix of 0s and 1s that are predictions for the observable outcomes.
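To make the distinction concrete, here is a rough base-R sketch of the two kinds of draws. It does not use rstanarm: the "posterior" draws of the intercept are a hypothetical stand-in, taken from a normal approximation around the glm() estimate, and the names alpha, mu and y_rep are chosen only for this illustration.

```r
# Base-R sketch; the "posterior" draws are a normal approximation, not Stan output.
set.seed(2)
n <- 500
y <- rbinom(n, 1, 0.75)

fit <- glm(y ~ 1, family = binomial)
n_draws <- 1000

# Hypothetical stand-in for posterior draws of the intercept (log-odds scale).
alpha <- rnorm(n_draws, mean = coef(fit), sd = sqrt(vcov(fit)[1, 1]))

# Analogue of posterior_linpred(..., transform = TRUE): draws of the
# conditional mean, one row per draw and one column per observation.
mu <- matrix(plogis(alpha), nrow = n_draws, ncol = n)

# Analogue of posterior_predict(): draws of the outcome itself, 0s and 1s.
y_rep <- matrix(rbinom(n_draws * n, 1, mu), nrow = n_draws, ncol = n)

range(mu)      # strictly between 0 and 1
table(y_rep)   # only 0s and 1s

# Averaging the 0/1 draws over the rows recovers the probability
# without any further transformation.
head(cbind(mean_mu = colMeans(mu), mean_y_rep = colMeans(y_rep)))
```

The column means of the two matrices agree up to Monte Carlo noise, even though one matrix holds probabilities and the other holds simulated outcomes.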

joeHoover · January 23, 2018, 12:11am

Ah, sorry, I should have looked more closely at my example. I am indeed using family = binomial. And, thank you for clarifying the difference between posterior_linpred and posterior_predict.

I didn’t realize that posterior_predict() generated predictions on the scale of the response, so I was unknowingly transforming 1s and 0s with the inverse logit.

Thanks!
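For anyone landing here later, a small base-R sketch of what applying the inverse logit to 0/1 draws does. The simulated y_rep below is just a stand-in for one column of posterior_predict() output, not actual rstanarm output.

```r
# plogis() maps 0 and 1 to two fixed values, not to probabilities of the outcome.
plogis(c(0, 1))   # 0.5 and 1 / (1 + exp(-1)), roughly 0.731

set.seed(3)
# Stand-in for 0/1 posterior predictive draws with true probability 0.75.
y_rep <- rbinom(10000, 1, 0.75)

mean(y_rep)           # close to 0.75: already on the probability scale
mean(plogis(y_rep))   # equals 0.5 + mean(y_rep) * (plogis(1) - 0.5), biased toward 0.5
```

So the transformed average is squeezed into the interval from 0.5 to about 0.731, which can look plausible when the true probability happens to fall near that range.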
