Predicted Probabilities with Multilevel Logit Models

Dear all Stan users,

Hope things are going well with you all. I am posting to ask about post-estimation analysis of a two-level logit model fitted with stan_glmer. Below, y is a binary response variable, x1-x4 are individual-level covariates, and x5 is a level-2 predictor. In shorthand (coefficients omitted), the level-1 model is specified as
y = x0 + x1 + x2 + x3 + x4 + e, where x0 is a unit column for the intercept and e is the error term
and the level-2 models include
b0 = g00 + x5 + u00
b2 = g20 + x5 + u20
where g00 and g20 are unit columns at level-2 and u’s are level-2 error terms.
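
Spelled out with explicit coefficients on the logit scale, the shorthand above would read roughly as follows. The subscripts i (individual) and j (group), the slope symbols b1, b3, b4, g01 and g21, and the multivariate normal distribution for the u's are notation added here for illustration; they are assumptions about the intended model, chosen to match the stan_glmer formula below. On the logit scale there is no separate additive e term, since the randomness at level 1 comes from the Bernoulli outcome itself.

```latex
\begin{aligned}
\operatorname{logit}\Pr(y_{ij}=1) &= b_{0j} + b_{1}x_{1ij} + b_{2j}x_{2ij} + b_{3}x_{3ij} + b_{4}x_{4ij} \\
b_{0j} &= g_{00} + g_{01}x_{5j} + u_{0j} \\
b_{2j} &= g_{20} + g_{21}x_{5j} + u_{2j} \\
(u_{0j}, u_{2j})^{\top} &\sim \mathcal{N}(\mathbf{0}, \Sigma)
\end{aligned}
```

Substituting the level-2 equations into level 1 gives fixed effects for x1-x5 and the cross-level interaction x5:x2, plus a random intercept and a random x2 slope by group.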

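library(rstanarm)
# x5:x2 is the cross-level interaction; the x5 and x2 main effects are already listed
stanmod01 <- stan_glmer(y ~ x1 + x2 + x3 + x4 + x5 + x5:x2
                        + (1 + x2 | group),
                        data = mydata, family = binomial, seed = 123456789)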

After obtaining the results, I want the predicted probability of y = 1 given an x vector. First I create the x vector as follows,

x.vector <- data.frame(x1 = 3.5, x2 = 0, x3 = 0, x4 = 0, x5 = 30, group = 23019)

Then I use posterior_predict to compute the prediction,

predP <- posterior_predict(stanmod01, newdata = x.vector)

This runs, but predP contains ones and zeros, not predicted probabilities. So my first question is how I can get the posterior distribution of the predicted probability given an x vector. My second question is whether the ones and zeros are random draws based on those predicted probabilities. Thank you all!

Jun Xu, PhD
Department of Sociology
Ball State University

I think I’ve figured it out. It looks like I need to use the posterior_linpred and posterior_epred functions instead of the posterior_predict function. Will update you all if there are any new findings. Thank you all!
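
For anyone landing here later, below is a minimal sketch of how this might look. It assumes the fitted model stanmod01 and the data frame x.vector from the post above already exist, and a version of rstanarm recent enough to provide posterior_epred; the names p_draws, eta_draws, p_from_eta and y_rep are just illustrative. As far as I understand, posterior_epred returns draws of Pr(y = 1), posterior_linpred returns draws of the log-odds, and posterior_predict simulates 0/1 outcomes from the per-draw probabilities, which would answer the second question.

```r
library(rstanarm)

# Assumes stanmod01 and x.vector exist as defined earlier in this thread.

# Posterior draws of Pr(y = 1 | x): one row per draw, one column per row of newdata.
p_draws <- posterior_epred(stanmod01, newdata = x.vector)

# Draws of the linear predictor (log-odds); plogis() maps them to probabilities.
eta_draws  <- posterior_linpred(stanmod01, newdata = x.vector)
p_from_eta <- plogis(eta_draws)

# Summarise the posterior of the predicted probability.
mean(p_draws[, 1])
quantile(p_draws[, 1], probs = c(0.025, 0.5, 0.975))

# posterior_predict simulates binary outcomes given each draw's probability,
# so the share of ones should be close to the mean probability above
# (up to Monte Carlo error).
y_rep <- posterior_predict(stanmod01, newdata = x.vector)
mean(y_rep[, 1])
```

By default these predictions condition on the group-specific terms for the group given in newdata. Passing re.form = NA should instead give predictions based on the fixed effects only.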

