Question about multilevel logistic model (mixed intercept logistic model) in Stan code

Doria · June 7, 2021, 7:02pm

I’m trying to write Stan code for a multilevel logistic regression. The model is a mixed intercept logistic model with two predictors, where the first level is children and the second level is mothers. When I compare the summary from my own code with the one produced by stan_glmer(), the estimates of the fixed intercept do not match. First, the data I used:

library(rstanarm)
library(rstan)

data(guImmun, package = "mlmRev")
summary(guImmun)
library(dplyr)
guImmun <- guImmun %>%
  mutate(immun = ifelse(immun == "N",0,1))

Second, here is the Stan code I wrote:

data {
    int N; // number of obs 
    int M; // number of groups 
    int K; // number of predictors
    
    int y[N]; // outcome
    row_vector[K] x[N]; // predictors
    int g[N];    // map obs to groups (kids to women)
}
parameters {
    real alpha;
    real a[M]; 
    vector[K] beta;
    real<lower=0,upper=10> sigma;  
}
model {
  alpha ~ normal(0,1);
  a ~ normal(0,sigma);
  beta ~ normal(0,1);
  for(n in 1:N) {
    y[n] ~ bernoulli(inv_logit( alpha + a[g[n]] + x[n]*beta));
  }
}

Fitting the model to the data:

guI_data <- list(g=as.integer(guImmun$mom),
                y=guImmun$immun,
                x=data.frame(guImmun$kid2p, guImmun$mom25p),
                N=nrow(guImmun),
                K=2,
                M=nlevels(guImmun$mom))
ranIntFit <- stan(file = "first_model.stan", data = guI_data,
                  iter = 500, chains = 1)
summary(ranIntFit, pars = c("alpha", "beta", "a[1]", "a[2]", "a[3]", "sigma"),
                          probs = c(0.025, 0.975),
                          digits = 2)

For comparison, this is the same model fitted with the stan_glmer() function:

M1_stanglmer <- stan_glmer(immun ~ kid2p + mom25p + (1 | mom), 
                           family = binomial("logit"), 
                           data = guImmun,
                           iter = 500,
                           chains = 1,
                           seed = 349)
print(M1_stanglmer, digits = 2)

The two sets of results do not match, especially for the fixed intercept.
Could anyone help me figure out what’s wrong with my code? Thanks!
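
One thing worth checking in the data list above, offered as a guess rather than a confirmed diagnosis: in mlmRev, kid2p and mom25p are stored as N/Y factors, and as far as I know rstan converts a data frame passed in the data list with data.matrix(), which replaces factors by their integer codes. The predictors would then reach Stan as 1/2 rather than 0/1, which shifts the intercept by roughly the sum of the two slopes while leaving the slopes themselves unchanged. A minimal sketch with a toy data frame (the values are made up for illustration) shows the difference between the two encodings:

```r
# Toy stand-in for the two N/Y factor predictors (hypothetical values,
# not taken from guImmun)
toy <- data.frame(
  kid2p  = factor(c("N", "Y", "Y", "N"), levels = c("N", "Y")),
  mom25p = factor(c("Y", "Y", "N", "N"), levels = c("N", "Y"))
)

# data.matrix() replaces factors by their internal codes: N -> 1, Y -> 2
x_codes <- data.matrix(toy)

# model.matrix() builds 0/1 dummy columns; drop the intercept column
x_dummy <- model.matrix(~ kid2p + mom25p, data = toy)[, -1, drop = FALSE]

print(x_codes)
print(x_dummy)
```

If this applies here, passing x = model.matrix(~ kid2p + mom25p, data = guImmun)[, -1, drop = FALSE] in guI_data should put the intercept on the same 0/1 scale that stan_glmer() uses.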

js592 · June 7, 2021, 8:06pm

I don’t use stan_glmer() regularly. My first guess is that its default priors differ from the ones you are using. Depending on how informative your data are, that can lead to noticeably different posteriors. There is probably also a way to look at the Stan code that stan_glmer() runs and compare it to your own.

Doria · June 7, 2021, 10:12pm

Thank you very much for replying. Which priors would you recommend?

js592 · June 7, 2021, 11:12pm

There is a lot of literature on prior selection and I don’t want to embarrass myself by trying to summarize it in a comment here! I would recommend these works and associated references as an introduction to the applied modeling philosophies typically found on this forum:

http://www.stat.columbia.edu/~gelman/research/published/entropy-19-00555-v2.pdf
Towards A Principled Bayesian Workflow (sections 2 and 4)
https://arxiv.org/pdf/2011.01808.pdf (section 2)

You can also take a look at the stan_glmer() documentation to see if they provide a reference for their default priors.
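
For the rstanarm side, prior_summary() reports the priors a fitted model actually used, including any automatic rescaling. A minimal sketch, assuming the same data preparation and stan_glmer() call as in the original post:

```r
library(rstanarm)

# Same data preparation as in the original post, written in base R
data(guImmun, package = "mlmRev")
guImmun$immun <- ifelse(guImmun$immun == "N", 0, 1)

# Same call as in the original post
M1_stanglmer <- stan_glmer(immun ~ kid2p + mom25p + (1 | mom),
                           family = binomial("logit"),
                           data = guImmun,
                           iter = 500,
                           chains = 1,
                           seed = 349)

# Priors on the intercept, the coefficients and the group-level covariance
priors <- prior_summary(M1_stanglmer)
print(priors)
```

The hand-written model puts normal(0,1) on alpha and beta and, through the bounds on sigma with no sampling statement, an implicit uniform(0,10) on sigma. Any differences from what prior_summary() prints are candidates for explaining part of the mismatch.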

Doria · June 7, 2021, 11:19pm

Thank you very much for the summary! I appreciate it. I will look into them.
