Question about multilevel logistic model (mixed intercept logistic model) in Stan code

I’m trying to write stan code for multilevel logistic regression. The model that I tried is a mixed intercept logistic model with two predictors. The first level is children level and the second level is mom level. When I tried to match the summary result of the code I wrote versus the one generated by function stan_glmer() , the results of fixed intercept did not match. First, the data I used as below:


data(guImmun, package = "mlmRev")
guImmun <- guImmun %>%
  mutate(immun = ifelse(immun == "N",0,1))

Second, the stan code was written as below:

data {
    int N; // number of obs 
    int M; // number of groups 
    int K; // number of predictors
    int y[N]; // outcome
    row_vector[K] x[N]; // predictors
    int g[N];    // map obs to groups (kids to women)
parameters {
    real alpha;
    real a[M]; 
    vector[K] beta;
    real<lower=0,upper=10> sigma;  
model {
  alpha ~ normal(0,1);
  a ~ normal(0,sigma);
  beta ~ normal(0,1);
  for(n in 1:N) {
    y[n] ~ bernoulli(inv_logit( alpha + a[g[n]] + x[n]*beta));

Fitting data to the model:

guI_data <- list(g=as.integer(guImmun$mom),
                x=data.frame(guImmun$kid2p, guImmun$mom25p),
ranIntFit <- stan(file = "first_model.stan", data = guI_data,
                  iter = 500, chains = 1)
summary(ranIntFit, pars = c("alpha", "beta", "a[1]", "a[2]", "a[3]", "sigma"),
                          probs = c(0.025, 0.975),
                          digits = 2)

However, if I use stan_glmer() function, the result would be presented as follows.

M1_stanglmer <- stan_glmer(immun ~ kid2p + mom25p + (1 | mom), 
                           family = binomial("logit"), 
                           data = guImmun,
                           iter = 500,
                           chains = 1,
                           seed = 349)
print(M1_stanglmer, digits = 2)

But the results do not match, especially the result of fixed intercept.
Could anyone help me figure out what’s wrong with my code? Thanks!

I don’t use stan_glmer() regularly. My first guess is that the default priors are different than what you are using – which depending on how informative your data is may result in noticeably different posteriors. I think there is probably a way to extract the automatically generated stan code and compare it to your own.

Thank you very much for replying. What prior do you recommend to use?

There is a lot of literature on prior selection and I don’t want to embarrass myself by trying to summarize it in a comment here! I would recommend these works and associated references as an introduction to the applied modeling philosophies typically found on this forum:

You can also take a look at the stan_glmer() documentation to see if they provide a reference for their default priors.

Thank you very much for the summary! I appreciate it. I will look into them.