Hi, I noticed some clear differences between glmer and stan_glmer. The difference is in the estimated realizations of the "random effects": with glmer I get values that appear to be a weighted average of the grand mean and the group-specific mean (as suggested by Gelman and Hill, 2007: 253). With stan_glmer I get much smaller estimated realizations (by a factor of 5 to 10).
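For reference, the weighted average I have in mind is the partial-pooling estimate in Gelman and Hill (2007, p. 253), where each group intercept is pulled between the group mean and the grand mean according to the group size and the variance components:

$$
\hat{\alpha}_j \approx \frac{\dfrac{n_j}{\sigma_y^2}\,\bar{y}_j + \dfrac{1}{\sigma_\alpha^2}\,\bar{y}_{\text{all}}}{\dfrac{n_j}{\sigma_y^2} + \dfrac{1}{\sigma_\alpha^2}}
$$

Here $n_j$ is the size of group $j$, $\sigma_y^2$ the within-group variance, and $\sigma_\alpha^2$ the between-group variance. With my sample sizes (hundreds to thousands of observations per group), I would expect very little shrinkage towards the grand mean.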
Originally, I thought it might be due to the small sample size, with the prior pushing the estimates towards 0. But I have now set up a reproducible example with a large sample, and the effect remains.
I would appreciate any suggestions or comments that would help me understand why there is a difference, and how I might change the stan_glmer call to obtain more sizable estimates of the "random effect" realizations.
Below is some code that reproduces the issue (it takes about an hour to run on my small MacBook). It finishes with no warning messages, and all Rhat values are 1.0.
```r
rm(list = ls())
library(lme4)
library(blme)
library(rstanarm)
library(glmer2stan)
options(mc.cores = parallel::detectCores())

set.seed(123)  # for reproducibility

# Set up a MC analysis:
# 8000 individuals split across 25 level-2 units
n1 <- 8000
n2 <- 25
n1.in.2 <- n1 / n2                  # equal number of individuals per level-2 unit

err.ind  <- rnorm(n1)               # normally distributed error across individuals
err.cont <- rnorm(n2)               # normally distributed error across level-2 units

# equal-sized clusters!
unit <- kronecker(c(1:n2), rep(1, n1.in.2))       # level-2 unit indicator
err.cont <- kronecker(err.cont, rep(1, n1.in.2))  # level-2 error vector of length n1

# generating two categorical features (one with 4 groups and one with 15 groups)
feature1.nr <- sample(c(1:4), n1, replace = TRUE)
feature2.nr <- sample(c(1:15), n1, replace = TRUE)

# now, drawing their group-specific "effects"
feature1.vec <- rnorm(4, 0, .3)
feature2.vec <- rnorm(15, 0, .5)

# assigning each observation its feature value
feature1.x <- feature1.vec[feature1.nr]
feature2.x <- feature2.vec[feature2.nr]

y.star <- 0.3 + feature1.x + feature2.x + err.cont + err.ind
y <- rep(0, length(y.star))
y[y.star > 0] <- 1

data.test <- data.frame(y, feature1.nr, feature2.nr, unit)

# Relying on glmer to estimate the mixed model ("random effects" modeled for each feature)
m.glmer <- glmer(y ~ 1 + (1 | feature1.nr) + (1 | feature2.nr) + (1 | unit),
                 family = binomial("probit"), data = data.test)
summary(m.glmer)
ranef(m.glmer)

startL <- Sys.time()
m.stan <- stan_glmer(y ~ 1 + (1 | feature1.nr) + (1 | feature2.nr) + (1 | unit),
                     data = data.test, family = binomial("probit"),
                     chains = 4, cores = 2, iter = 2000, warmup = 800,
                     adapt_delta = 0.99)
                     # prior_covariance = decov(.1, .1, shape = 1, scale = 1))
endL <- Sys.time()
endL - startL
summary(m.stan)

# looking at the feature 1 "random effect" realizations
ranef(m.glmer)$feature1.nr
ranef(m.stan)$feature1.nr
feature1.vec
# median absolute factor of glmer vs. stan_glmer
median(abs(c(ranef(m.glmer)$feature1.nr / ranef(m.stan)$feature1.nr)$`(Intercept)`))

# looking at the feature 2 "random effect" realizations
ranef(m.glmer)$feature2.nr
ranef(m.stan)$feature2.nr
feature2.vec
# median absolute factor of glmer vs. stan_glmer
median(abs(c(ranef(m.glmer)$feature2.nr / ranef(m.stan)$feature2.nr)$`(Intercept)`))
```