Hi, I've noticed some clear differences between glmer and stan_glmer in the estimated realizations of the "random effects". With glmer I get values that look like a weighted average of the grand mean and the group-specific means (as suggested by Gelman and Hill, 2007: 253), whereas with stan_glmer the estimated realizations are much smaller (by a factor of 5 or 10).
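For reference, the partial-pooling estimate I have in mind (Gelman and Hill, 2007: 253) is roughly a precision-weighted average of the group mean and the grand mean. The toy function below is only an illustration with made-up numbers, not output from the models further down:

```
# Approximate partial-pooling estimate for one group
# (precision-weighted average, Gelman and Hill 2007: 253).
# n_j: group size; ybar_j: group mean; mu: grand mean;
# sigma_y: residual sd; sigma_alpha: sd of the group effects.
pool_est <- function(n_j, ybar_j, mu, sigma_y, sigma_alpha) {
  w_group <- n_j / sigma_y^2     # precision of the group mean
  w_grand <- 1 / sigma_alpha^2   # precision contributed by the population
  (w_group * ybar_j + w_grand * mu) / (w_group + w_grand)
}
pool_est(320, 0.9, 0.3, 1, 0.5)  # large group: stays close to the group mean 0.9
pool_est(320, 0.9, 0.3, 1, 0.05) # tight sigma_alpha: pulled toward the grand mean 0.3
```

A very small sigma_alpha (whether estimated or implied by a tight prior) would mechanically produce the kind of shrinkage I am seeing.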

Originally, I thought it might be because of the small sample size, with the prior pushing the estimates towards 0. But I have now set up a reproducible example with a large sample, and the effect remains.

I would appreciate any suggestions or comments that would help me understand why there is a difference, and potentially how to change the stan_glmer call to obtain more sizable estimates of the "random effects" realizations.

Below is some code that reproduces the issue (it takes about an hour to run on my small MacBook). There are no warning messages at the end, and all Rhat values are 1.0.

Best,

Lucas

```
rm(list=ls())
library(lme4)
library(rstanarm)
options(mc.cores = parallel::detectCores())
set.seed(1234) # arbitrary seed so the simulated data are reproducible
# Set up a MC analysis
# 8000 individuals split across 25 level-2 units
n1 <- 8000
n2 <- 25
n1.in.2 <- n1/n2 # Equal number of individuals in level 2 units
err.ind <- rnorm(n1) # normally distributed error across individuals
err.cont <- rnorm(n2) # normally distributed error across level 2 units
# equal-sized clusters!
unit <- kronecker(c(1:n2),rep(1,n1.in.2)) # level 2 unit indicator
err.cont <- kronecker(err.cont,rep(1,n1.in.2)) # creating level 2 error vector with length n1
# generating two categorical features (one with 4 groups and one with 15 groups)
feature1.nr <- sample(c(1:4),n1,replace = TRUE)
feature2.nr <- sample(c(1:15),n1,replace = TRUE)
# Now, drawing their group-specific "effects"
feature1.vec <- rnorm(4,0,.3)
feature2.vec <- rnorm(15,0,.5)
# Assigning each observation its feature value
feature1.x <- feature1.vec[feature1.nr]
feature2.x <- feature2.vec[feature2.nr]
y.star <- 0.3 + feature1.x + feature2.x + err.cont + err.ind
y <- as.numeric(y.star > 0) # binary outcome from the latent probit scale
data.test <- data.frame(y,feature1.nr,feature2.nr,unit)
# Relying on glmer to estimate mixed model ("random effects" modeled for each feature)
m.glmer <- glmer(y ~ 1 + (1|feature1.nr) + (1|feature2.nr) + (1|unit), family=binomial("probit"), data=data.test)
summary(m.glmer)
ranef(m.glmer)
startL <- Sys.time()
m.stan <- stan_glmer(y ~ 1 + (1|feature1.nr) + (1|feature2.nr) + (1|unit),
                     data = data.test,
                     family = binomial("probit"),
                     chains = 4,
                     cores = 2,
                     iter = 2000,
                     warmup = 800,
                     adapt_delta = 0.99)
# also tried adding: prior_covariance = decov(.1, .1, shape = 1, scale = 1)
endL <- Sys.time()
endL-startL
summary(m.stan)
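# Suggestion (not run in my hour-long timing above): inspect the priors
# rstanarm actually used via prior_summary(), which is part of rstanarm,
# to check whether the default covariance prior could explain the shrinkage.
prior_summary(m.stan)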
# looking at feature 1 "random effect" realizations
ranef(m.glmer)$feature1.nr
ranef(m.stan)$feature1.nr
feature1.vec
# median absolute factor of glmer vs stan_glmer
median(abs(ranef(m.glmer)$feature1.nr[, 1] / ranef(m.stan)$feature1.nr[, 1]))
# looking at feature 2 "random effect" realizations
ranef(m.glmer)$feature2.nr
ranef(m.stan)$feature2.nr
feature2.vec
# median absolute factor of glmer vs stan_glmer
median(abs(ranef(m.glmer)$feature2.nr[, 1] / ranef(m.stan)$feature2.nr[, 1]))
```