Hello,
I have some questions about correlations between random effects and how they are estimated in brms.
I have observed, across several different datasets, that the correlations between random slopes in brms tend to be less extreme than those in models fit with lme4. But, for other parameters, I generally find that results across brms and lme4 (using 1.1-15) are largely much the same. The exception is for correlations of random slopes. In addition, the credibility intervals around these slopes tend to be very large.
I am wondering where this difference is coming from and why it is so pronounced.
Here is an example:
Dataset:
library(data.table)
d <- as.data.frame(
fread("curl http://www.intensivelongitudinal.com/ch5/ch5R.zip | tar -xf- --to-stdout *process.csv")
)
lmer Model:
library(lme4)
fit_lmer <- lmer(intimacy ~ time7c + confcw + (time7c + confcw | id), data = d)
summary(fit_lmer)
which returns:
Linear mixed model fit by REML ['lmerMod']
Formula: intimacy ~ time7c + confcw + (time7c + confcw | id)
Data: d
REML criterion at convergence: 7830
Scaled residuals:
Min 1Q Median 3Q Max
-3.00141 -0.67871 -0.01158 0.66980 2.86987
Random effects:
Groups Name Variance Std.Dev. Corr
id (Intercept) 0.884973 0.94073
time7c 0.004685 0.06845 0.20
confcw 2.883196 1.69800 0.35 0.99
Residual 3.584380 1.89325
Number of obs: 1848, groups: id, 66
Fixed effects:
Estimate Std. Error t value
(Intercept) 4.82068 0.12389 38.91
time7c -0.02793 0.03952 -0.71
confcw -1.42701 0.24920 -5.73
The correlation of the time7c random effect and the confcw random effect is .99.
Now, here is the same model run using brms:
library(brms)
fit_brm <- brm(intimacy ~ time7c + confcw + (time7c + confcw | id), data = d)
summary(fit_brm)
which returns:
Family: gaussian
Links: mu = identity; sigma = identity
Formula: intimacy ~ time7c + confcw + (time7c + confcw | id)
Data: d (Number of observations: 1848)
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup samples = 4000
Group-Level Effects:
~id (Number of levels: 66)
Estimate Est.Error l-95% CI u-95% CI
sd(Intercept) 0.97 0.10 0.79 1.17
sd(time7c) 0.07 0.05 0.00 0.17
sd(confcw) 1.74 0.21 1.35 2.20
cor(Intercept,time7c) 0.08 0.44 -0.83 0.86
cor(Intercept,confcw) 0.32 0.14 0.03 0.57
cor(time7c,confcw) 0.34 0.44 -0.70 0.94
Eff.Sample Rhat
sd(Intercept) 1127 1.00
sd(time7c) 754 1.01
sd(confcw) 1520 1.00
cor(Intercept,time7c) 4187 1.00
cor(Intercept,confcw) 2044 1.00
cor(time7c,confcw) 88 1.06
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept 4.83 0.13 4.57 5.08 838 1.00
time7c -0.03 0.04 -0.10 0.05 5973 1.00
confcw -1.42 0.25 -1.91 -0.92 2299 1.00
Family Specific Parameters:
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sigma 1.90 0.03 1.83 1.96 6146 1.00
Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample
is a crude measure of effective sample size, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
As these results show, the estimated correlation of the time7c
and confcw
random slopes is .99 in the lmer model, but only .34 in the brms model. I have obtained similar results even for models in which estimates for all other parameters are essentially the same for both packages.
I am trying to understand (a) why these estimates are so different in size, and (b) why the brms credibility interval for this parameter estimate is so wide (ranging from -.70, .94), which I have also noticed in analyses involving other datasets.
Thank you.
- Operating System: Mac OS X 10.11.6
- brms Version: brms_2.8.0
- lmer Version: lme4_1.1-15