Correlations among random slopes in brms


I have some questions about correlations between random effects and how they are estimated in brms.

I have observed, across several different datasets, that the correlations between random slopes estimated by brms tend to be less extreme than those from models fit with lme4 (1.1-15). For all other parameters, the two packages generally give very similar results; the exception is the correlations between random slopes. In addition, the credibility intervals around these correlations tend to be very wide.

I am wondering where this difference is coming from and why it is so pronounced.

Here is an example:


library(data.table)  # for fread()
library(lme4)
library(brms)

d <-
  fread("curl | tar -xf- --to-stdout *process.csv")

lmer Model:

fit_lmer <- lmer(intimacy ~ time7c + confcw + (time7c + confcw | id), data = d)

which returns:

Linear mixed model fit by REML ['lmerMod']
Formula: intimacy ~ time7c + confcw + (time7c + confcw | id)
   Data: d

REML criterion at convergence: 7830

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-3.00141 -0.67871 -0.01158  0.66980  2.86987 

Random effects:
 Groups   Name        Variance Std.Dev. Corr     
 id       (Intercept) 0.884973 0.94073           
          time7c      0.004685 0.06845  0.20     
          confcw      2.883196 1.69800  0.35 0.99
 Residual             3.584380 1.89325           
Number of obs: 1848, groups:  id, 66

Fixed effects:
            Estimate Std. Error t value
(Intercept)  4.82068    0.12389   38.91
time7c      -0.02793    0.03952   -0.71
confcw      -1.42701    0.24920   -5.73

The correlation of the time7c random effect and the confcw random effect is .99.

Now, here is the same model run using brms:

fit_brm <- brm(intimacy ~ time7c + confcw + (time7c + confcw | id), data = d)

which returns:

Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: intimacy ~ time7c + confcw + (time7c + confcw | id) 
   Data: d (Number of observations: 1848) 
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 4000

Group-Level Effects: 
~id (Number of levels: 66) 
                      Estimate Est.Error l-95% CI u-95% CI
sd(Intercept)             0.97      0.10     0.79     1.17
sd(time7c)                0.07      0.05     0.00     0.17
sd(confcw)                1.74      0.21     1.35     2.20
cor(Intercept,time7c)     0.08      0.44    -0.83     0.86
cor(Intercept,confcw)     0.32      0.14     0.03     0.57
cor(time7c,confcw)        0.34      0.44    -0.70     0.94
                      Eff.Sample Rhat
sd(Intercept)               1127 1.00
sd(time7c)                   754 1.01
sd(confcw)                  1520 1.00
cor(Intercept,time7c)       4187 1.00
cor(Intercept,confcw)       2044 1.00
cor(time7c,confcw)            88 1.06

Population-Level Effects: 
          Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept     4.83      0.13     4.57     5.08        838 1.00
time7c       -0.03      0.04    -0.10     0.05       5973 1.00
confcw       -1.42      0.25    -1.91    -0.92       2299 1.00

Family Specific Parameters: 
      Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sigma     1.90      0.03     1.83     1.96       6146 1.00

Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
is a crude measure of effective sample size, and Rhat is the potential 
scale reduction factor on split chains (at convergence, Rhat = 1).

As these results show, the estimated correlation of the time7c and confcw random slopes is .99 in the lmer model, but only .34 in the brms model. I have obtained similar results even for models in which estimates for all other parameters are essentially the same for both packages.

I am trying to understand (a) why these estimates are so different in size, and (b) why the brms credibility interval for this parameter is so wide (ranging from -.70 to .94), which I have also noticed in analyses of other datasets.

Thank you.

  • Operating System: Mac OS X 10.11.6
  • brms Version: brms_2.8.0
  • lmer Version: lme4_1.1-15

Someone else might correct me, but I’ll give this a shot.

Given that you have only 66 groups, the prior distribution in your Bayesian model likely makes it skeptical of large correlations, shrinking them towards zero. With a larger level-2 sample, or an even weaker prior, the two models might yield more similar results.
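If you wanted to test this, brms lets you set the LKJ prior on the correlation matrix explicitly. This is just a sketch using the model formula from above; `eta > 1` shrinks correlations towards zero, while `eta < 1` actually favors extreme correlations:

```r
library(brms)

# LKJ(eta): eta = 1 is uniform over correlation matrices (the default),
# eta > 1 concentrates mass near zero, eta < 1 near +/-1.
fit_brm_lkj <- brm(
  intimacy ~ time7c + confcw + (time7c + confcw | id),
  data  = d,
  prior = set_prior("lkj(0.5)", class = "cor")
)
```

If the correlation estimate moves noticeably under different `eta` values, that would confirm the prior (rather than the likelihood) is driving the difference.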

In this particular example, I think the problem is that the mode of the distribution is at the boundary of the parameter space, and as a result lme4 returns this unreasonably high correlation. brms estimates the full posterior distribution and reports the posterior mean as the point estimate, which is usually a much more reasonable summary.
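You can see this for yourself by looking at the posterior draws of that correlation. A minimal sketch, assuming the `fit_brm` object from above (brms names the parameter `cor_<group>__<term1>__<term2>`):

```r
library(brms)

# Extract posterior draws of the time7c/confcw random-slope correlation
draws <- posterior_samples(fit_brm, pars = "cor_id__time7c__confcw")

# A histogram piled up near +1 but smeared across the whole range
# explains both the wide interval and why the mean sits near .34
hist(draws[, 1], breaks = 50, xlim = c(-1, 1),
     main = "Posterior of cor(time7c, confcw)")
```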


@paul.buerkner is correct: when I run the example code, lmer (lme4_1.1-21) actually warns me that “boundary (singular) fit: see ?isSingular”.
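If it helps, you can also check for this explicitly on the fitted model (`isSingular()` was only added in lme4 1.1-19, which is why the older version above stays silent):

```r
library(lme4)

# TRUE if the random-effects covariance estimate lies on the boundary
# of the parameter space (e.g. a variance of 0 or a correlation of
# +/-1), within the given tolerance
isSingular(fit_lmer, tol = 1e-4)
```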

(@jackbailey the default prior is lkj(1), so it has little impact on the posterior.)


Thanks so much, @jackbailey, @paul.buerkner, and @matti. Really appreciate the quick responses.

This is something I have observed in several models across different datasets, including some with larger sample sizes and less extreme correlation estimates from lmer. Does it seem plausible that the issue here (i.e., the mode of the distribution being at the boundary) is also at play in those other cases? Or are there other reasons why there might be such a large discrepancy in these values (especially when estimates of all other parameters are so similar)?

Thanks again.

Thanks for this, that’s good to know. I had downgraded to lme4_1.1-15 after encountering some convergence warnings in 1.1-20, which I learned were false alarms caused by a change in the default optimizer in later versions of the package.