Some dummy data to try things out:
set.seed(42)
d = data.frame(a = NA,
b = NA,
c = NA,
g = rep(letters,
times = sample(10:30,
26,
TRUE)))
for (i in letters) {
d$a[d$g == i] = rnorm(1)
}
d$b = rnorm(nrow(d), 5 + 3*d$a, 2)
d$c = rnorm(nrow(d), 5 - 3*d$a, 2)
plot(d$b ~ d$a)
plot(d$c ~ d$a)
plot(d$c ~ d$b)
The first two models estimate the relationship of b or c with a, partially pooling deviations of groups from the expected b or c based on their a.
library(brms)
m.b_a = brm(data = d,
formula = bf(b ~ a + (1|g)),
family = "gaussian",
backend = "cmdstanr",
chains = 4,
iter = 4000,
cores = 4)
summary(m.b_a)
m.c_a = brm(data = d,
formula = bf(c ~ a + (1|g)),
family = "gaussian",
backend = "cmdstanr",
chains = 4,
iter = 4000,
cores = 4)
summary(m.c_a)
The third model estimates average b and c across groups. The correlated random intercepts capture the across groups correlation which is indeed ~ -1. The residual correlation is correlation of b and c within groups; they are unrelated.
m.bc = brm(data = d,
formula = bf(c ~ 1 + (1|q|g))+
bf(b ~ 1 + (1|q|g))+
set_rescor(TRUE),
family = "gaussian",
backend = "cmdstanr",
chains = 4,
iter = 4000,
cores = 4)
summary(m.bc)
Multilevel Hyperparameters:
~g (Number of levels: 26)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(c_Intercept) 3.38 0.47 2.60 4.40 1.00 1518 2802
sd(b_Intercept) 3.15 0.43 2.43 4.14 1.00 1529 2813
cor(c_Intercept,b_Intercept) -0.99 0.01 -1.00 -0.96 1.00 2948 4005
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
c_Intercept 5.03 0.64 3.78 6.32 1.00 630 1312
b_Intercept 4.78 0.60 3.58 5.94 1.00 649 1449
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma_c 2.11 0.07 1.98 2.25 1.00 11661 6128
sigma_b 2.10 0.07 1.97 2.24 1.00 12944 5789
Residual Correlations:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
rescor(c,b) 0.01 0.05 -0.08 0.11 1.00 11481 6098
The last model estimates the relationship of c with b across groups. So the -1 estimate has now moved from the multilevel part of the model to the coefficient for b on c. The residual within group correlation is now positive (which I do not understand).
m.c_b = brm(data = d,
formula = bf(c ~ 1 + b + (1|q|g))+
bf(b ~ 1 + (1|q|g))+
set_rescor(TRUE),
family = "gaussian",
backend = "cmdstanr",
chains = 4,
iter = 4000,
control = list(adapt_delta = 0.95),
cores = 4)
summary(m.c_b)
Multilevel Hyperparameters:
~g (Number of levels: 26)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(c_Intercept) 0.63 0.80 0.02 3.02 1.00 858 721
sd(b_Intercept) 3.28 0.49 2.47 4.40 1.00 3783 5005
cor(c_Intercept,b_Intercept) -0.07 0.70 -1.00 0.98 1.02 162 969
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
c_Intercept 9.81 1.43 5.84 11.75 1.01 213 593
b_Intercept 4.78 0.65 3.51 6.03 1.00 3933 4660
c_b -1.00 0.30 -1.40 -0.16 1.01 203 598
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma_c 3.04 0.34 2.21 3.67 1.01 245 739
sigma_b 2.10 0.07 1.96 2.24 1.00 14054 6047
Residual Correlations:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
rescor(c,b) 0.68 0.18 0.15 0.82 1.01 243 567
But overall, I think this model does what I originally thought that bf(b ~ c + (1|group))
would do.
Can I have a “yay” or “nay you fool, that’s not how any of this works”?