Well, this has remained a thorn in my side. I’ve begun some simulations to try to evaluate performance. The code for the simulation is below. It is not the most elegant. In short, I created a synthetic population of 10,000 people with a random intercept and random slope. Then drew random samples of 100 people from this and fit models, and repeated this 2,000 times. The synthetic population was drawn with an average slope of 2.
Using default priors in brms
we would expect the 95% credible interval to include 2 about 95% of the time.
I use brms
population table for the brms
estimates and intervals and then the average marginal effect, calculated using fitted()
. The point estimates are largely comparable with minimal average bias, average absolute differences. The population table from brms
intervals included the value “2” 97.5% of the time whereas the averages from fitted()
did 96.4% of the time, and expectedly the interval width was narrower (1.84) for the marginal estimates from fitted()
compared to brms
(1.98).
> test[, Correct := LL <= 2 & UL >= 2]
> test[, mean(Est - 2), by = Type] ## average bias
Type V1
1: brms 0.02351602
2: marginal 0.02178421
> test[, mean(abs(Est - 2)), by = Type] ## average absolute differene
Type V1
1: brms 0.3449325
2: marginal 0.3453282
> test[, mean(Correct), by = Type] ## percent of 95% CI intervals including pop value
Type V1
1: brms 0.9750
2: marginal 0.9635
> test[, mean(UL - LL), by = Type] ## mean CI width
Type V1
1: brms 1.983567
2: marginal 1.835438
I remain confused about why the CI width is narrower when using fitted()
, but at least under this one very restrictive conditions, the approach seems to be performing adequately.
Because I only created a synthetic population with 10,000 people, the “true” value in that synthetic population was actually 2.018554, not 2. Using the “population” value from the synthetic population yielded largely similar results.
> test[, Correct := LL <= 2.018554 & UL >= 2.018554]
> test[, mean(Est - 2.018554), by = Type]
Type V1
1: brms 0.004962019
2: marginal 0.003230207
> test[, mean(abs(Est - 2.018554)), by = Type]
Type V1
1: brms 0.3442757
2: marginal 0.3447688
> test[, mean(Correct), by = Type]
Type V1
1: brms 0.9760
2: marginal 0.9635
> test[, mean(UL - LL), by = Type]
Type V1
1: brms 1.983567
2: marginal 1.835438
Here is the code for the simulation. I aim to run some more conditions and will include some of the findings in the brmsmargins
package vignettes / README.
library(distr6)
library(brms)
library(brmsmargins)
library(data.table)
## library(lme4)
library(future)
library(doFuture)
library(foreach)
library(doRNG)
n.b0 <- distr6::Normal$new(mean = 50, sd = 10)
n.b1 <- distr6::Normal$new(mean = 2, sd = 3)
set.seed(1234)
x.b0 <- n.b0$rand(10000)
x.b1 <- n.b1$rand(10000)
mu.0 <- x.b0 + x.b1 * 0
mu.1 <- x.b0 + x.b1 * 1
n <- distr6::Normal$new(mean = 0, sd = 5)
set.seed(1234)
y.0a <- n$rand(10000) + mu.0
y.0b <- n$rand(10000) + mu.0
y.1a <- n$rand(10000) + mu.1
y.1b <- n$rand(10000) + mu.1
d <- data.table(
ID = rep(1:10000, times = 4),
y = c(y.0a, y.0b, y.1a, y.1b),
x = rep(c(0, 0, 1, 1), each = 10000))
d[, .(Diff = mean(y[x == 1]) - mean(y[x == 0])), by = ID][, .(MDiff = mean(Diff))]
## m0 <- lme4::lmer(y ~ x + (1 + x | ID), data = d)
## summary(m0)
set.seed(1234)
IDs <- sample(1:10000, size = 100, replace = FALSE)
m1 <- brm(y ~ x + (1 + x | ID),
data = d[ID %in% IDs], chains = 2, cores = 2,
iter = 4000, refresh = 0,
backend = "cmdstanr")
summary(m1)
ame1 <- brmsmargins(m1,
add = data.frame(x = c(0, 1)),
CI = .95, CIType = "ETI",
contrasts = matrix(c(-1, 1)),
subset = "!duplicated(ID)")
ame1$ContrastSummary
registerDoFuture()
plan(multisession, workers = 20)
set.seed(12345)
res <- foreach(i = 1:2000, .combine = c) %dorng% {
IDs <- sample(1:10000, size = 100, replace = FALSE)
mfit <- update(m1, newdata = d[ID %in% IDs], refresh = 0, silent = 2)
ame <- brmsmargins(mfit,
add = data.frame(x = c(0, 1)),
CI = .95, CIType = "ETI",
contrasts = matrix(c(-1, 1)),
subset = "!duplicated(ID)")
fe.use <- as.data.table(as.list(fixef(mfit)[2, c(1, 3, 4)]))
ame.use <- as.data.table(ame$ContrastSummary[, c("Mdn", "LL", "UL")])
colnames(fe.use) <- colnames(ame.use) <- c("Est", "LL", "UL")
list(
rbind(
cbind(Type = "brms", run = i, fe.use),
cbind(Type = "marginal", run = i, ame.use)))
}
test <- do.call(rbind, res)