The results of VarCorr
should AFAIK be the fitted covariance/correlation matrix for group-level effects. In a model where those effects are the only term, one would expect to get exactly the same covariance/correlation structure by computing the linear predictor for a large number of datapoints for each sample and computing the empirical correlation in each sample (i.e. the predictors should be exactly draws from the relevant multivariate normal, so taking many such draws should recover the correlation covariance). This however gives different numbers in practice.
So maybe I am missing something? Or do I have a bug?
Code to reproduce this:
library(brms)
library(tidyverse)
options(mc.cores = parallel::detectCores(), brms.backend = "cmdstanr")
set.seed(58232134)
N_ids <- 20
N_groups <- 3
test_data <- crossing(data.frame(id = paste0("id",1:N_ids), idmod = rnorm(N_ids)),
data.frame(group = paste0("g", 1:N_groups), groupmod = rnorm(N_groups)),
data.frame(rep = c(1:3))) %>%
mutate(response = rnorm(n(), groupmod + idmod))
fit <- brm(response ~ 0 + (0 + group | id), data = test_data)
VarCorr(fit)$id$cor[,"Estimate", ]
Gives:
groupg1 groupg2 groupg3
groupg1 1.0000000 0.3642701 0.9348216
groupg2 0.3642701 1.0000000 0.4477652
groupg3 0.9348216 0.4477652 1.0000000
Now, computing via the linear predictor:
all_groups <- unique(test_data$group)
n_reps <- 10000
prediction_data <- crossing(id = paste0("new_id", 1:n_reps), group = all_groups)
linpred <- posterior_linpred(fit, newdata = prediction_data, allow_new_levels = TRUE)
n_samples <- dim(linpred)[1]
per_sample_cor <- array(NA_real_, c(n_samples, length(all_groups), length(all_groups) ))
for(s in 1:n_samples) {
predictors_wide <- array(NA_real_, c(n_reps, length(all_groups)))
for(g in 1:length(all_groups)) {
predictors_wide[ , g] <- linpred[s, prediction_data$group == all_groups[g]]
}
per_sample_cor[s, , ] <- cor(predictors_wide)
}
apply(per_sample_cor, MARGIN = c(2,3), FUN = mean)
which gives:
[,1] [,2] [,3]
[1,] 1.0000000 0.4954658 0.8642596
[2,] 0.4954658 1.0000000 0.6006337
[3,] 0.8642596 0.6006337 1.0000000
Any ideas, what is causing the difference?