I have a model that attempts to estimate the residual correlation between two traits using the brms multivariate syntax, where both traits are measured with some error. Testing with some simulated data, the model without measurement errors (or with error specified for one of the traits) samples efficiently, but fails to identify the true correlation (as expected). However, the model with measurement errors struggles to explore the target distribution for the residual correlation (as evidenced by low E-BFMI, low ESS and slow sampling), but gets much closer to the true correlation.
# simulate some instructive fake data with a true high correlation
sigma_true <- matrix(c(1,0.8,0.8,1),2,2)
n <- 500
# noise greater than signal: independent errors between the two traits
measurement_error <-
tibble(a_me = runif(n,0.5,3),
b_me = runif(n,0.5,3))
# true unknown effects
z<-rmvnorm(n,c(0,0),sigma_true)
# estimates are true effects plus measurement errors
y<-z
y[,1]<-y[,1]+rnorm(n,0,sqrt(measurement_error$a_me))
y[,2]<-y[,2]+rnorm(n,0,sqrt(measurement_error$b_me))
data <-
as_tibble(y) %>%
rename(a_obs = V1, b_obs = V2) %>%
bind_cols(as_tibble(z) %>%
rename(a_true = V1, b_true = V2)) %>%
bind_cols(measurement_error)
# the model
bf_a <- bf(a_obs | mi(a_me) ~ 1)
bf_b <- bf(b_obs | mi(b_me) ~ 1)
fit <- brm(bf_a + bf_b + set_rescor(TRUE),
prior = c(prior(normal(0, 0.25), class = Intercept, resp = aobs),
prior(normal(0, 0.25), class = Intercept, resp = bobs),
prior(exponential(2), class = sigma, resp = aobs),
prior(exponential(2), class = sigma, resp = bobs),
prior(lkj(3), class = rescor)),
iter = 6000, warmup = 2000,
control = list(adapt_delta = 0.9, max_treedepth = 15),
data = data, chains = 4, cores = 4, seed = 1)
Is my approach for estimating the residual correlation reasonable? What is it about having error for both outcomes that leads to poor sampling? Any thoughts would be great! Happy to move from brms to stan if neccessary.
- Operating System: Windows 11
- brms Version: 2.20.4