Hi everyone,
I’m having trouble figuring out how to “properly” set an informative prior for a regular bivariate correlation model. I have a proposed solution, but I’m not sure whether it makes sense or not, and would greatly appreciate any input.
To get going, let’s simulate 50 data points with a known correlation of 0.2:
```r
library(MASS)
library(tidyverse)
library(brms)

set.seed(2021)

# empirical = TRUE makes the sample moments exact:
# means 0, SDs 1, and cor(x, y) = 0.2
data <- as.data.frame(mvrnorm(50, mu = c(0, 0),
                              Sigma = matrix(c(1, 0.2, 0.2, 1), ncol = 2),
                              empirical = TRUE)) %>%
  mutate(x = V1, y = V2) %>%
  dplyr::select(x, y)

cor(data$x, data$y)
```
Ordinarily, I would approach this using `mvbind()` and an `lkj` prior:
```r
m0 <- brm(data = data,
          family = gaussian,
          bf(mvbind(x, y) ~ 0) + set_rescor(TRUE),
          prior = c(prior(normal(0, 1), class = sigma, resp = x),
                    prior(normal(0, 1), class = sigma, resp = y),
                    prior(lkj(1), class = rescor)),
          sample_prior = "only",
          control = list(adapt_delta = 0.9),
          iter = 4000, warmup = 500, chains = 4, cores = 4, seed = 2021)
```
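Since `sample_prior = "only"` is set, the draws from `m0` are draws from the prior itself, so the implied prior on the residual correlation can be eyeballed directly (a quick check of my own; `rescor__x__y` is brms's default name for that parameter):

```r
# With sample_prior = "only", these are prior draws of the residual correlation
draws_m0 <- as_draws_df(m0)
hist(draws_m0$rescor__x__y, breaks = 50,
     main = "lkj(1) prior on the residual correlation", xlab = "r")
```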
But since the data are standardized, using a linear regression model with a suppressed intercept should yield an equivalent estimate of the correlation, right? And if so, a `uniform(-1, 1)` prior on the slope should be equivalent, or at least very similar, to the `mvbind` + `lkj(1)` approach, like this:
```r
m1 <- brm(data = data,
          family = gaussian,
          y ~ 0 + x,
          prior = c(prior(uniform(-1, 1), class = b, lb = -1, ub = 1),
                    prior(normal(0, 1), class = sigma)),
          sample_prior = "only",
          control = list(adapt_delta = 0.9),
          iter = 4000, warmup = 500, chains = 4, cores = 4, seed = 2021)
```
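As a quick non-Bayesian sanity check of that equivalence (my addition, not part of the models above): with exactly standardized data, the no-intercept OLS slope reproduces the sample correlation.

```r
# With mean-zero, unit-variance data, the no-intercept slope
# sum(x * y) / sum(x^2) equals the sample correlation
coef(lm(y ~ 0 + x, data = data))  # 0.2
cor(data$x, data$y)               # 0.2
```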
In practice, however, I usually use the `lkj(2)` prior to rule out extreme values:
```r
m2 <- brm(data = data,
          family = gaussian,
          bf(mvbind(x, y) ~ 0) + set_rescor(TRUE),
          prior = c(prior(normal(0, 1), class = sigma, resp = x),
                    prior(normal(0, 1), class = sigma, resp = y),
                    prior(lkj(2), class = rescor)),
          sample_prior = "only",
          control = list(adapt_delta = 0.9),
          iter = 4000, warmup = 500, chains = 4, cores = 4, seed = 2021)
```
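For context: in the bivariate case the LKJ prior has a known marginal form, with the single correlation r satisfying (r + 1)/2 ~ Beta(eta, eta), i.e. a density proportional to (1 - r^2)^(eta - 1). A minimal simulation sketch of what `lkj(2)` implies for r:

```r
# Marginal prior on r implied by lkj(eta) for a 2x2 correlation matrix:
# (r + 1)/2 ~ Beta(eta, eta); eta = 2 gives density proportional to 1 - r^2
r_lkj2 <- 2 * rbeta(1e5, 2, 2) - 1
hist(r_lkj2, breaks = 50, main = "Implied lkj(2) prior on r", xlab = "r")
```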
Here the problems start, since I cannot use an `lkj(2)` prior in the linear regression approach. The closest would be something like a `Beta`, but that's bounded between 0 and 1. I experimented with different truncated normals, such as a `normal(0, 0.5)`:
```r
m3 <- brm(data = data,
          family = gaussian,
          y ~ 0 + x,
          prior = c(prior(normal(0, 0.5), class = b, lb = -1, ub = 1),
                    prior(normal(0, 1), class = sigma)),
          sample_prior = "only",
          control = list(adapt_delta = 0.9),
          iter = 4000, warmup = 500, chains = 4, cores = 4, seed = 2021)
```
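To compare this truncated normal against the `lkj(2)` marginal outside of brms, both can be simulated directly (a rough sketch; I'm assuming the truncnorm package here, which is not used elsewhere in this post):

```r
library(truncnorm)  # for rtruncnorm(); not loaded above

r_tnorm <- rtruncnorm(1e5, a = -1, b = 1, mean = 0, sd = 0.5)
r_lkj2  <- 2 * rbeta(1e5, 2, 2) - 1

# Compare central mass and tails of the two priors
quantile(r_tnorm, c(0.025, 0.5, 0.975))
quantile(r_lkj2,  c(0.025, 0.5, 0.975))
```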
It's fairly close, but there are slight differences in where most of the probability mass lands. Finally, I extended this to a form of informative prior, using a `normal(0.2, 0.5)`:
```r
m4 <- brm(data = data,
          family = gaussian,
          y ~ 0 + x,
          prior = c(prior(normal(0.2, 0.5), class = b, lb = -1, ub = 1),
                    prior(normal(0, 1), class = sigma)),
          sample_prior = "only",
          control = list(adapt_delta = 0.9),
          iter = 4000, warmup = 500, chains = 4, cores = 4, seed = 2021)
```
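The plots below compare the draws from each model, which can be extracted along these lines (a sketch; `b_x` and `rescor__x__y` are brms's default parameter names):

```r
# Collect draws of the correlation / slope from each prior-only model
draws <- tibble(
  m0 = as_draws_df(m0)$rescor__x__y,  # lkj(1)
  m1 = as_draws_df(m1)$b_x,           # uniform(-1, 1)
  m2 = as_draws_df(m2)$rescor__x__y,  # lkj(2)
  m3 = as_draws_df(m3)$b_x,           # truncated normal(0, 0.5)
  m4 = as_draws_df(m4)$b_x            # truncated normal(0.2, 0.5)
)

draws %>%
  pivot_longer(everything(), names_to = "model", values_to = "r") %>%
  ggplot(aes(x = r, fill = model)) +
  geom_density(alpha = 0.4) +
  xlim(-1, 1)
```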
Here's how the prior distributions look:

[prior density plots]

And here's how the posterior distributions look:

[posterior density plots]
Overall, very similar (at least for this simulated, well-behaved data set).
My questions boil down to:
- Am I correct in assuming equivalence between the "regular" `mvbind(x, y)` approach and the linear regression `y ~ 0 + x` approach, as long as the data are standardized?
- If the above is correct, is the use of truncated normals (as a substitute for the `lkj` prior and as a way of injecting prior knowledge) reasonable?
- Are there alternative approaches that might be better suited?
- Operating System: Debian 11
- brms Version: 2.16.1