I want to estimate the difference in association between a single predictor and two different outcomes with the same form. As an example take this dataset. One numeric predictor and two binary no/yes outcomes.
exDF.RData (9.6 KB)
Here is the model I propose running, a multivariate bernoulli. Apologies all non R, non tidyverse users
library(brms)
library(posterior)
library(tidyverse)
# load data
load(file = "exDF.RData") # object name exDF
# create multivariate formula, regressing the two outcomes on the single predictor
bform_mvr <- bf(mvbind(binary1, binary2) ~ pred)
# run model in brms
mvrMod <- brm(formula = bform_mvr,
family = bernoulli,
data = exDF,
seed = 1234,
refresh = 0,
chains = 4,
warmup = 1e3,
iter = 3e3)
# extract parameter estimates
draws <- merge_chains(as_draws_df(mvrMod))
# inspect chains
head(draws)
# A draws_df: 6 iterations, 1 chains, and 8 variables
# b_binary1_Intercept b_binary2_Intercept b_binary1_pred b_binary2_pred Intercept_binary1 Intercept_binary2 lprior lp__
# 1 0.45 0.54 -0.19 -0.27 -0.35 -0.59 -3.9 -1865
# 2 0.45 0.54 -0.19 -0.27 -0.35 -0.59 -3.9 -1865
# 3 0.49 0.88 -0.20 -0.34 -0.37 -0.56 -3.9 -1866
# 4 0.93 0.71 -0.29 -0.32 -0.27 -0.62 -3.9 -1866
# 5 0.96 0.51 -0.31 -0.25 -0.33 -0.54 -3.9 -1865
# 6 0.53 0.53 -0.22 -0.29 -0.40 -0.67 -3.9 -1864
So we have an intercept and a b estimate for the predictor for each of the two outcomes, binary1 and binary2. Now what I want to do is estimate whether the effect of pred differs across the two outcomes. Now technically I can do this as I have some markov chains to play with: At each draw from the posterior I could subtract the binary2 b coefficient from the binary1 b coefficient, yielding a difference, then calculate the quantiles for the median and 95% credibility interval of those differences, like so
draws %>%
mutate(diffPred = b_binary1_pred - b_binary2_pred) %>%
select(diffPred, b_binary2_pred, b_binary2_pred) -> diffDF
#
diffDF %>%
reframe(est = quantile(diffPred,
probs = c(0.5, 0.025, 0.975)),
iqr = c("median", "lowCI", "hiCI")) %>%
pivot_wider(names_from = iqr,
values_from = est)
# output
# A tibble: 1 Ă— 3
# median lowCI hiCI
# <dbl> <dbl> <dbl>
# 1 0.00614 -0.120 0.136
# graph the histogram (with skyblue fill as homage to Kruschke)
diffDF %>%
ggplot(mapping = aes(x = diffPred)) +
geom_histogram(fill = "skyblue")
Looks like no real difference.
But what I want to ask the forum is is it kosher STATISTICALLY to estimate the difference in associations of a single predictor between two outcomes of a similar form?
I’ve done interactions many times, but that’s the associative relationship between two (or more) predictors on a single outcome. I am very new to multivariate models.
