I am using a multivariate specification in brms to model two correlated outcomes, (y_1, y_2) jointly. However, the outcome of interest is actually y_1/y_2, which is calculated after model run. I am interested in estimating the relationship between a covariate, c_1 on the ratio y_1/y_2. I have added this covariate into the multivariate model for y_1, and y_2, which results in two coefficients: y_1\_c_1y_2\_c_1. How can I use this information to estimate y_1/y_2\_c_1?
Unfortunately, I cannot share my data here, however, I have built an example which builds off of Paul Bürkner’s vingette on multivariate models. Not sure that it maps to any outcome of interest in reality, but the data provide a vehicle to describe my goal.
## Read in the data
data("BTdata", package = "MCMCglmm")
## Model for y1 (tarsus)
m1 <- bf(tarsus ~ hatchdate + (1 | p | fosternest))
## Model for y2 (back)
m2 <- bf(back ~ hatchdate + (1 | p | fosternest))
## Fit the model
fit <- brm(
m1 + m2 + set_rescor(FALSE),
data = BTdata, chains = 2, cores = 2,
control = list(adapt_delta = 0.95)
)
## Print model summary
summary(fit)
## Add draws
estimates <- add_epred_draws(object = fit,
newdata=BTdata)
## Calculate the ratio of tarsus to back
estimateSummary <- estimates %>%
pivot_wider(names_from = .category,
names_prefix = "mod_",
values_from = .epred) %>%
mutate(ratio = mod_tarsus / mod_back) %>%
ungroup() %>%
mean_qi(ratio, na.rm=T)
### How to estimate the relationship between hatchdate on ratio?
One option would be to (1) create a counterfactual dataset with different values of your treatment variable, (2) predict the outcome for each row under both counterfactual conditions, (3) aggregate draws from the posterior distribution to compute the quantity of interest.
Here’s an example using the marginaleffects package. It may not be exactly what you need, but perhaps it gives you some ideas:
library(dplyr)
library(marginaleffects)
# create counterfactual data
nd = transform(BTdata, id = 1:nrow(BTdata), treatment = "low")
nd = rbind(nd, transform(nd, treatment = "high", hatchdate = hatchdate + 1))
# predict both outcomes
predictions(fit, newdata = nd) |>
# extract draws from posterior distribution
get_draws() |>
# manipulate predictions to compute the difference in average ratios
summarize(ratio = draw[2] / draw[1], .by = c(id, drawid, treatment)) |>
summarize(avg_ratio = mean(ratio, .by = treatment)) |>
summarize(change_in_avg_ratio = mean(avg_ratio))
Thank you so much for this well written answer. But my only concern is the fact that this is using the draws directly from the posterior predictive distribution, not the expectation of the posterior predictive distribution. Since I’m interested in the marginal effect (the average impact across all levels of the random effect, in this case fosternest), wouldn’t I need to use a function that accounts for that? I think that avg_predictions handles marginal effects, so I’d sub that in for predictions and then proceed as you’ve written.
I hadn’t looked into marginaleffects package until this post, so thank you!
When both back and tarsus are linearly related to hatchdate, then except in very special cases it will not be true that the ratio back / tarsus is linearly related to hatchdate. Moreover the influence of hatchdate will depend on the value of the random effect term, and it will depend nonlinearly on the value of the random effect term. To make a statement about how much the back / tarsus ratio varies due to a change in hatchdate, you need to choose a value of hatchdate where you are interested in making the statement, and then you need to choose if you want an average value across all levels of the random effect (weighted or unweighted by their frequency in the data), or something else.