I think this is a conceptual issue on my end, but it also could be an issue related to fit()
.
I am working on a large, individual participant data meta-analysis and want to estimate constrained longitudinal analyses with random slopes by trial.
Some outcomes are continuous, so the default posterior summaries of coefficients estimated from brms
are great. Other outcomes are count, so I wanted to calculate average marginal effects (AMEs) for change from baseline to post intervention in control and treatment conditions, and their difference.
For simplicity, suppose the goal was the AME for the change from baseline to post in the control group. My naive approach was to take the model frame, and then generate predictions on the response scale under some counterfactual conditions:
- set all observations as in the control group at baseline
- set all observations as in the control group at post
- calculate the difference between those two sets of predictions, average across participants, and then summarize
Here is a synthetic example using the small mtcars
dataset built into R
.
dummy <- mtcars
dummy$Time <- mtcars$am
dummy$Tx <- mtcars$vs
m1 <- brm(mpg ~ Time + Time:Tx + (1 | cyl), data = dummy, cores = 4)
mf <- model.frame(m1)
mf <- mf
mf$Tx <- 0
mf$Time <- 0
yhat00 <- fitted(m1, newdata = mf, scale = "response",
re_formula = NULL, summary = FALSE)
mf$Time <- 1
yhat01 <- fitted(m1, newdata = mf, scale = "response",
re_formula = NULL, summary = FALSE)
> bayestestR::describe_posterior(m1)
Summary of Posterior Distribution
Parameter | Median | 95% CI | pd | ROPE | % in ROPE | Rhat | ESS
--------------------------------------------------------------------------------------------
(Intercept) | 18.70 | [13.63, 23.93] | 100% | [-0.60, 0.60] | 0% | 1.000 | 1278.00
Time | 1.37 | [-1.48, 4.27] | 83.15% | [-0.60, 0.60] | 21.10% | 1.001 | 2538.00
Time:Tx | 4.24 | [ 0.01, 8.76] | 97.28% | [-0.60, 0.60] | 2.32% | |
> bayestestR::describe_posterior(rowMeans(yhat01 - yhat00))
Summary of Posterior Distribution
Parameter | Median | 95% CI | pd | ROPE | % in ROPE
-----------------------------------------------------------------------
Posterior | 1.37 | [-1.48, 4.27] | 83.15% | [-0.10, 0.10] | 3.47%
With a random intercept only, the approach appears fine. In the linear case, the estimate from my differences is 1.37, which matches the Time population parameter of 1.37.
When there is a random slope, however, these no longer align.
m2 <- brm(mpg ~ Time + Time:Tx + (1 + Time + Time:Tx | cyl), data = dummy, cores = 4)
mf <- model.frame(m2)
mf <- mf
mf$Tx <- 0
mf$Time <- 0
yhat00 <- fitted(m2, newdata = mf, scale = "response",
re_formula = NULL, summary = FALSE)
mf$Time <- 1
yhat01 <- fitted(m2, newdata = mf, scale = "response",
re_formula = NULL, summary = FALSE)
> bayestestR::describe_posterior(m2)
Summary of Posterior Distribution
Parameter | Median | 95% CI | pd | ROPE | % in ROPE | Rhat | ESS
---------------------------------------------------------------------------------------------
(Intercept) | 18.61 | [ 13.57, 24.39] | 99.98% | [-0.60, 0.60] | 0% | 1.000 | 1240.00
Time | 1.53 | [ -3.00, 6.12] | 78.05% | [-0.60, 0.60] | 19.44% | 1.001 | 1031.00
Time:Tx | 3.45 | [-15.57, 15.43] | 75.45% | [-0.60, 0.60] | 6.50% | |
> bayestestR::describe_posterior(rowMeans(yhat01 - yhat00))
Summary of Posterior Distribution
Parameter | Median | 95% CI | pd | ROPE | % in ROPE
-----------------------------------------------------------------------
Posterior | 1.46 | [-1.50, 4.79] | 82.85% | [-0.10, 0.10] | 3.50%
1.46 is not 1.53. What concerns me more than these minor variations in the central tendency, is that my approach of extract fitted values, average across people, and then summarize the posterior seems to generate consistently too narrow CIs when there is a random slope involved.
Obviously with a linear model and identity link, what I am doing is pointless. But for my other outcomes using non linear link functions, the AMEs will differ from the parameter estimates.
I think I have a conceptual issue in how I define / calculate the AMEs, but searching the general internet and google scholar for methods papers discussing this and how to solve it have not yielded anything productive.
Iād very much appreciate any suggestions for readings, or search terms/key words, or alternate code approaches.
- Operating System: Win 10 pro
- brms Version: 2.16.1