Multivariate analysis

Hi everybody,

I am new to brms, and bayesian statistics. I would like to dig a bit deeper though, and I tried applying it to my own data.

I am running a multivariate model. I have 3 metrics, which I follow over Time (factor), and which I want to compare between 2 Groups (factor).
The metrics are heavily correlated, and the distributions are definitely not normal (rather skewed). It is not an option to apply a transformation on the data.

I found in the online material how to run a multivariate model in brms (here: Estimating Multivariate Models with brms). However, this approach doesn’t support non normal underlying distributions.

Would it be an option to construct the formula like this?

Form1 ← bf(Response1 ~ 1 + Group * Time + (1|Participant)) + skew_normal()
Form2 ← bf(Response2 ~ 1 + Group * Time + (1|Participant)) + skew_normal()
Form3 ← bf(Response3 ~ 1 + Group * Time + (1|Participant)) + skew_normal()

ModelFormula ← Form1 + Form2 + Form3

But how do I account for the fact that the data are correlated? And how can I correct for multiple comparisons?

Thanks for your reply!
A

This will fit a multivariate model, but it won’t include residual correlations between the responses (rescor = FALSE) because brms doesn’t have a multivariate skew-normal.

Depending on your data and goals, you might be able to exploit correlations in the effects of Participant by specifying (1 | p | Participant). The letter p doesn’t mean anything special, it’s just used as a grouping marker that let’s brms know which group-level effects should be correlated. I think the argument here might be that participants have consistent, correlated responses to 1-3 that mean that they differ in the set of intercepts their specific time series will have.

I’m not sure where the multiple comparisons issue would arise with your problem. Andrew Gelman has a blog post that summarizes thinking on multiple comparisons in Bayesian analyses.

1 Like

As a half-measure, you can allow correlations in your upper-level deviation terms with syntax like this:

Form1 <- bf(Response1 ~ 1 + Group * Time + (1 |p| Participant)) + skew_normal()
Form2 <- bf(Response2 ~ 1 + Group * Time + (1 |p| Participant)) + skew_normal()
Form3 <- bf(Response3 ~ 1 + Group * Time + (1 |p| Participant)) + skew_normal()

Notice the double || with some key in between. p does not need to be in your data, and you can use other characters like X or whatever. The trick is that by placing that p in all 3 of your formulas, you will allow the random intercepts for Response1, Response2, and Response3 to correlate. You can learn more about this in the brmsformula section of the brms reference manual (link).