Hello! I’ve been using brms to model the effect of different predictors on the abundance of different microbial taxa. Previously, I used brm_multiple to fit the counts of each taxon (e.g. Proteobacteria, Firmicutes) to the same formula (e.g. totalabundance ~ Age + (1 | random_eff)). This approach treats the abundance of Proteobacteria, for example, as independent of the abundance of all other taxa in the community. Since this is unrealistic for an actual microbial community, I’m wondering whether this is a case where a multivariate model would be appropriate. My understanding is that in a multivariate model, the response variables are allowed to correlate with each other. Is this correct? Also, do these models have fitting issues when there are many response variables? For some taxonomic classifications, I’m interested in modelling the counts of 60+ different taxa.
An example of how I’m structuring the models:
### set up the model formula with many response columns (names stored in tax_names) in mvbind
bform1.a <- eval(parse(text = paste0(
  "bf(mvbind(",
  paste(tax_names, collapse = ", "),
  ") ~ Age + (1 | Clutch))"
)))
brm.da.phy1.a <-
  brm(
    bform1.a,
    data = ps.wide.wmeta,
    family = negbinomial(),
    chains = 4,
    cores = 4,
    iter = 4000,
    backend = "cmdstanr",
    control = list(adapt_delta = 0.9),
    file = "output/brms_models_pre_loo_mvbind/brm.da.phy1.a",
    file_refit = "on_change",
    refresh = 0
  )
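For reference, here is a minimal sketch of a variant I’m considering. It is not something I have fitted. From what I can tell from the brms multivariate vignette, residual correlations are only estimated for Gaussian and Student-t families, so with negbinomial() the correlation between taxa would have to come from the group-level terms. The `| p |` ID syntax is meant to let the Clutch intercepts correlate across responses. The taxon names below are placeholders, and `p` is an arbitrary label.

```r
# Placeholder taxon names (assumption); the real tax_names comes from the data.
tax_names <- c("Proteobacteria", "Firmicutes", "Bacteroidota")

# Build the formula as a string and convert it with as.formula(),
# which avoids eval(parse()). The `| p |` term requests correlated
# Clutch-level intercepts across all responses.
form_str <- paste0(
  "mvbind(", paste(tax_names, collapse = ", "), ") ~ Age + (1 | p | Clutch)"
)
form <- as.formula(form_str)

# Wrap in a brms formula only if brms is installed.
# set_rescor(FALSE) turns off residual correlations explicitly.
if (requireNamespace("brms", quietly = TRUE)) {
  bform <- brms::bf(form) + brms::set_rescor(FALSE)
}
```

With 60+ taxa this adds a 60 x 60 correlation matrix for the Clutch intercepts, so I would expect it to be noticeably slower to fit than the version without the ID term.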
Thanks!