I am running a phylogenetic mixed effect model in brms. Currently, I have been accounting for phylogenetic non-independence by supplying brms a covariance matrix created from a consensus tree. However, I have recently come across the use of a prior distribution of trees to further reduce phylogenetic uncertainty (de Villemereuil 2012).
Does anyone know of a way to implement this in brms?
Here is a discourse conversation on that exact topic. I put some code at the bottom of that post explaining how I ran models over a distribution of trees. Note that some of the model structure has recently changed (no
cov_ranef anymore). Basically I made a list of trees and then looped the model over them, then used
combine_models() to merge them. However, brms now has the
brm_multiple() command which I personally have yet to use but I believe is created for this exact purpose. Here is the GitHub issue explaining it.
Running models on subset of the posterior distribution of phylogenetic trees is common in my field, but there may be some theoretical issues with this practice. The first discourse link I posted has some really interesting conversation in it that is worth checking out.
Here is another cool paper on this topic that takes a different view, and the authors state that as few as 50 trees from the posterior capture the uncertainty in the model from the uncertainty in the phylogenetic estimation.
@jonnations has covered the practical aspect of this more or less completely, but I just thought I’d bring up one more thing. Using the entire posterior of trees (or an appropriately sized subset of it), won’t reduce phylogenetic uncertainty, if anything the uncertainty will increase. What it’s doing is appropriately carrying uncertainty in the tree structure forward into the parameters of the of the model that are taking the tree as an input, so you get a more accurate measure of the uncertainty in the model parameters.
Thanks for clearing up the terminology, that is nonetheless what I am looking to do
That’s a really helpful reply thanks a lot. The Nakagawa paper is really insightful. I’m currently struggling to get brms_multiple to accept a list of lists of covariance matrices. I guess as you haven’t run the brms_multiple you may not be able to help this, but I’ll post it below just in case you have any more insights.
I am attempting to implement use brm_multiple to sample from the distribution of the estimated variance-covariance matrices, perform the models and join the posterior distribution. However, I am struggling to translate my original brm model into a brm_multiple to do this.
Previously my code (as following vignette(‘brms_phylogenetics’)):
mod1 <- brm(
phen ~ cofactor + (1 | random) + (1 | gr(species_name, cov = A)) ,
data = data,
family = bernoulli(),
data2 = list(A = A),
where data contains phen, cofactor, random and species_name, and A is a variance-covariance matrix from a phylo tree stored in data2.
To run multiple imputations of trees now I have:
phen ~ cofactor + (1 | random) + (1 | gr(species_name, cov = my_tree_list)) ,
data = dataset_list,
family = bernoulli(),
data2 = my_tree_list,
Where my_tree_list is a list of named lists of variance-covariance matrices created from different trees i.e. a list of list(A=A) from mod1 and dataset_lists is a list of the same dataset, length same as data2.
However, this gives an Error: my_tree_list cannot be found in data2
I have also attempted data2 = list(A = list_of_vcvs) where list_of_vcvs is a list of the variance-covariance matrices rather than pre-converted into lists as above. This gives: error “data2” must be a list of named lists.
However, this provides an error that my_tree_list cannot be found in data2 does anyone have any pointers as to what alterations I need to make to data2 or the gr(cov = A) part.
In this latest brms versions (at least 2.14.0++), data2 in brm_multiple will accept (and expect) a list of named lists. That is
list(list(A = A1), list(A = A2))
Hi @paul.buerkner this was really helpful thanks a lot I am now trying to work out how to incorporate a prior for the covariance matrix. Though I am a little stuck. I have read in the literature and seen the recommendations for an LKJ prior on correlation matrices though I do not fully understand how to apply it to a covariance matrix.
I plan to perform a Cholesky decomposition on “A” : A_cor <- chol(A) and then use this correlation matrix in the model and provide a prior for that. Do you have any advice on how to do this? Or, is there a simpler way of providing a prior directly to the covariance matrix i.e. simply providing a prior to the group-level effect “species_name”