Phylogenetic control—how to deal with multiple trees (no consensus tree)

I’m about to do a regression where I’ll need to control for phylogeny. The regression is across bird species, and the tree that everyone uses for phylogenetic control is the one from birdtree.org. There, you don’t download one tree, you download many; they don’t provide a consensus tree. If you want to make one, that’s your prerogative, but they recommend running your analysis over the multiple trees they provide.

If I’m trying to do thousands of iterations, and I have to do those thousands of iterations thousands of times to deal with the thousands of phylogenetic trees, I think I’ll cry—I’m sure there are better ways of doing this.

How do people usually deal with this? Is there a way to use 1 tree for each iteration in each chain?

Thank you!

P.S. I don’t want to get bogged down building, fixing, and fighting an OU model. How bad would it be if I just used a static Brownian motion covariance matrix? If I really should use an OU process, I’ll do it! But if it’s honestly justifiable not to, I might just not… (Too much pain working with them in the past.)

Honestly I’d just use a consensus tree. Pulido-Santacruz & Weir updated the Jetz et al. phylogeny by appending the Derryberry et al. phylogeny of the Furnariidae, and they distribute both posterior samples and a maximum clade credibility tree via their supplemental info.

Maybe when it was time to run analyses one final time I’d run them over 20 or 50 of the posterior sample trees and make sure they remain consistent.
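To make that final check concrete, here is a minimal sketch in Python of what I have in mind. Everything named here is hypothetical: `sensitivity_over_trees` is just an illustrative helper, and `fit_slope` stands in for whatever function refits your regression with the covariance matrix from one tree and returns the coefficient you care about. The demo at the bottom uses made-up numbers, not real trees or a real model.

```python
import random
import statistics
from typing import Callable, Sequence, TypeVar

Tree = TypeVar("Tree")


def sensitivity_over_trees(
    trees: Sequence[Tree],
    fit_slope: Callable[[Tree], float],  # hypothetical: refit the model on one tree
    n_trees: int = 50,
    seed: int = 1,
) -> dict:
    """Refit on a random subset of posterior trees and summarise the estimates."""
    rng = random.Random(seed)  # fixed seed so the same subset is drawn each run
    subset = rng.sample(list(trees), k=min(n_trees, len(trees)))
    slopes = [fit_slope(tree) for tree in subset]
    return {
        "n_trees": len(slopes),
        "mean": statistics.mean(slopes),
        "sd": statistics.stdev(slopes) if len(slopes) > 1 else 0.0,
        "min": min(slopes),
        "max": max(slopes),
    }


if __name__ == "__main__":
    # Stand-in data only: each "tree" is an integer id and the "fit" returns
    # a made-up slope. Swap in real posterior trees and a real model fit.
    fake_trees = list(range(1000))
    fake_fit = lambda tree: 0.5 + 0.001 * (tree % 7)
    print(sensitivity_over_trees(fake_trees, fake_fit, n_trees=50))
```

If the spread across trees (`sd`, `min`, `max`) is small relative to the uncertainty you get from the single-tree fit, that suggests the conclusions don't hinge on which tree you picked.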

https://onlinelibrary.wiley.com/doi/abs/10.1111/evo.12899


Hi again Jacob! I’m honoured by two answers in the same day!

That makes sense, thank you. I’ll use the Pulido-Santacruz & Weir tree, and do as you suggested when my analyses are done.
