Phylogenetic Dirichlet regression

Hi everyone,

I was wondering whether there are any examples showing how to implement a phylogenetic Dirichlet regression.
I’m interested in modelling a compositional response (i.e., proportional data, in which more than two response variables are expressed as percentages or fractions of a whole) taking into account phylogeny.

I’m aware of the excellent phylogenetic vignette, but I haven’t been able to find any Dirichlet regression examples using phylogenetically structured data.

For instance, using the first example from the phylogenetic vignette, a simple phylogenetic model can be built like this:

library(brms)

# data_simple and the covariance matrix A are set up as in the vignette
model_simple <- brm(
  phen ~ cofactor + (1|gr(phylo, cov = A)),
  data = data_simple,
  family = gaussian(),
  data2 = list(A = A),
  prior = c(
    prior(normal(0, 10), "b"),
    prior(normal(0, 50), "Intercept"),
    prior(student_t(3, 0, 20), "sd"),
    prior(student_t(3, 0, 20), "sigma")
  )
)

So, if the response variable were compositional, perhaps it could be implemented with something like this (though I don’t know whether I’m missing something):

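# simulate compositional data for this example
library(dplyr)  # provides %>% and mutate()

N <- nrow(data_simple)
data <- data.frame(
  y1 = rbinom(N, 10, 0.5), y2 = rbinom(N, 10, 0.7),
  y3 = rbinom(N, 10, 0.9), x = rnorm(N)
) %>%
  mutate(
    size = y1 + y2 + y3,
    y1 = y1 / size,
    y2 = y2 / size,
    y3 = y3 / size
  )
# caution: a count of 0 gives a proportion of exactly 0,
# which lies on the boundary of the Dirichlet support
# add the response matrix (one column per component) to the data frame
data_simple$y <- with(data, cbind(y1, y2, y3))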

# run Dirichlet model
dirichlet_model <- brm(
  y ~ cofactor + (1|gr(phylo, cov = A)),
  data = data_simple,
  family = dirichlet(),
  data2 = list(A = A)
)
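
One caveat with simulating the response from binomial counts: a count of zero gives a proportion of exactly 0, whereas the Dirichlet distribution is defined for components strictly between 0 and 1. A minimal base-R sketch that instead draws compositions directly from a Dirichlet by normalising gamma draws is given here. The helper name `rdirichlet_sketch`, the sample size, and the `alpha` values are all assumptions made for illustration.

```r
# Hypothetical helper: draw n compositions from Dirichlet(alpha)
# by normalising independent gamma draws (base R only).
rdirichlet_sketch <- function(n, alpha) {
  k <- length(alpha)
  # column j holds n gamma draws with shape alpha[j]
  g <- matrix(rgamma(n * k, shape = rep(alpha, each = n)), nrow = n, ncol = k)
  g / rowSums(g)  # divide each row by its total
}

set.seed(1)
y <- rdirichlet_sketch(200, alpha = c(2, 3, 5))  # alpha chosen arbitrarily
colnames(y) <- c("y1", "y2", "y3")

# every row sums to 1 and no component is exactly 0 or 1
stopifnot(all(abs(rowSums(y) - 1) < 1e-8), all(y > 0 & y < 1))
```

If the number of rows matches the data frame, the matrix can be attached in the same way as before, e.g. `data_simple$y <- y`.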

Furthermore, it would also be ideal to see how to implement a Dirichlet phylogenetic model with repeated measurements, as I’m not sure whether the correct structure would be something like this:

# load data
data_repeat <- read.table(
  "https://paul-buerkner.github.io/data/data_repeat.txt",
  header = TRUE
)

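# simulate compositional data for this example
library(dplyr)  # provides %>% and mutate()

N <- nrow(data_repeat)
data <- data.frame(
  y1 = rbinom(N, 10, 0.5), y2 = rbinom(N, 10, 0.7),
  y3 = rbinom(N, 10, 0.9), x = rnorm(N)
) %>%
  mutate(
    size = y1 + y2 + y3,
    y1 = y1 / size,
    y2 = y2 / size,
    y3 = y3 / size
  )
# caution: a count of 0 gives a proportion of exactly 0,
# which lies on the boundary of the Dirichlet support
# add the response matrix (one column per component) to the data frame
data_repeat$y <- with(data, cbind(y1, y2, y3))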

model_repeat_dirichlet <- brm(
  y ~ cofactor + (1|gr(phylo, cov = A)) + (1|species),
  data = data_repeat,
  family = dirichlet(),
  data2 = list(A = A))
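
For what it’s worth, the repeated-measures example in the vignette also splits the cofactor into a between-species part (the species mean) and a within-species part (the deviation from that mean). Whether that split is useful for a Dirichlet response depends on the question, but the calculation itself can be sketched in base R. The `toy` data frame below is a made-up stand-in for data_repeat, and the column names `spec_mean_cf` and `within_spec_cf` follow the vignette.

```r
# Toy stand-in for data_repeat: three species, three measurements each
toy <- data.frame(
  species  = rep(c("sp_1", "sp_2", "sp_3"), each = 3),
  cofactor = c(9.8, 10.1, 10.4, 11.9, 12.2, 12.5, 8.7, 9.0, 9.3)
)

# between-species part: the species mean, repeated on every row
toy$spec_mean_cf <- ave(toy$cofactor, toy$species)
# within-species part: deviation of each measurement from its species mean
toy$within_spec_cf <- toy$cofactor - toy$spec_mean_cf

print(toy)
```

The two new columns would then replace `cofactor` on the right-hand side of the formula, i.e. `spec_mean_cf + within_spec_cf` alongside the two group-level terms.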

I would be super grateful if somebody could provide some proper examples of how to do this in brms, and of the parameterisation used, as I’m new to brms and don’t feel completely confident yet.

Thank you for your help,

Tommy

  • Operating System: macOS Monterey 12.3.1
  • brms Version: 2.16.3

Hi Tommy,

Have you tried your models yet? I can’t really comment on Dirichlet regression in particular, but you should be able to add the phylogenetic correlation (or covariance) matrix to basically any brms model as a group-level effect, using the same notation (the data2 = list(...) part) as in a Gaussian or Poisson regression (both are in that vignette, I think). I have used the phylo matrix in ordinal models, binomial models, etc., and everything works as it should! HERE is a manuscript that uses the brms phylo models in categorical regressions.

The same thing goes for the intra-specific variation part. It’s just an additional group-level effect.
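
On the parameterisation: as far as I understand it, the dirichlet() family in brms models the expected composition through a softmax (multivariate logit) link, with the first response category as the reference, plus a precision parameter phi that uses a log link by default. The terms in the formula, including the phylogenetic and species group-level effects, are then estimated separately for each non-reference category. A minimal sketch of how the linear predictors map to Dirichlet concentrations follows; the values of `eta` and `phi` are made up for illustration and are not estimates from any model.

```r
# Illustration only: all numbers are made up.
softmax <- function(eta) exp(eta) / sum(exp(eta))

# linear predictors on the link scale for one observation;
# the reference category (here y1) is fixed at 0
eta <- c(y1 = 0, y2 = 0.4, y3 = 1.1)
phi <- 20  # precision: larger values mean less spread around mu

mu    <- softmax(eta)  # expected composition, sums to 1
alpha <- mu * phi      # Dirichlet concentration parameters

print(round(mu, 3))
print(round(alpha, 3))
```

So a positive coefficient in the y3 predictor raises the expected share of y3 relative to the reference category y1, and the concentrations always sum to phi.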

One thing to consider: if you are interested in Pagel’s lambda, that may not be an option in non-Gaussian regressions (or regressions that don’t include a sigma parameter). There is some discussion of that here in this post.
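
For context, in the Gaussian case the vignette computes the phylogenetic signal as the share of variance attributable to the phylogenetic group-level effect, sd_phylo^2 / (sd_phylo^2 + sigma^2). A rough sketch of that calculation is below. The two vectors are simulated stand-ins for posterior draws, not output from a fitted model, and their means and spreads are arbitrary assumptions.

```r
set.seed(42)
# stand-ins for posterior draws of the phylogenetic SD and the residual SD
sd_phylo <- abs(rnorm(4000, mean = 12, sd = 1))
sigma    <- abs(rnorm(4000, mean = 9,  sd = 1))

# phylogenetic signal as a variance ratio, computed draw by draw
lambda <- sd_phylo^2 / (sd_phylo^2 + sigma^2)

# posterior summary: median and 95% interval
print(quantile(lambda, probs = c(0.025, 0.5, 0.975)))
```

A Dirichlet model has no sigma, so this ratio has no direct counterpart there, which is the limitation mentioned above.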

Good luck!

[edit: spelling]

Thanks for your reply and advice (as well as for the linked paper); it’s really helpful. I’ll give it a go with my own dataset soon and post the results here (as they may be useful for other community members).

Cheers,

Tommy