Dirichlet regresion using brms

Please also provide the following information in addition to your question:

  • Operating System: linux
  • brms Version: 2.8.0

Dear brms users

I understand that brms is now able to do dirichlet regression, but I fail to find examples of how to do it in the package documentation. I would like to use brms instead of the DirichletReg package that I’m using now. It appears to no longer be maintained.

I would be greatful, if somebody could provide an example of how to do this in brms and the parameterization used.

Thank you for your help

Jannik

The handling is the same as for family categorical except that now your response variable should be a matrix. For instance

library(dplyr)
bind <- function(...) cbind(...)

N <- 20
df <- data.frame(
  y1 = rbinom(N, 10, 0.5), y2 = rbinom(N, 10, 0.7), 
  y3 = rbinom(N, 10, 0.9), x = rnorm(N)
) %>%
  mutate(
    size = y1 + y2 + y3,
    y1 = y1 / size,
    y2 = y2 / size,
    y3 = y3 / size
  )
df$y <- with(df, cbind(y1, y2, y3))

make_stancode(bind(y1, y2, y3) ~ x, df, Dirichlet())
make_standata(bind(y1, y2, y3) ~ x, df, Dirichlet())

fit <- brm(bind(y1, y2, y3) ~ x, df, Dirichlet())
summary(fit)

I avoid using cbind because it is currently reseverd in brms for the purpose of multivariate models.

1 Like

Thanks Paul.

I just tried it on one of my datasets, and it works like a dream :-)

In DirichletReg it is possible to specify different models for each component in the proportion, is something similar possible in brms? And If I wanted a more robust regression, would it make sense to use a horseshoe prior or something similar?

In the documentation phi appears to have a log link, but in the model summary it states that the identity link was used, this even if I set dirichlet(…, link_phi = “log”), is this an error?

Thank you

/Jannik

See vignette(brms_distreg) for examples of the distributional regression framework in brms. In dirichlet models, the parameters are called mu<category name>. You will also see them appearing as prefixes in the summary output.

phi is not predicted in your model and as such no link function needs to be applied.

Thanks paul for taking your time with this.

I’m still somewhat confused.

  1. Paul, when you write, phi is not “estimated in your model” what do you mean?

I’m using the following call to brms to do my model, where proportion is a matrix:

frml <- bf(proportions ~ .)

fit <- brm(frml,
data = R80_data,
family = dirichlet(link = ‘logit’, link_phi = ‘log’),
future = TRUE)

After fitting the summary of fit provides the following output (snippets):

Family: dirichlet
Links: muR11911 = logit; muR30725 = logit; muR30993 = logit; muR31035 = logit; muR31038 = logit; muR31047 = logit; muR31054 = logit; muR31061 = logit; muR31068 = logit; muR31071 = logit; muR31080 = logit; muR31088 = logit; muR31093 = logit; muR31468 = logit; phi = identity
Formula: proportions ~ C11471 + C11911 + C30725 + C30993 + C31035 + C31038 + C31047 + C31054 + C31061 + C31068 + C31071 + C31080 + C31088 + C31093 + C31468
Data: R80_data (Number of observations: 89)
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup samples = 4000

The header shows that phi has the identity link (eventhough I specified a log link).

In the summary, apart from estimates of mu’s, I also get an estimate for phi:

Family Specific Parameters:
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
phi 216.88 11.10 195.20 238.98 2932 1.00

Which is why I do not understand your statement tha phi is not ‘predicted’?

  1. I understand that it is possible to specify “predictor terms for all parameters of the assumed response distribution”. In the case of dirichlet, this would be mu and phi, correct?
    However, does this also mean that for a three component proportion, with components A, B, and C, with predictors x, y, z I would be able to model muA ~ x, muB ~ x + y, muC ~ x + y + z, which is what I’m seeking to do (this is possible in the DirichletReg package) ?

  2. If it is not possible to model each mu seperately would it be possible to use some sort of regularization, such as the horseshoe prior? Do I need to take special care when specifying the prior when the mu link is logit?

Again, thank you for an amazing package and all your help.

/Jannik

With “phi is not predicted” I mean that you didn’t specify a linear predictor for phi so there is no need to apply any link function so brms just says “identity”. That’s it. It works the same for auxiliary parameters of all other families.

2.) Yes, exactly. Except that the first category is chosen as the reference by default for reasons of identification.

Now it makes sense to me. No prediction = no link.

Thanks!

Hi Paul
Is there a simple extension of this model to a hierarchical model. If y1, y2, and y3 were observed across different settings (for example, in a meta-analysis), would the following extension be appropriate?
fit ← brm(bind(y1, y2, y3) ~ ~ 1 + x + (1 | Setting) , df, Dirichlet())
summary(fit)
Regards
Ross