Dirichlet Regression Priors and Convergence Issues


I’m trying to implement a few Dirichlet regressions with up to 7 shares as the response variable, and 1 continuous predictor variable. However, the models take a long time (45 minutes for low numbers of iterations and chains, up to infeasibly long) and do not converge.

I’m thinking of setting very specific priors based on the same models fitted with a frequentist approach, but I notice that brms sets the first set of coefficients (mu1) to 0 as a reference. This means that I cannot set the same priors as the frequentist coefficients, correct? Any way around this?

And are there any other ways to improve convergence? I can send the data by email.

  • Operating System: Windows 10
  • brms Version: 2.9.6

I am a little bit confused. How do you set priors on frequentist coefficients?

brms uses the canonical parameterization for Dirichlet models (i.e. a softmax / multivariate logit link), which fixes the first set of coefficients to zero for identification. Of course, this decision is not without implications, but for a start you can simply set some normal(0, 5) priors on all the regression coefficients and see if that fixes your problem.
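To make the identification point concrete, here is a minimal sketch in plain Python (not brms code). The function names `softmax` and `to_reference` and all numbers are made up for illustration. It shows that subtracting the first category's coefficients from every category leaves the mean shares unchanged, which is why the first set can be fixed to zero.

```python
import math

def softmax(etas):
    """Map linear predictors to shares that sum to 1."""
    m = max(etas)  # subtract the max for numerical stability
    exps = [math.exp(e - m) for e in etas]
    total = sum(exps)
    return [v / total for v in exps]

def to_reference(coefs):
    """Re-express coefficients relative to the first category (hypothetical helper)."""
    ref = coefs[0]
    return [c - ref for c in coefs]

# Made-up intercepts and slopes for 3 categories, one per category.
intercepts = [0.4, 1.1, -0.3]
slopes = [0.2, -0.5, 0.7]
x = 1.5  # a single value of the predictor

eta_full = [a + b * x for a, b in zip(intercepts, slopes)]
eta_ref = [a + b * x
           for a, b in zip(to_reference(intercepts), to_reference(slopes))]

mu_full = softmax(eta_full)
mu_ref = softmax(eta_ref)  # same shares, first linear predictor is now 0
```

This conversion only applies if the frequentist fit also maps the linear predictors to mean shares through a softmax. A model that puts a log link directly on the concentration parameters is a different parameterization, and its coefficients do not convert this way.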

Hi Paul,

I have run the Dirichlet regressions in a non-Bayesian framework, and as an initial test I wanted to set the priors for the coefficients to the estimates found in those regressions, just to see whether convergence is achieved and what the runtime would be. Under normal priors it either does not converge or takes far too long (and even then the results do not look great).

Ah, I see. Does the frequentist version use the same parameterization as brms? Sorry if you somehow answered that already in your first post. At least it is not clear to me yet.

As I understand it, the frequentist version gives estimates for the coefficients of all categories, while brms sets the first to 0, which makes my test impossible unless there is a way to change the parameterization in brms?

I need to see the parameterization of the frequentist version you are using before I can say anything specific. But I doubt that it just estimates the coefficients of all categories. If anything, there is another restriction happening to identify the model, for instance fixing the shape parameter.

Ok, I realized I can re-parametrize the frequentist model more easily, I’ll try that and try again.

But I guess a more general question is, is the Dirichlet a particularly unstable distribution just by nature, making convergence difficult? (I read somewhere that the beta distribution is already relatively unstable in Stan)…

I am actually surprised about the convergence problems in your apparently simple model. Two things you could check: (1) scale your predictor(s) to roughly a unit scale (i.e., standardize or similar), and (2) check if some of the response probabilities are very close to 0 or 1. This could be a source of instability, as the gradients do not exist at 0 or 1 and may be hard to compute for values very close to them.
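The two checks can be sketched in a few lines of plain Python. This is an illustration only: the helper names, the toy data, and the `eps = 1e-3` threshold are assumptions, not values from this thread.

```python
import statistics

def standardize(xs):
    """Center to mean 0 and scale to sample standard deviation 1."""
    mean = statistics.fmean(xs)
    sd = statistics.stdev(xs)
    return [(x - mean) / sd for x in xs]

def near_boundary(rows, eps=1e-3):
    """Return (row, column, value) for every share within eps of 0 or 1."""
    return [(i, j, v)
            for i, row in enumerate(rows)
            for j, v in enumerate(row)
            if v < eps or v > 1 - eps]

# Toy data: one predictor on a large raw scale, three shares per row.
x = [120.0, 340.0, 560.0, 910.0]
shares = [
    [0.70, 0.2999, 0.0001],
    [0.50, 0.30, 0.20],
    [0.25, 0.25, 0.50],
    [0.9995, 0.0004, 0.0001],
]

x_std = standardize(x)           # check (1): predictor on a unit scale
flagged = near_boundary(shares)  # check (2): shares close to 0 or 1
```

If `flagged` is long relative to the data, the near-boundary shares are a plausible suspect for the sampling problems.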

Hmm, the latter could be it: there are quite a few rounded zeros that got filled in with very small values…
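One alternative to filling zeros with tiny values is to compress every share slightly toward the center, a transformation often used for beta and Dirichlet regression. It is not something proposed in this thread. The sketch below uses a made-up helper name `squeeze` and toy data, and computes y* = (y (n - 1) + 1/C) / n for n rows and C categories.

```python
def squeeze(rows):
    """Compress shares away from 0 and 1; each row still sums to 1."""
    n = len(rows)
    out = []
    for row in rows:
        c = len(row)  # number of categories in this row
        out.append([(v * (n - 1) + 1.0 / c) / n for v in row])
    return out

# Toy data with exact (rounded) zeros.
raw = [
    [0.70, 0.30, 0.00],
    [0.50, 0.30, 0.20],
    [0.25, 0.25, 0.50],
    [1.00, 0.00, 0.00],
]

squeezed = squeeze(raw)
smallest = min(v for row in squeezed for v in row)  # equals 1 / (C * n)
```

With only four rows the shift is large; it shrinks as the number of observations grows, so whether it is acceptable depends on the data.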