Dirichlet regression

stemangiola · December 13, 2017, 7:49am

Hello Community,

I am trying to find a way to implement a Dirichlet regression in stan.

My problem is a multiple version of a beta regression (that didn’t work when the information content of the data is low, being a 2 levels hierarchical model, Regression of proportions)

I went through some material

but I could not implement anything concrete yet

Could someone give a dummy possible model as example, or point to some more clear direction?

Thanks a lot.

stemangiola · December 13, 2017, 9:05pm

A resource I am finding helpful: http://epub.wu.ac.at/4077/1/Report125.pdf

I found this discussion of Bob some time ago that also tries to deal with a hierarchical dirichlet regression.

https://groups.google.com/forum/#!topic/stan-users/3Gh1qvfoogY

In particular deals with the intercept/angular coefficients as such:

 parameters {
     vector[C] coef;
     real<lower=0> prior_count;
   }
   model {
     vector<lower=0>[N] alphavec;
     alphavec <- prior_count * softmax(xm * coef);
     prior_count ~ exp(1);
     ...
   }

If I understand well:

shouldn’t coef be a matrix (number_of_covariates / number_of_components_of_the_dirichlet)?
If we softmax, then coef would not need to be a constrained variable to avoid indeterminability?

For example if I have two time points where nothing has changed

T0: Dirichlet(softmax([2, 2, 2, 2, 2) * 40)
T1: Dirichlet(softmax([2 + 1 * coef[1], 2+ 1 * coef[2], 2+ 1 * coef[3], 2+ 1 * coef[4], 2+ 1 * coef[5]) * 40)

Then coef could be any (single) number and will be unconstrained. Is this correct? Would a 0 centered prior be enough?

Shouldn’t a logit link function be used on xm * coef , similarly to the beta regression?

Thanks a lot!

stemangiola · December 14, 2017, 2:52am

I might be able eventually to answer my own question, hopefully.

But I leave here some working notes for discussion.

I created a generative process to simulate my probabilities from a dirichlet regression. To solve the indeterminability of angular coefficients I constrained them to be a simplex multiplied by a positive real number

# DIRICHLET

n_samples = 500

## from http://tr.im/hH5A
logsumexp <- function (x) {
	y = max(x)
	y + log(sum(exp(x - y)))
}
softmax <- function (x) {
	exp(x - logsumexp(x))
}
set.seed(123)

phi <- 40;
x <- runif(n_samples);
props <- matrix(rep(NA, n_samples*2), ncol=2);

X = model.matrix(~cov , data = data.frame(cov=x))

coef_ang = c(0.95, 0.05)
kappa = 5
beta = matrix(c(1, 5, coef_ang*kappa), ncol=2, byrow=T)
	
y_hat = X %*% beta
for (n in 1:n_samples)  y_hat[n,] =  softmax( y_hat[n,])
for (n in 1:n_samples) props[n,] = rdirichlet(1,y_hat[n,] * phi)

plot(X[,2],props[,1], ylim=c(0,1))
points(X[,2],props[,2], col='red')

I will try to compile a stan model soon

andre.pfeuffer · December 15, 2017, 4:05am

You named the tool already, Dirichlet regression. I’m with Bob: I’ve never seen a Dirichlet regression like this, so take the following with a grain of salt. "
https://groups.google.com/forum/#!topic/stan-users/3Gh1qvfoogY

Did you thought about a Wishart distribution as multivariate extension of a gamma distribution?
Or if you don’t want to model correlations between the Ys go with gamma distributions and predefine
the variances the each gamma, similar the alphas prior in the Dirichlet distribution.
Hint: The Dirichlet distribution can be defined as ratio of gamma distributions. Since you don’t need
the normalization of the Dirichlet and later scale it, the normalization becomes obsolete, I feel not so
ok with it.

stemangiola · December 17, 2017, 10:27pm

Thanks Andre,

yes I agree that multiplication by factors before a softmax sounds a bit strange, in fact I am trying to eliminate them when I can. I will give update on the model soon.

About the Gamma alternative, I have to research a bit, as it is not clear to me the conversion between Dirichlet framework and Gamma.

andre.pfeuffer · December 18, 2017, 1:45am

In section Random number generation its noted how to simulate a Dirichlet distribution
from gamma distributions.
https://en.wikipedia.org/wiki/Dirichlet_distribution#Gamma_distribution

A good practical paper about multinominal and Dirichlet using JAGS can be
found here:
http://www3.jouy.inra.fr/miaj/public/nosdoc/rap2012-5.pdf

In your case you might omit the normalization, thus independent gammas
with gamma(mu*sigma, sigma) could be used to fit your data. If you struggle
with the gamma, read the mean and variance part in Wikipedia. The mean and
variance just needs this transformation and confusingly the gamma has two
forms.

stemangiola · December 19, 2017, 7:27am

Thanks I will approach it

stemangiola · December 19, 2017, 9:45am

In the meanwhile, I have two questions.

Why the gamma representation should be better than the “beta regression”

How come something quite common such as softmax/dirichlet regression has never been encountered/established in stan-like languages?

andre.pfeuffer · December 19, 2017, 11:52am

beta “limits” you to [0,1], its based upon gamma distribution.
See for understanding the beta distribution:

http://varianceexplained.org/r/beta_binomial_baseball/

How come something quite common such as softmax/dirichlet regression has never been encountered/established in stan-like languages?

softmax is implemented in Stan and used by some models in Stan. I have no time to do a
research, but I used it myself. Dirichlet regression I have never seen.

BTW, there are interpretations of exponential, gamma, weibull, lognormal distributions and connections
eg. the exponential and the poisson. Normally we define a problem and then we think about, but you came
with the tool first and then ask how you can use it, like how I can nail that screw with my hammer.
Sorry, if I sound judging. Just my 2 cents.

https://cran.revolutionanalytics.com/web/packages/betareg/vignettes/betareg.pdf

stemangiola · December 22, 2017, 6:34am

Hello,

please have a look to an attempt to approach the problem in form of a R package testable for feedback

increasechief · August 21, 2018, 2:50pm

You may find this paper helpful [ Bayesian Regression for a Dirichlet Distributed Response using Stan]:

https://arxiv.org/abs/1808.06399

Bob_Carpenter · August 22, 2018, 9:07am

Thanks for sharing. I really appreciate papers that lay out the model they’re using cleanly and then include code.

I don’t think rstanarm supports Dirichlet regression yet, though I know they did a bunch of work trying to get beta regression to work, which must have similar issues (but probably not the same identifiability issue, as there are only two categories).

Topic		Replies	Views
Regression with parameter constraints Modeling	3	1603	November 2, 2017
Dirichlet Regression using either the Common or Alternative Parameterization Modeling rstan , specification , brms	11	2142	April 30, 2021
Implementing Logit-Normal priors on Dirichlet-Multinomial model Modeling	1	762	January 3, 2022
Help expanding a Dirichlet-Categorical model Modeling rstan , specification , ecology , hierarchical-model , dirichlet-multinomial	0	395	September 10, 2023
Dirichlet distribution example not working Modeling	3	684	June 28, 2022

Dirichlet regression

Related topics