I am trying to find a way to implement a Dirichlet regression in stan.
My problem is a multiple version of a beta regression (that didn’t work when the information content of the data is low, being a 2 levels hierarchical model, Regression of proportions)
I might be able eventually to answer my own question, hopefully.
But I leave here some working notes for discussion.
I created a generative process to simulate my probabilities from a dirichlet regression. To solve the indeterminability of angular coefficients I constrained them to be a simplex multiplied by a positive real number
# DIRICHLET
n_samples = 500
## from http://tr.im/hH5A
logsumexp <- function (x) {
y = max(x)
y + log(sum(exp(x - y)))
}
softmax <- function (x) {
exp(x - logsumexp(x))
}
set.seed(123)
phi <- 40;
x <- runif(n_samples);
props <- matrix(rep(NA, n_samples*2), ncol=2);
X = model.matrix(~cov , data = data.frame(cov=x))
coef_ang = c(0.95, 0.05)
kappa = 5
beta = matrix(c(1, 5, coef_ang*kappa), ncol=2, byrow=T)
y_hat = X %*% beta
for (n in 1:n_samples) y_hat[n,] = softmax( y_hat[n,])
for (n in 1:n_samples) props[n,] = rdirichlet(1,y_hat[n,] * phi)
plot(X[,2],props[,1], ylim=c(0,1))
points(X[,2],props[,2], col='red')
Did you thought about a Wishart distribution as multivariate extension of a gamma distribution?
Or if you don’t want to model correlations between the Ys go with gamma distributions and predefine
the variances the each gamma, similar the alphas prior in the Dirichlet distribution.
Hint: The Dirichlet distribution can be defined as ratio of gamma distributions. Since you don’t need
the normalization of the Dirichlet and later scale it, the normalization becomes obsolete, I feel not so
ok with it.
yes I agree that multiplication by factors before a softmax sounds a bit strange, in fact I am trying to eliminate them when I can. I will give update on the model soon.
About the Gamma alternative, I have to research a bit, as it is not clear to me the conversion between Dirichlet framework and Gamma.
In your case you might omit the normalization, thus independent gammas
with gamma(mu*sigma, sigma) could be used to fit your data. If you struggle
with the gamma, read the mean and variance part in Wikipedia. The mean and
variance just needs this transformation and confusingly the gamma has two
forms.
How come something quite common such as softmax/dirichlet regression has never been encountered/established in stan-like languages?
softmax is implemented in Stan and used by some models in Stan. I have no time to do a
research, but I used it myself. Dirichlet regression I have never seen.
BTW, there are interpretations of exponential, gamma, weibull, lognormal distributions and connections
eg. the exponential and the poisson. Normally we define a problem and then we think about, but you came
with the tool first and then ask how you can use it, like how I can nail that screw with my hammer.
Sorry, if I sound judging. Just my 2 cents.
Thanks for sharing. I really appreciate papers that lay out the model they’re using cleanly and then include code.
I don’t think rstanarm supports Dirichlet regression yet, though I know they did a bunch of work trying to get beta regression to work, which must have similar issues (but probably not the same identifiability issue, as there are only two categories).