The RNGs are easy to define as you don’t need to worry about changes of variables.
vector logdirichlet_rng(vector alpha) {
  return log(dirichlet_rng(alpha));  // elementwise log of a simplex draw
}
array[] int multinomial_log_rng(vector log_theta, int n_trials) {
  return multinomial_rng(exp(log_theta), n_trials);  // multinomial_rng needs a trial count and returns integer counts
}
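As a quick smoke test, these could be called from generated quantities. Here's a hypothetical snippet, assuming N and alpha are declared in data and 20 is an arbitrary trial count:
generated quantities {
  vector[N] log_theta_sim = logdirichlet_rng(alpha);            // log of a Dirichlet draw
  array[N] int y_sim = multinomial_log_rng(log_theta_sim, 20);  // 20 trials, arbitrary
}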
If those wind up not being stable, you can do the same kinds of tricks, but you probably don’t care so much about overflow/underflow here as the results aren’t going to feed into anything else. If you need more stability in logdirichlet_rng, you can probably code it via gammas and then convert everything to the log scale.
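Here’s a minimal sketch of that gamma route, assuming the standard construction where theta_n = g_n / sum(g) for g_n ~ Gamma(alpha_n, 1), so that log(theta) = log(g) - log_sum_exp(log(g)):
vector stable_logdirichlet_rng(vector alpha) {
  int N = num_elements(alpha);
  vector[N] log_g;
  for (n in 1:N) {
    log_g[n] = log(gamma_rng(alpha[n], 1));  // log of a Gamma(alpha[n], 1) draw
  }
  return log_g - log_sum_exp(log_g);  // normalize on the log scale
}
Note that the log(gamma_rng(...)) call can still underflow to negative infinity for very small alpha[n], so this buys stability in the normalization step but not in the draws themselves.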
But I just realized after trying to define a logdirichlet distribution that we can’t easily constrain the log of the simplex to be such that its exponentiation sums to 1. Let’s see what happens after working through the math.
For the N-dimensional Dirichlet on the probability simplex scale, we have for \theta \in \Delta^{N-1} (a simplex in N coordinates) and \alpha \in (0, \infty)^N,
\displaystyle \textrm{Dirichlet}(\theta \mid \alpha) = \frac{\Gamma(\text{sum}(\alpha))}{\prod_{n=1}^N \Gamma(\alpha_n)} \cdot \prod_{n=1}^N \theta_n^{\ \alpha_n - 1}.
Now suppose we want to do a change of variables: we start with our simplex variable \Theta and define a new random variable Z = \log \Theta. We can then use the change-of-variables formula to work out the density of Z using the inverse of \log(), namely \exp(), as
\displaystyle p_Z(z \mid \alpha) = \text{Dirichlet}(\exp(z) \mid \alpha) \cdot \left| J_{\exp}(z) \right|,
where J_{\exp}(z) is the Jacobian (the derivative matrix) of the inverse transform f(z) = \exp(z), and the bars denote the absolute value of the determinant. Here we have to constrain z such that \exp(z) \in \Delta^{N-1} (i.e., it’s a simplex). But this part’s going to be hard to implement. Let’s put that aside for a second.
Because the Jacobian is diagonal, this works out to
\displaystyle \left| J_{\exp}(z) \right| = \prod_{n=1}^N \left| \frac{\text{d}}{\text{d} z_n} \exp(z_n)\right| = \prod_{n=1}^N \exp(z_n),
so that on the linear scale, we have
\displaystyle p_Z(z \mid \alpha) = \frac{\Gamma(\text{sum}(\alpha))}{\prod_{n=1}^N \Gamma(\alpha_n)} \cdot \prod_{n=1}^N \exp(z_n)^{\ \alpha_n - 1} \cdot \prod_{n=1}^N \exp(z_n).
Now we can just take a log and get
\displaystyle \log p_Z(z \mid \alpha) = \log \Gamma(\text{sum}(\alpha)) - \text{sum}(\log \Gamma(\alpha)) + \sum_{n=1}^N (\alpha_n - 1) \cdot z_n + \sum_{n=1}^N z_n.
The -z_n terms from the exponent now cancel against the +z_n terms from the Jacobian, leaving us with this:
\displaystyle \log p_Z(z \mid \alpha) = \log \Gamma(\text{sum}(\alpha)) - \text{sum}(\log \Gamma(\alpha)) + \sum_{n=1}^N \alpha_n \cdot z_n.
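As a numerical sanity check on this algebra, the derived log density evaluated at log(theta) should match Stan’s built-in dirichlet_lpdf plus the Jacobian term sum(log(theta)). A hypothetical helper along these lines:
real logdirichlet_check(vector theta, vector alpha) {
  vector[num_elements(theta)] z = log(theta);
  real lhs = lgamma(sum(alpha)) - sum(lgamma(alpha)) + dot_product(alpha, z);
  real rhs = dirichlet_lpdf(theta | alpha) + sum(z);
  return lhs - rhs;  // ~0 (up to rounding) if the derivation is right
}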
I tried this in the following Stan program, which does not work. It will define the right density function, but not the right constrained parameter. Also, it looks like you had a stray -log_theta[N] term in your density definition (or I messed up some algebra somewhere). Working out the correct density in Stan code, you get the following, but it won’t sample properly.
INCORRECT STAN PROGRAM
functions {
  real logdirichlet_lpdf(vector log_theta, vector alpha) {
    return dot_product(alpha, log_theta)
           + lgamma(sum(alpha)) - sum(lgamma(alpha));
  }
}
data {
  int<lower=0> N;
  vector<lower=0>[N] alpha;
}
parameters {
  vector<upper=0>[N] log_theta;  // UPPER = 0 NOT SUFFICIENT!!!
}
model {
  log_theta ~ logdirichlet(alpha);
}
generated quantities {
  vector[N] theta = exp(log_theta);              // should be a simplex
  real sum_theta = exp(log_sum_exp(log_theta));  // should be 1 every iteration
}
This isn’t going to work because we haven’t constrained log_theta so that sum(exp(log_theta)) = 1, as we would have if log_theta were the logarithm of a simplex. To make this work, you’d need to work out an unconstraining transform for a log simplex. It could probably be done, but it’s a lot of work. You need to take one of the simplex transforms, such as our stick-breaking procedure, keep the whole thing on the log scale, and work out the transformed Jacobian adjustments, as sketched below.
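Here’s a minimal, untested sketch of the constraining half of that transform, assuming Stan’s stick-breaking construction z_k = \text{logit}^{-1}(y_k - \log(N - k)) carried out entirely on the log scale via log_inv_logit and log1m_inv_logit:
vector log_simplex_constrain(vector y) {
  int N = num_elements(y) + 1;
  vector[N] log_theta;
  real log_stick = 0;  // log of the remaining stick length; log(1) to start
  for (k in 1:(N - 1)) {
    real t = y[k] - log(N - k);                   // same offset Stan's transform uses
    log_theta[k] = log_stick + log_inv_logit(t);  // log(remaining stick * z_k)
    log_stick += log1m_inv_logit(t);              // log(remaining stick * (1 - z_k))
  }
  log_theta[N] = log_stick;  // last component takes whatever stick is left
  return log_theta;
}
What this sketch leaves out is the hard part: to use log_theta as a parameter this way, you’d still need to work out the log absolute determinant of the Jacobian of the map from y to log_theta and add it to the target density.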