I am working towards a larger model that contains a Dirichlet-Multinomial distribution, and I wanted to investigate if it is faster to use the analytical form of the distribution that integrates out individual probabilities, henceforth denoted “implicit” (imp) code, or explicitly (exp) include the individual probabilities, denoted {\bf{r}}. That is, either

P\left( {{\bf{X}}|{\rm{\alpha }}} \right) = {{\Gamma \left( {\sum {{\alpha _k}} } \right)\Gamma \left( {N + 1} \right)} \over {\Gamma \left( {N + \sum {{\alpha _k}} } \right)}}\prod\limits_{k = 1}^m {{{\Gamma ({X_k} + {\alpha _k})} \over {\Gamma ({\alpha _k})\Gamma \left( {{X_k} + 1} \right)}}}

or

\eqalign{ & {\bf{X}} \sim \mathrm{Multinomial}\left( {\bf{r}} \right) \cr & {\bf{r}} \sim \mathrm{Dirichlet}\left( {\rm{\alpha }} \right) \cr}

Here is what I found in a simulation study with a uniform Dirichlet process, i.e. identical concentration parameters {\alpha _1} = {\alpha _2} = ... = {\alpha _m}. The example is using and m=100 compartments and N=1000 trials. Compilation time is not included.

The y axis shows computation time per effective sample (timeperneff). The conclusions are:

- The explicit approach takes longer for small and large alphas but is faster for intermediate alphas.
- The difference is substantial (notice the log y axis.)
- The implicit approach is quite stable in terms of computation time per effective sample.
- The explicit approach also suffers from divergent transitions (isdev) at low alphas.

Questions:

All and all, it seems like a safer choice to integrate out the individual probabilities, which are not of interest. But it would depend on the data what is the faster approach. Does anyone have experiences from real data? Are there pitfalls ahead?

Is there an intuitive explanation for the non-linear effect of alpha on computational efficiency under the explicit approach?

Is there a way to speed up the calculations of the implicit approach that reduces the number of log-gamma functions that must be evaluated? See implicit code below.

The divergent transitions at low alphas are likely a result of steep cut-offs at zero in the conditional distribution of some r. Is there a straightforward reparameterization that could solve this, given that r is a simplex?

Explicit code:

```
data {
int<lower=0> m;
int<lower=0> X[m];
}
parameters {
real logalpha;
simplex[m] r;
}
transformed parameters{
real<lower=0> alpha=exp(logalpha);
vector<lower=0>[m] alphavec;
for (i in 1:m){
alphavec[i]=alpha;
}
}
model {
X ~ multinomial(r);
r ~ dirichlet(alphavec);
alpha ~ gamma(.01,.01);
target += logalpha;
}
```

Implicit code:

```
data {
int<lower=0> m;
int<lower=0> X[m];
int<lower=0> sumX;
}
parameters {
real logalpha;
}
transformed parameters{
real<lower=0> alpha=exp(logalpha);
real<lower=0> sumalpha;
vector<lower=0>[m] alphavec;
vector<lower=0>[m] xplusaplha;
for (i in 1:m){
alphavec[i]=alpha;
xplusaplha[i]=X[i]+alphavec[i];
}
sumalpha=sum(alphavec);
}
model {
target+= lgamma(sumalpha) -lgamma(sumX+sumalpha) +sum(lgamma(xplusaplha))- sum(lgamma(alphavec)); //- sumlgammaXplus1+ lgamma(sumX+1);
target += logalpha;
alpha ~ gamma(.01,.01);
}
```