I have a problem which involves the hypoexponential distribution, which is the distribution which results from a sum of n exponential random variables, each with different rate parameters.

It is a pretty horrific distribution to deal with (see my related Stackexchange query about this, here) because it’s extremely computationally intensive to evaluate and numerically unstable.

Because of these issues, I don’t want to use this distribution, even though this is the most correct thing to do. I can approximate it using a gamma distribution by analytically matching the theoretical mean and standard deviation but this approximation is really poor for certain parameter values for the hypoexponential (see here).

An alternative would be for me to use a more flexible distributional approximation e.g. approximate it with a mixture of gamma distributions or with a lower order hypoexponential (i.e. a hypoexponential distribution with n'<<n distinct exponential random variables).

The more flexible I want this approximation to be, the less able I am to be able to find analytical results giving me the ‘best’ parameters to approximate the hypoexponential distribution with. So, I would prefer a way to determine appropriate parameter values for this distributional approximation via, say, a form of gradient descent.

So what my Stan code would look like is something like (assuming n=3 for brevity):

```
functions {
real hypoexponential_approximation_lpdf(real X, real lambda_1, real lambda_2, real lambda_3) {
vector theta = minimiser(objective_function, lambda_1, lambda_2, lambda_3);
real log_prob = approximate_lpdf(X | theta);
return(log_prob);
}
}
model {
X ~ hypoexponential_approximation(lambda_1, lambda_2, lambda_3);
}
```

I don’t think such a `minimiser`

exists in Stan but I wanted to double-check?

A second question, perhaps cheekily, is does anyone have any bright ideas for approximating this distribution in Stan?

**Note 1:** I realise that Stan has an algebraic solver, but I don’t think this will work apart from in special circumstances; imagine I want to minimise the sum of squared differences between e.g. moments of the hypoexponential distribution and those of the approximating distribution – this won’t generally be zero but I would like it to be as close to zero as possible.

**Note 2:** the above is very different to say using rstan’s `optimisation`

routine – I am saying that I would like to use minimisation **within** a sampling routine.