Exp-Gamma mixture model

Dear all, I’m a new entry of probabilistic programming community and I’m trying to fit in a proper way a weird distribution of data that looks like this:

Because of this “bimodal” and strictly positive shape, I thought to use a mixture of an exponential for the left peak and a gamma for the right one. Specifically, my stan code is the following:

``````data {
int<lower=0> N;
real<lower=0> y[N];
}

parameters {
real<lower=0> phi;
real<lower=0> mu;
real<lower=0> lambda; \\ rate of exponential distribution
real<lower=0, upper=1> theta; \\ param of the mixture
}

transformed parameters {
real<lower=0> alpha; \\ shape of gamma distribution
real<lower=0> beta; \\ rate of gamma distribution

\\ reparametrization of alpha and beta, parameters of gamma distribution
alpha <- mu .* mu / phi;
beta <- mu / phi;
}

model {

mu ~ normal(10, 10);
phi ~ normal(10, 10);
lambda ~ normal(10, 10);
theta ~ beta(1, 1);
for (n in 1:N) {
target += log_sum_exp(
log(theta) + exponential_lpdf(y[n] | lambda),
log1m(theta) + gamma_lpdf(y[n] | alpha, beta));
}
}
'''
``````

Unfortunately, the inference was not very good. Rhats of every parameter indicate that the chains very likely have not mixed, so I guess that it is a conceptual problem. Someone can help me and indicate me what is not ok with this model? Thank you in advance!

Try a soft ordering of the exponential and gamma by using stronger priors so that the mean of the exponential is approximately near 2-3 (lambda ~ normal(0.5, 1)) and the mean of the gamma is near 15-18.

Just from a visual inspection I’d try beta near 0.5 and alpha around 8-9 so
using your parameterization mu ~ normal(15, 2) and phi ~ normal(30, 3).

Also remove the `<-` - those are deprecated - and remove the elementwise multiplication. It’s also a bit more efficient to assign directly to alpha and beta in the declaration.

``````  transformed parameters {
real<lower=0> alpha =  square(mu) / phi; \\ shape of gamma distribution
real<lower=0> beta = mu / phi; \\ rate of gamma distribution
}
``````

In addition, check out Michael Betancourt’s case study on identifying mixtures https://mc-stan.org/users/documentation/case-studies/identifying_mixture_models.html.

Thank you spinkney for your kind answer. First of all, I am not able to make an assignment to alpha and beta in the declaration, it returns me the following error

libc++abi.dylib: terminating with uncaught exception of type std::invalid_argument

Anyway, I think that this is a secondary detail. I tried to follow your suggestions for priors, but chain mixing does not seem to improve. Here I report the fit summary table:

mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
phi 27.313887 11.901104 16.866117 0.002500 9.786976 34.590843 41.179883 43.488963 2.008426 14.974326
mu 11.074136 4.489970 6.351390 0.069233 6.359057 14.459533 14.865302 15.485589 2.001017 53.741367
lambda 0.654969 0.396614 0.566157 0.073242 0.087720 0.474994 1.212746 1.415739 2.037693 7.031650
theta 0.390532 0.245998 0.348067 0.108433 0.119606 0.215776 0.708631 0.975922 2.001998 33.756432
alpha 5.126093 1.781273 2.534903 1.212825 3.970319 5.075746 6.611532 9.330145 2.025171 8.585616
beta 5.029049 5.719404 8.716314 0.333556 0.351200 0.434583 2.844751 28.439209 2.322545 2.623867
lp__ -9148.808401 232.751151 329.195081 -9718.405121 -9400.218774 -8974.716057 -8934.799591 -8933.446004 2.000428 252.099822

As you can see it is pretty depressing. If you are so kind as to try playing with those data, I leave them here: data.csv (60.9 KB)

I also tried to use different distributions to in the mixture, but little changes. Thank you again!

Even with pretty strong priors this model shows bimodality

If you pick one of the modes and constrain the parameters to be inside it’ll work. Though I’m not sure it’s “satisfying” since you’re basically saying what it should be.

Anyway, here’s the stan code

``````data {
int<lower=0> N;
real<lower=0> y[N];
}
parameters {
real<lower=7> alpha;
real<lower=0, upper=1> beta;
real<lower=0, upper=0.3> lambda; // rate of exponential distribution
real<lower=0, upper=1> theta; // param of the mixture
}
model {
alpha ~ normal(8, 0.1);
beta ~ normal(0.5, 0.1);
lambda ~ normal(0.1, 0.3);
theta ~ beta(1, 1);

for (n in 1:N) {
target += log_sum_exp(
log(theta) + exponential_lpdf(y[n] | lambda),
log1m(theta) + gamma_lpdf(y[n] | alpha, beta));
}
}
``````

and the fit

``````Inference for Stan model: forum_mix.
4 chains, each with iter=1000; warmup=500; thin=1;
post-warmup draws per chain=500, total post-warmup draws=2000.

mean se_mean   sd     2.5%      25%      50%      75%    97.5% n_eff Rhat
alpha      8.04    0.00 0.10     7.86     7.98     8.04     8.11     8.23  1104    1
beta       0.53    0.00 0.01     0.51     0.52     0.53     0.53     0.54  1043    1
lambda     0.12    0.00 0.01     0.11     0.12     0.12     0.12     0.14   804    1
theta      0.34    0.00 0.02     0.31     0.33     0.34     0.35     0.37  1003    1
lp__   -9017.67    0.05 1.46 -9021.60 -9018.31 -9017.37 -9016.63 -9015.87   747    1

Samples were drawn using NUTS(diag_e) at Fri Oct 02 10:25:14 2020.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
``````

and the pairs plot now

1 Like

I appreciate you help a lot! This is a good starting point, I will try to relax the priors in order to make the model more flexible, but at least now I understood how to reason in presence of bimodal distributions. Thank you again!

Reminds me of the cross-sections of the Cox-Ingersoll-Ross diffusion. See figures 4 and 5 at https://www.openphilanthropy.org/sites/default/files/Modeling-the-human-trajectory.pdf#page=47. In figure 5, middle row, right column, vertical cross-sections look rather like your distribution.