Exp-Gamma mixture model

dariato · October 1, 2020, 7:24pm

Dear all, I’m a new entry of probabilistic programming community and I’m trying to fit in a proper way a weird distribution of data that looks like this:

Unknown-10

Because of this “bimodal” and strictly positive shape, I thought to use a mixture of an exponential for the left peak and a gamma for the right one. Specifically, my stan code is the following:

data {
        int<lower=0> N;
        real<lower=0> y[N]; 
    }

    parameters {
        real<lower=0> phi;
        real<lower=0> mu;
        real<lower=0> lambda; \\ rate of exponential distribution
        real<lower=0, upper=1> theta; \\ param of the mixture
    }

    transformed parameters {
        real<lower=0> alpha; \\ shape of gamma distribution
        real<lower=0> beta; \\ rate of gamma distribution
        
        \\ reparametrization of alpha and beta, parameters of gamma distribution
        alpha <- mu .* mu / phi;
        beta <- mu / phi;
    }

    model {
    
        mu ~ normal(10, 10); 
        phi ~ normal(10, 10);
        lambda ~ normal(10, 10); 
        theta ~ beta(1, 1);
        for (n in 1:N) {
            target += log_sum_exp(
                    log(theta) + exponential_lpdf(y[n] | lambda),
                    log1m(theta) + gamma_lpdf(y[n] | alpha, beta));
                }
    }
'''

Unfortunately, the inference was not very good. Rhats of every parameter indicate that the chains very likely have not mixed, so I guess that it is a conceptual problem. Someone can help me and indicate me what is not ok with this model? Thank you in advance!

spinkney · October 2, 2020, 10:11am

Try a soft ordering of the exponential and gamma by using stronger priors so that the mean of the exponential is approximately near 2-3 (lambda ~ normal(0.5, 1)) and the mean of the gamma is near 15-18.

Just from a visual inspection I’d try beta near 0.5 and alpha around 8-9 so
using your parameterization mu ~ normal(15, 2) and phi ~ normal(30, 3).

Also remove the <- - those are deprecated - and remove the elementwise multiplication. It’s also a bit more efficient to assign directly to alpha and beta in the declaration.

  transformed parameters {
        real<lower=0> alpha =  square(mu) / phi; \\ shape of gamma distribution
        real<lower=0> beta = mu / phi; \\ rate of gamma distribution
    }

In addition, check out Michael Betancourt’s case study on identifying mixtures https://mc-stan.org/users/documentation/case-studies/identifying_mixture_models.html.

dariato · October 2, 2020, 1:20pm

Thank you spinkney for your kind answer. First of all, I am not able to make an assignment to alpha and beta in the declaration, it returns me the following error

libc++abi.dylib: terminating with uncaught exception of type std::invalid_argument

Anyway, I think that this is a secondary detail. I tried to follow your suggestions for priors, but chain mixing does not seem to improve. Here I report the fit summary table:

	mean	se_mean	sd	2.5%	25%	50%	75%	97.5%	n_eff	Rhat
phi	27.313887	11.901104	16.866117	0.002500	9.786976	34.590843	41.179883	43.488963	2.008426	14.974326
mu	11.074136	4.489970	6.351390	0.069233	6.359057	14.459533	14.865302	15.485589	2.001017	53.741367
lambda	0.654969	0.396614	0.566157	0.073242	0.087720	0.474994	1.212746	1.415739	2.037693	7.031650
theta	0.390532	0.245998	0.348067	0.108433	0.119606	0.215776	0.708631	0.975922	2.001998	33.756432
alpha	5.126093	1.781273	2.534903	1.212825	3.970319	5.075746	6.611532	9.330145	2.025171	8.585616
beta	5.029049	5.719404	8.716314	0.333556	0.351200	0.434583	2.844751	28.439209	2.322545	2.623867
lp__	-9148.808401	232.751151	329.195081	-9718.405121	-9400.218774	-8974.716057	-8934.799591	-8933.446004	2.000428	252.099822

As you can see it is pretty depressing. If you are so kind as to try playing with those data, I leave them here: data.csv (60.9 KB)

I also tried to use different distributions to in the mixture, but little changes. Thank you again!

spinkney · October 2, 2020, 2:27pm

Even with pretty strong priors this model shows bimodality

If you pick one of the modes and constrain the parameters to be inside it’ll work. Though I’m not sure it’s “satisfying” since you’re basically saying what it should be.

Anyway, here’s the stan code

data {
  int<lower=0> N;
  real<lower=0> y[N]; 
}
parameters {
  real<lower=7> alpha;
  real<lower=0, upper=1> beta;
  real<lower=0, upper=0.3> lambda; // rate of exponential distribution
  real<lower=0, upper=1> theta; // param of the mixture
}
model {
  alpha ~ normal(8, 0.1); 
  beta ~ normal(0.5, 0.1);
  lambda ~ normal(0.1, 0.3); 
  theta ~ beta(1, 1);
  
  for (n in 1:N) {
    target += log_sum_exp(
                  log(theta) + exponential_lpdf(y[n] | lambda),
                  log1m(theta) + gamma_lpdf(y[n] | alpha, beta));
    }
}

and the fit

Inference for Stan model: forum_mix.
4 chains, each with iter=1000; warmup=500; thin=1; 
post-warmup draws per chain=500, total post-warmup draws=2000.

           mean se_mean   sd     2.5%      25%      50%      75%    97.5% n_eff Rhat
alpha      8.04    0.00 0.10     7.86     7.98     8.04     8.11     8.23  1104    1
beta       0.53    0.00 0.01     0.51     0.52     0.53     0.53     0.54  1043    1
lambda     0.12    0.00 0.01     0.11     0.12     0.12     0.12     0.14   804    1
theta      0.34    0.00 0.02     0.31     0.33     0.34     0.35     0.37  1003    1
lp__   -9017.67    0.05 1.46 -9021.60 -9018.31 -9017.37 -9016.63 -9015.87   747    1

Samples were drawn using NUTS(diag_e) at Fri Oct 02 10:25:14 2020.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).

and the pairs plot now

dariato · October 2, 2020, 3:43pm

I appreciate you help a lot! This is a good starting point, I will try to relax the priors in order to make the model more flexible, but at least now I understood how to reason in presence of bimodal distributions. Thank you again!

David_Roodman · January 26, 2022, 8:10pm

Reminds me of the cross-sections of the Cox-Ingersoll-Ross diffusion. See figures 4 and 5 at https://www.openphilanthropy.org/sites/default/files/Modeling-the-human-trajectory.pdf#page=47. In figure 5, middle row, right column, vertical cross-sections look rather like your distribution.

Topic		Replies	Views
Mixture formula of discrete and continuous distributions Modeling	9	1331	July 20, 2018
Gamma distribution - parameterization -- gamma distribution General	1	2737	December 21, 2017
Fitting a mixture of log-logistic distributions to interval data Modeling	7	1299	April 3, 2018
Zero-inflated prior or mixture distreibution for prior in Stan General	5	704	April 22, 2022
Parameter recovery issues with normal mixture model Modeling hierarchical-model , mixture	3	638	June 16, 2022

Exp-Gamma mixture model

Related topics