2D mixture of truncated distributions

Ezequiel_Alvarez · September 25, 2024, 3:29am

Hi,

I’m trying to implement a mixture of 2-Dimensional truncated distributions. In one class both dimensions correspond to the same Normal, and in the other class both dimensions correspond to the same exponential. As described in the non-working code below

data {
 int <lower=0> N; // number of datapoints
 real L, U; // lower and upper limits of the observables
 array[N,2] real<lower=L, upper=U> y;  // there are 2 observables per datapoint
 real mu0, sigma0, mu1, sigma1, mu2, sigma2, t1, t2;  // hyperparameters for the parameters priors
}

parameters {
  real<lower=0> mu; 
  real<lower=0> sigma;
  real<lower=0> lambda0;
  simplex[2] theta;  
}

model {
    vector[2] lp;
    mu ~ normal(mu0, sigma0);
    sigma ~ cauchy(mu1, sigma1);
    lambda0 ~ normal(mu2, sigma2);
    theta ~ dirichlet([t1,t2]);
    
   for (n in 1:N) {
     lp[1] = normal_lpdf(y[n,1] | mu, sigma) T[L,U] + normal_lpdf(y[n,2] | mu, sigma) T[L,U];
     lp[2] = exponential_lpdf( y[n,1] | lambda0) T[L,U] + exponential_lpdf( y[n,2] | lambda0) T[L,U];     
     target += log_mix(theta, lp);                        
     };
}

I cannot make it to compile. I am naively defining a lpdf with the same notation as it is defined a sampling statement with the T[L,U] after the pdf. This is not working.

Can someone help me in how to define this model? I have the impression that the trick is to use transformed parameters, but I can’t figure out how to do it.

Thank you ! Ezequiel.

Ezequiel_Alvarez · September 26, 2024, 10:00pm

I have solved it, I leave the solution here for anyone who may find it useful!

The trick is to define the sampling PDF as the complete PDF divided the PDF area between the truncating limits. And express this as a function of the existing log-CDF and log-CCDF, and using the trick of log_diff_exp to express the log of a difference.

No one answered here, but I’m sure they would have do it after some time!

model {
    vector[2] lp;
    mu ~ normal(mu0, sigma0);
    sigma ~ cauchy(mu1, sigma1);
    lambda0 ~ normal(mu2, sigma2);
    theta ~ dirichlet([t1,t2]);
    
   for (n in 1:N) {
     lp[1] =   normal_lpdf(y[n,1] | mu, sigma)
             - log_diff_exp( normal_lcdf( U | mu, sigma), normal_lcdf( L | mu, sigma) )             
             + normal_lpdf(y[n,2] | mu, sigma)
             - log_diff_exp( normal_lcdf( U | mu, sigma), normal_lcdf( L | mu, sigma) );
     lp[2] =   exponential_lpdf( y[n,1] - L | lambda0)             
             - exponential_lcdf( U - L | lambda0) 
             + exponential_lpdf( y[n,2] - L| lambda0)
             - exponential_lcdf( U - L | lambda0);
     target += log_mix(theta, lp);                        
     };
}

Bob_Carpenter · October 3, 2024, 9:58pm

Thanks. I was just coming here to answer. I thought I’d swept up all the unanswered queries a couple days ago, but it looks like I missed yours. Sorry about that.

Your answer would work even better if we had proper log cdf implementations—as is, our implementations are not super stable in the tails. We couldn’t find good log cdf function implementations anywhere, but we’re open to suggestions.

You can also avoid simplexes and Dirichlet by recoding in terms of beta distributions as

real<lower=0, upper=1> theta;
...
theta ~ beta(t1, t2);
...
target += log_mix(theta, lp[1], lp[2]);

But that won’t generalize to more than 2 mixture components.

Topic		Replies	Views
Mixture model of truncated gaussians - toy model Modeling	1	468	July 25, 2019
Mixture Model Fitting Modeling mixture	6	802	May 31, 2020
Vectorizing Mixture Poisson model Modeling	3	453	April 22, 2019
Prior as a mixture of a Half-Normal and Uniform distribution Modeling	5	650	October 6, 2020
How to specify a mixture of doubly truncated Paretos Modeling	2	639	January 18, 2018

2D mixture of truncated distributions

Related topics