How to specify a mixture of doubly truncated Paretos

Hey all,

I have a fairly simple model I’d like to implement in Stan. It’s a mixture of doubly truncated Pareto distributions. K mixture components have their own ymin and alpha, but they share an upper truncation parameter ymax.

data {
    int<lower=1> K; // number of mixture components
    int<lower=0> N; // number of planets
    real y[N]; // y values for N planets
parameters {
    simplex[K] theta; // mixing proportions
    real alpha[K]; // power-law indices
    real ymin[K]; // ymin for each component
    real ymax; // ymax for all K components
model {
    real ps[K];
    real ylower[K];
    real yupper[K];
    real integral[K];
    ylower[1] = ymin[1]; 
    ylower[2] = ymin[2]; 
    yupper[1] = ymax;
    yupper[2] = ymax;
    alpha ~ uniform(-10.,10.); // prior on alpha
    ymin ~ uniform(0., 30.); // prior on ymin
    ymax ~ uniform(ymin[2], 30.); // prior on ymax

    integral[1] = (yupper[1]^(alpha[1]) - ylower[1]^(alpha[1]))/alpha[1];
    integral[2] = (yupper[2]^(alpha[2]) - ylower[2]^(alpha[2]))/alpha[2];

    for (n in 1:N) { // loop over planets
        for (k in 1:K){ // loop over mixture components
            if ((y[n] > ylower[k]) && (y[n] < yupper[k]))
                ps[k] = theta[k] * (y[n])^(alpha[k]-1.)/integral[k];
                ps[k] = 0.;
        target += log(sum(ps));

I realize Stan supports Pareto distributions (y ~ pareto(ymin, alpha)), but I’d like to allow for monotonically increasing distributions, i.e., a Pareto with a positive exponent, alpha<-1.

The above code works, but I doubt it’s an efficient implementation. For N=1000, it takes roughly ~0.1s per model evaluation. Any advice for speeding up this Stan model?

For mixture models you’re going to want to have ordered means on the different mixture distributions to make sure you don’t have the label switching identifiability (check out Betancourts notebook on this for more information). It’s possible this non-identifiability will slow you down by making you have to take more leap-frog steps. Since you’re dealing with a Pareto distribution you may want to re-parameterize so you can do an ordering. This is not as straight-forward as ordering the means of Gaussians.

As far as I know for mixture components you have to use a for loop as of now like you did. I wonder if having a likelihood that is discontinuous in the parameters is going to cause problems for HMC. If your ylower parameter changes to incorporate a new data point in the if rather than the else, that’ll cause a jump in the likelihood. What do your trace plots look like and how is your tree depth?

Yes, this can be a problem when the Hamiltonian simulator tries to cross the boundary. If the energy discrepancy is too high, it’ll just diverge (and thus get stuck on one side of the boundary).