Poisson-LogSkewNormal mixture + partial pooling

I have the following two-component mixture model. It is quite similar to a zero-inflated model: one component captures a cluster of small values (not necessarily zero), and the other captures relatively larger values with extremely large variance.

functions {
    real continuous_poisson_log_lpdf(real n, real alpha) {
        return n*alpha - exp(alpha) - lgamma(n+1);
    }
}
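
(For integer n this matches poisson_log_lpmf(n | alpha), where alpha is the log of the Poisson rate.)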
data {
    int<lower=1> N;
    real<lower=0> n[N];
    real<lower=0> mu_init;
    vector<lower=0>[N] C; // offsets
}
transformed data {
    vector[N] logC = log(C); 
}
parameters {
    simplex[2] xi;
    real<lower=0,upper=1> lambda_zero;
    vector<lower=0>[N] lambda;
    real<lower=0> mu;
    real<lower=0> sigma;
    real alpha;
}
model {
    vector[2] log_xi = log(xi);
    mu ~ normal(mu_init, 1);
    sigma ~ normal(0, 1);
    alpha ~ normal(0, 1);
    lambda ~ skew_normal(mu, sigma, alpha);
    for (i in 1:N) {
        vector[2] lps = log_xi;
        lps[1] = lps[1] + continuous_poisson_log_lpdf(n[i]|lambda_zero+logC[i]);
        lps[2] = lps[2] + continuous_poisson_log_lpdf(n[i]|lambda[i]+logC[i]);
        target += log_sum_exp(lps);
    }
}

Here’s the output.

Inference for Stan model: anon_model_b62cdae42a3f6a2a1b4da6046463c0d1.
4 chains, each with iter=2000; warmup=1000; thin=1; 
post-warmup draws per chain=1000, total post-warmup draws=4000.

              mean se_mean     sd   2.5%    25%    50%    75%  97.5%  n_eff   Rhat
xi[0]         0.04  1.5e-3   0.01   0.02   0.03   0.04   0.05   0.07   75.0   1.06
xi[1]         0.96  1.5e-3   0.01   0.93   0.95   0.96   0.97   0.98   75.0   1.06
lambda_zero   0.77    0.09    0.3    0.1   0.54    1.0    1.0    1.0   11.0   1.55
lambda[0]     7.39  5.1e-4   0.02   7.36   7.38   7.39    7.4   7.42  890.0    1.0
lambda[1]      7.5  4.6e-4   0.01   7.47   7.49    7.5   7.51   7.53  964.0   1.01
...
lambda[285]   7.52  1.0e-3   0.02   7.48   7.51   7.52   7.54   7.56  488.0    1.0
mu            8.14  8.5e-3   0.07   8.01   8.09   8.14   8.18   8.29   72.0   1.06
sigma         1.78    0.04   0.17   1.56   1.67   1.75   1.85   2.25   21.0   1.21
alpha        -3.98    0.07   0.52  -5.07  -4.32  -3.95  -3.63  -3.02   56.0   1.07
lp__        -1.5e4  4795.7  2.2e4 -9.7e4 -2.0e4 -1.2e4  -1013 -990.7   21.0    1.2

Samples were drawn using NUTS at Thu Oct 12 16:53:22 2017.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).

With ~300 data points, it took about 20 minutes on my AMD Opteron 6136 (512MB RAM) machine. Apparently n_eff and Rhat do not look great. Here are my questions.

(1) What do you recommend to resolve the n_eff and Rhat issues in general? I have tried different distributions, e.g., Poisson-LogNormal or Poisson-LogGamma, and different parameterizations of those distributions, but neither helped.

(2) Can lambda ~ skew_normal(mu, sigma, alpha); stay outside the for-loop even though the lambda's are only used in the second component of the mixture?

(3) Does 20 minutes sound reasonable for this mixture to run 2,000 iterations? What could possibly speed up this model?

(4) What is your opinion on generalizing the Poisson distribution to a continuous domain like I am doing here?

Any comment would be greatly appreciated.

I didn’t even know Stan could run in 512MB RAM. How do you compile your models? Does it take forever with swapping?

Your model is fitting a Poisson parameter for each item.

The body of your for-loop (building lps and taking log_sum_exp) can be simplified to

target += log_mix(xi[1], 
    continuous_poisson_log_lpdf(n[i] | lambda_zero + logC[i]),
    continuous_poisson_log_lpdf(n[i] | lambda[i] + logC[i]));
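
log_mix(theta, lp1, lp2) is just a convenience for log_sum_exp(log(theta) + lp1, log1m(theta) + lp2), so with theta = xi[1] it is exactly equivalent to your lps construction.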

The problem you’re having with R-hat is most likely due to the centered parameterization of the skew-normal. I don’t know how to reparameterize that, but you want to look at doing the same thing as we normally do for non-centered parameterizations.

Oops, didn’t answer the whole thing. Yes, this should stay outside the for-loop, but it should be translated to a non-centered parameterization. You might want to try fitting with a regular normal first and then generalize to the skew normal.

Getting the right parameterization helps immensely because then the step size can be reasonable.
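
A minimal sketch of what that non-centered version could look like with a plain normal prior (lambda_raw is an auxiliary parameter I'm introducing here; this also drops the <lower=0> bound on lambda, which looks unnecessary since lambda enters the likelihood as a log rate):

parameters {
    // ... other parameters as before ...
    vector[N] lambda_raw;   // standardized auxiliary parameter
    real<lower=0> mu;
    real<lower=0> sigma;
}
transformed parameters {
    // non-centered: implies lambda ~ normal(mu, sigma)
    vector[N] lambda = mu + sigma * lambda_raw;
}
model {
    lambda_raw ~ normal(0, 1);
    // ... likelihood unchanged, using lambda[i] + logC[i] as before ...
}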

No opinion, but why not use a continuous distribution to begin with?

Urgh…I meant 512GB. Thank you for your comments.