Informed prior for a Bernoulli / lognormal mixture model - divergent transitions in Stan

Dear members of the group,

I am posting this request because I am attempting a Bayesian implementation of the model described in this publication:

Moulton and Halsey, “A Mixture Model with Detection Limits for Regression Analyses of Antibody Response to Vaccine.” Biometrics 51(4): 1570–1578.

In essence, this is a simple Bernoulli / censored-lognormal mixture: the pollutant level in a given environment is either null, when the pollution-generating process is absent, or follows a lognormal distribution when it is present. Because the instruments used to measure pollution levels have a detection limit, an unobserved value (reported as “< limit of detection” by the instrument) could either be one of the “true zeroes” or belong to the lognormal distribution but be censored. The parameters of this model are omega, the proportion of the population that is truly zero, and mu and sigma, the parameters of the lognormal distribution for the non-zero part of the population.
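To make the likelihood explicit (this is my reading of the paper, assuming x[n] is set to the detection limit for the unobserved records), each measurement contributes

    observed x:                    (1 - omega) * LogNormal(x | mu, sigma)
    reported as < detection limit: omega + (1 - omega) * Pr(X <= detection limit),  X ~ LogNormal(mu, sigma)

which is what the log_sum_exp term in the code below computes on the log scale.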

I wrote the corresponding model in both JAGS and Stan, and the two yield identical results.

My question is about the prior distribution for omega, the proportion of the population that is truly zero.

A natural prior for omega is uniform [0,1], which works fairly well. However, my colleagues and I sought a more informative prior, based on the proportion of unobserved values in the sample (Pu).

Our reflection was as follows: Pu is the sum of the proportion of true zeroes, and the proportion of non-zero censored values. Therefore Pu is an upper limit for omega (the proportion of true zeroes cannot be greater than Pu).

Because we only observe Pu in our sample, and the true population proportion of unobserved values might be greater than that, we cannot use the observed Pu directly as an upper limit for omega, but we can use its sampling distribution.

So we came up with the following prior for omega:

omega ~ uniform(0, Pu)

Pu ~ beta(N0, N - N0), with N the sample size and N0 the number of unobserved values.
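For example (with purely illustrative numbers), a sample of N = 100 with N0 = 30 unobserved values would give Pu ~ beta(30, 70), which is centred on the observed proportion 0.30, and omega is then restricted to lie below a draw from that distribution.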

This prior works in both Stan and JAGS, with acceptable convergence diagnostics (trace plots, autocorrelation, Rhat). However, over half of the transitions in Stan are divergent. None of the pairs plots suggest that particular values of mu/sigma/omega/Pu lead to more divergences, and tweaking the adaptation options does nothing. The proportion of divergent transitions becomes higher when omega is close to Pu. There is no such issue when the uniform[0,1] prior is used.

Our question: is our basic idea flawed because we partly use the data to inform the prior, or did we simply code a sound idea incorrectly?

The model code is provided below.

Thanks for any insight.


    data {
      int<lower=0> N;                       // sample size
      int<lower=0> N0;                      // number of unobserved values (below the detection limit)
      real<lower=0> x[N];                   // measurements; presumably the detection limit for unobserved values
      int<lower=0,upper=1> is_observed[N];  // 1 = observed, 0 = below the detection limit
    }

    parameters {
      real mu;
      real logsigma;
      real<lower=0,upper=1> Pmax;
      real<lower=0,upper=1> omega;
    }

    transformed parameters {
      real<lower=0> sigma;
      sigma = exp(logsigma);
    }

    model {
      // priors
      mu ~ normal(0, 3.16);
      logsigma ~ normal(-0.1744, 0.6259421);
      Pmax ~ beta(N0, N - N0);
      omega ~ uniform(0, Pmax);

      // likelihood
      for (n in 1:N) {
        if (is_observed[n] == 0) {
          // censored values: either a true zero or a non-zero value below the detection limit
          target += log_sum_exp(bernoulli_lpmf(1 | omega),
                                bernoulli_lpmf(0 | omega) + lognormal_lcdf(x[n] | mu, sigma));
        } else {
          // observed values: classic lognormal probability density
          target += bernoulli_lpmf(0 | omega) + lognormal_lpdf(x[n] | mu, sigma);
        }
      }
    }

    // extracting the pointwise log-likelihood
    generated quantities {
      vector[N] log_lik;
      for (n in 1:N) {
        if (is_observed[n] == 0) {
          // censored values
          log_lik[n] = log_sum_exp(bernoulli_lpmf(1 | omega),
                                   bernoulli_lpmf(0 | omega) + lognormal_lcdf(x[n] | mu, sigma));
        } else {
          // observed values
          log_lik[n] = bernoulli_lpmf(0 | omega) + lognormal_lpdf(x[n] | mu, sigma);
        }
      }
    }

A prior is meant to include information that is independent of your data, so you shouldn’t need Pmax if you’re just going to estimate it from the data.

The reason for the divergences is that you have a hard constraint which you didn’t declare in the parameters block. It should be like this:

      real<lower=0,upper=1> Pmax;
      real<lower=0,upper=Pmax> omega;

Note the upper limit on omega.
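In context, a minimal sketch of the parameters block with the constraint declared (the rest of the model unchanged):

      parameters {
        real mu;
        real logsigma;
        real<lower=0,upper=1> Pmax;
        real<lower=0,upper=Pmax> omega;  // hard upper bound declared, matching omega ~ uniform(0, Pmax)
      }

With the bound declared, Stan’s unconstraining transform keeps every proposal for omega inside [0, Pmax], so the sampler never steps into the region omega > Pmax where the uniform(0, Pmax) density is zero, which is what was producing the divergences.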

Thanks very much.

I understand the coding mistake, as well as the more methodological one.