Zero-inflated prior or mixture distreibution for prior in Stan

Hi,

Recently I am considering applying some flexible priors to one of the unknown parameters in the model. The parameters could be either exact 0 or some positive values, so I am considering whether I could use a mixture of 0 and exponential distribution as the prior for that parameter. The Stan User’s guide provides the implementation of zero-inflated model 5.6 Zero-inflated and hurdle models | Stan User’s Guide for observed data through likelihood, but now I would like to do similar things for prior distribution and the likelihood-based coding syntax would not work here since we don’t observe any of the values of that parameter.

Given the complexity of modeling discrete parameters in Stan, may I ask that is there any easy implementation of a mixture of 0 and exponential distribution as the prior for that parameter in Stan?

Thanks so much for your help!

There are a variety of sparsity inducing priors that have been used with Stan, and a good summary of them here Sparsity Blues

If you really want zero inflation for some reason, you can write out the entire model with the parameter included, the entire model with the parameter fixed to zero, an put them both inside a ‘log_mix’ statement with whatever prior you desire on the mixing parameter. If you do this, make sure not to drop different normalizing constants on either side of the mixture.

2 Likes

Thanks so much for your reply!

For your second point, I am still a little bit confused about how to implement this in Stan. The following is some example code and

model {
  vector[N] y;
} 
parameters {
  real<lower=0> mu; 
  real<lower=0> mu_exp; 
  real<lower=0> sigma; 
} 
model {
 sigma ~ normal(0, 1);
 mu ~ exponential(mu_exp);
 mu_exp ~ normal(0, 2);
 y ~ normal(mu, sigma); 
} 

Suppose now I would like to assign a mixture of 0 and exponential distribution to parameter mu, may I ask how should I modify my Stan code accordingly?

Thanks so much for your help!

We could rewrite the model block as follows

target += normal_lpdf(sigma |  0, 1);
target += exponential_lpdf(mu | mu_exp);
target += normal_lpdf(mu_exp | 0, 2);
target += normal_lpdf(y | mu, sigma);

We can rewrite this again as:

real lp = 0;
lp += normal_lpdf(sigma |  0, 1);
lp += exponential_lpdf(mu | mu_exp);
lp += normal_lpdf(mu_exp | 0, 2);
lp += normal_lpdf(y | mu, sigma);
target += lp;

Now we can write the mixture model.

parameters {
  ...
  real<lower = 0, upper = 1> p; // the mixture probability
}
model {
  // add some suitable prior on p here


  // the non-zero part
  real lp1 = 0;
  lp1 += normal_lpdf(sigma |  0, 1);
  lp1 += exponential_lpdf(mu | mu_exp);  // make sure to use lpdf and not lupdf here!
  lp1 += normal_lpdf(mu_exp | 0, 2); // and here too!
  lp1 += normal_lpdf(y | mu, sigma);  
  
  // the zero part
  real lp2 = 0;
  lp2 += normal_lpdf(sigma |  0, 1);
  lp2 += normal_lpdf(y | 0, sigma);
  
  // the mixture part
  target += log_mix(p, lp1, lp2);
}
1 Like

Thanks so much for your kind reply!

I haven’t thought that we could define variables (such as lp1 and lp2) in the ‘model’ block. I used to define such items in the ‘transformed’ parameter block as following:

parameters {
  ...
  real<lower = 0, upper = 1> p; // the mixture probability
}
transformed parameters{
  // the non-zero part
  real lp1 = 0;
  lp1 += normal_lpdf(sigma |  0, 1);
  lp1 += exponential_lpdf(mu | mu_exp);  // make sure to use lpdf and not lupdf here!
  lp1 += normal_lpdf(mu_exp | 0, 2); // and here too!
  lp1 += normal_lpdf(y | mu, sigma);  

  // the zero part
  real lp2 = 0;
  lp2 += normal_lpdf(sigma |  0, 1);
  lp2 += normal_lpdf(y | 0, sigma);
}
model {
  // add some suitable prior on p here
  // the mixture part
  target += log_mix(p, lp1, lp2);
}

May I ask that does my code serve the same purpose as yours?

Thanks so much!

The only difference is that ones in the transformed parameters block are saved in the output. If you do want these in transformed parameters, I’d suggest coding this more simply as a direct sum.

real lp_nonzero = normal_lpdf(sigma |  0, 1)
    + exponential_lpdf(mu | mu_exp)
    + normal_lpdf(mu_exp | 0, 2)
    + normal_lpdf(y | mu, sigma);  
real lp_zero =normal_lpdf(sigma |  0, 1)
    + normal_lpdf(y | 0, sigma);

I renamed the variables and removed the comments.

1 Like