Sampling from zero-inflated Beta distribution

I am struggling to write down the model to sample from a zero-inflated Beta distribution in Stan. Can anyone help me out with the code?

I am specifically trying to have X = \begin{cases} 0 & \ \text{with probability} \ \pi \\ \beta(a,b) & \ \text{with probability} \ 1-\pi \end{cases}

Can this be modelled correctly in Stan?

A useful way to see Stan code for a variety of response distributions is via the make_stancode function in brms. In this case:

library(brms)
df <- data.frame(x = 0, y = .5)
make_stancode(y ~ x, family = zero_inflated_beta(), data = df)

outputs model code including:

 /* zero-inflated beta log-PDF of a single response
   * Args:
   *   y: the response value
   *   mu: mean parameter of the beta distribution
   *   phi: precision parameter of the beta distribution
   *   zi: zero-inflation probability
   * Returns:
   *   a scalar to be added to the log posterior
   */
   real zero_inflated_beta_lpdf(real y, real mu, real phi, real zi) {
     row_vector[2] shape = [mu * phi, (1 - mu) * phi];
     if (y == 0) {
       return bernoulli_lpmf(1 | zi);
     } else {
       return bernoulli_lpmf(0 | zi) +
              beta_lpdf(y | shape[1], shape[2]);
     }
   }
  /* zero-inflated beta log-PDF of a single response
   * logit parameterization of the zero-inflation part
   * Args:
   *   y: the response value
   *   mu: mean parameter of the beta distribution
   *   phi: precision parameter of the beta distribution
   *   zi: linear predictor for zero-inflation part
   * Returns:
   *   a scalar to be added to the log posterior
   */
   real zero_inflated_beta_logit_lpdf(real y, real mu, real phi, real zi) {
     row_vector[2] shape = [mu * phi, (1 - mu) * phi];
     if (y == 0) {
       return bernoulli_logit_lpmf(1 | zi);
     } else {
       return bernoulli_logit_lpmf(0 | zi) +
              beta_lpdf(y | shape[1], shape[2]);
     }
   }
  // zero-inflated beta log-CCDF and log-CDF functions
  real zero_inflated_beta_lccdf(real y, real mu, real phi, real zi) {
    row_vector[2] shape = [mu * phi, (1 - mu) * phi];
    return bernoulli_lpmf(0 | zi) + beta_lccdf(y | shape[1], shape[2]);
  }
  real zero_inflated_beta_lcdf(real y, real mu, real phi, real zi) {
    return log1m_exp(zero_inflated_beta_lccdf(y | mu, phi, zi));
  }

The idea here is to marginalize over two distinct possibilities; that of being in the zero-inflated state and that of being in the non-zero state.

Note that the above code snippets are generated by brms and therefore licensed under GPL.

1 Like