Question About Implementation of Log Likelihood for a Zero-Inflated Beta Binomial Model in R

In my search for a solution to the issue I’ve been having with parallel computation when working with brms’ custom families functionality and some issues I encountered with the beta binomial model severely overpredicting the number of zeros in my analysis of women’s cabinet representation, I’ve been working on implementing native support for zero-inflated beta binomial regression in brms (also addresses issue #1243).

Following the instructions provided by @paul.buerkner in the custom families vignette, I’ve managed to get most of the required functions written up and my test case model converges and posterior predictive checks look good (or at least better than they did with the non-zero inflated beta binomial). However, I seem to have run into an issue as to how to properly specify the function for calculating the log likelihood.

In a custom family based on the vignette, it should be something like this

# Define a function to calculate the log likelihood----
log_lik_zero_inflated_beta_binomial <- function(i, prep) {
  mu <- brms::get_dpar(prep, "mu", i = i)
  phi <- brms::get_dpar(prep, "phi", i = i)
  zi <- brms::get_dpar(prep, "zi", i = i)
  trials <- prep$data$trials[i]
  y <- prep$data$Y[i]
  zero_inflated_beta_binomial_lpmf(y, trials, mu, phi, zi)

where zero_inflated_beta_binomial_lpmf is

"/* zero-inflated beta binomial log-PDF of a single response
* Args: 
*   y: the response value 
*   trials: number of trials of the binomial part
*   mu: location parameter of the beta binomial part
*   phi: dispersion parameter of the beta binomial part
*   zi: zero-inflation probability
* Returns:
*   a scalar to be added to the log posterior 
  real zero_inflated_beta_binomial_lpmf(
    int y, int trials, real mu, real phi, real zi
  ) {
    if (y == 0) { 
      return log_sum_exp(bernoulli_lpmf(1 | zi), 
                         bernoulli_lpmf(0 | zi) + 
                           beta_binomial_lpmf(0 | trials, mu * phi, (1 - mu) * phi));
    } else { 
      return bernoulli_lpmf(0 | zi) +
        beta_binomial_lpmf(y | trials, mu * phi, (1 - mu) * phi);

and needs to be loaded into R via expose_functions. For native support, however, that last line needs to be rewritten in R instead of Stan and I’m not quite sure what the best way to do that is and I’m hoping someone here might have a suggestion.

Attached is the code for both the version using custom_family and what I’ve written and tested of the functions required for native support.
06_ZI_Beta_Binomial_Family.R (7.6 KB)

1 Like

Not sure I understand the question: is the problem that you don’t know what the R counterparts for the relevant Stan functions is? If that’s the case then I can help: beta_binomial_lpmf should be available from brms, binomial_lpmf is available in base R via dbinom(..., log = TRUE), bernoulli is a special case of binomial. log_sum_exp is either available via brms (can’t check I am on phone), but if not, it is very short and an R version is easy to Google.

Does that answer your question?

Best of luck with the custom family!