Posterior predictive check (RNG) for Zero-One Inflated Beta

i am working with a ZOIB model in cmdstanr and would like to add predictions to generated quantities.

I’m using the following function for ZOIB

real zero_one_inflated_beta_lpdf(real y, real mu, real phi, real zoi,
                                                 real coi) {
                        row_vector[2] shape = [mu * phi, (1 - mu) * phi];
                        if (y == 0) {
                                return bernoulli_lpmf(1 | zoi) + bernoulli_lpmf(0 | coi);
                        } else if (y == 1) {
                                return bernoulli_lpmf(1 | zoi) + bernoulli_lpmf(1 | coi);
                        } else {
                                return bernoulli_lpmf(0 | zoi) + beta_lpdf(y | shape[1], shape[2]);

For a zero-inflated negative-binomial model, I used this:

for (n in 1 : N) {
            y_pred[n]=(1-zero)*neg_binomial_2_log_rng(mu[n], shape);

Is the following a correct adaptation for this ZOIB model?

for (i in 1 : N) {
        zero = bernoulli_rng(zoi);
        one = bernoulli_rng(coi);
        ypred[i] = one ? 1 : (1-zero) * beta_rng(mu[i], phi);

For the zero-inflated negative binomial, I’d recommend coding more like in your ZOIB example:

for (n in 1:N) {
  y_pred[n] = bernoulli_logit_rng(zi[n]) ? neg_binomail_2_log_rng(mu[n], shape) : 0;

I think there’s a problem with the way you code the zero and one inflated negative binomial. I don’t think you can get away with independent bernoulli variates for 0-inflation and 1-inflation because then it makes sense for them both to be true. I think what you want is a process that looks like this:

if (bernoulli(zero_inflation)) y = 0
else if (bernoulli(one_inflation)) y = 1
else y = beta_rng(...)

So implicitly, the probability of one inflation is one_infliation * (1 - zero_inflation).

So I think you second line should be 0|zoi + 1 | com. Otherwise that looks right to me given that the beta can’t return 0 or 1 other than by underflow or rounding.

I’m a bit confused by beta_rng(mu[I], phi) as we parameterize in terms of two scales here unless you use the alternative mean plus total scale, which has a different name.

I wasn’t sure what you were doing in the code with zero and one. In general, code is easier to follow when you declare variables close to where they’re used. For scalars, there’s no memory advantage to predeclaring like for containers.


When you said "So I think you second line should be 0|zoi + 1 | com. " Which second line? In the _lpdf function?