Help predicting from zero-inflated binomial

JLC · November 29, 2021, 6:19pm

I am working on a zero-inflated binomial model with tremendous assistance from brms. My data are single trial response vs no-response, but there are a lot of zeros.

I’m trying to fit the model within cmdstanr, which is fine as-is, but I would also like to do some posterior predictive checks.

brms uses some custom functions for this model

/* zero-inflated binomial log-PDF of a single response 
   * Args: 
   *   y: the response value 
   *   trials: number of trials of the binomial part
   *   theta: probability parameter of the binomial part
   *   zi: zero-inflation probability
   * Returns:  
   *   a scalar to be added to the log posterior 
   */ 
  real zero_inflated_binomial_lpmf(int y, int trials, 
                                   real theta, real zi) {
    if (y == 0) { 
      return log_sum_exp(bernoulli_lpmf(1 | zi), 
                         bernoulli_lpmf(0 | zi) + 
                         binomial_lpmf(0 | trials, theta)); 
    } else { 
      return bernoulli_lpmf(0 | zi) +  
             binomial_lpmf(y | trials, theta); 
    } 
  }
  /* zero-inflated binomial log-PDF of a single response 
   * logit parameterization of the zero-inflation part
   * Args: 
   *   y: the response value 
   *   trials: number of trials of the binomial part
   *   theta: probability parameter of the binomial part 
   *   zi: linear predictor for zero-inflation part 
   * Returns:  
   *   a scalar to be added to the log posterior 
   */ 
  real zero_inflated_binomial_logit_lpmf(int y, int trials, 
                                         real theta, real zi) {
    if (y == 0) { 
      return log_sum_exp(bernoulli_logit_lpmf(1 | zi), 
                         bernoulli_logit_lpmf(0 | zi) + 
                         binomial_lpmf(0 | trials, theta)); 
    } else { 
      return bernoulli_logit_lpmf(0 | zi) +  
             binomial_lpmf(y | trials, theta); 
    } 
  }
  /* zero-inflated binomial log-PDF of a single response 
   * logit parameterization of the binomial part
   * Args: 
   *   y: the response value 
   *   trials: number of trials of the binomial part
   *   eta: linear predictor for binomial part 
   *   zi: zero-inflation probability
   * Returns:  
   *   a scalar to be added to the log posterior 
   */ 
  real zero_inflated_binomial_blogit_lpmf(int y, int trials, 
                                          real eta, real zi) {
    if (y == 0) { 
      return log_sum_exp(bernoulli_lpmf(1 | zi), 
                         bernoulli_lpmf(0 | zi) + 
                         binomial_logit_lpmf(0 | trials, eta)); 
    } else { 
      return bernoulli_lpmf(0 | zi) +  
             binomial_logit_lpmf(y | trials, eta); 
    } 
                                          }
  /* zero-inflated binomial log-PDF of a single response 
   * logit parameterization of the binomial part
   * logit parameterization of the zero-inflation part
   * Args: 
   *   y: the response value 
   *   trials: number of trials of the binomial part
   *   eta: linear predictor for binomial part 
   *   zi: linear predictor for zero-inflation part 
   * Returns:  
   *   a scalar to be added to the log posterior 
   */ 
  real zero_inflated_binomial_blogit_logit_lpmf(int y, int trials, 
                                                real eta, real zi) {
    if (y == 0) { 
      return log_sum_exp(bernoulli_logit_lpmf(1 | zi), 
                         bernoulli_logit_lpmf(0 | zi) + 
                         binomial_logit_lpmf(0 | trials, eta)); 
    } else { 
      return bernoulli_logit_lpmf(0 | zi) +  
             binomial_logit_lpmf(y | trials, eta); 
    } 
  }
  // zero-inflated binomial log-CCDF and log-CDF functions
  real zero_inflated_binomial_lccdf(int y, int trials, real theta, real zi) { 
    return bernoulli_lpmf(0 | zi) + binomial_lccdf(y | trials, theta); 
  }
  real zero_inflated_binomial_lcdf(int y, int trials, real theta, real zi) { 
    return log1m_exp(zero_inflated_binomial_lccdf(y | trials, theta, zi));
  }

I want to add some predictions into the generated quantities block, so I’m trying to define a zero_inflated_binomial_blogit_rng function, but what I’m doing doesn’t seem to be working.

I added this to functions:

 real zero_inflated_binomial_blogit_rng(int y, int trials, 
                                          real eta, real zi) {
  if (y == 0) { 
      return log_sum_exp(bernoulli_logit_lpmf(1 | zi), 
                         bernoulli_logit_lpmf(0 | zi) + 
                         binomial_logit_lpmf(0 | trials, eta)); 
    } 
    if (zi < 0 ) {
    return bernoulli_rng(0) +  
             binomial_rng(trials, inv_logit(eta));
    }
    if (zi > 1 ) {
    return bernoulli_rng(1) +  
             binomial_rng(trials, inv_logit(eta));
    }
    else {
      return bernoulli_rng(zi) +  
             binomial_rng(trials, inv_logit(eta)); 
    } 
                                          }

And this to generated quantities:

for (n in 1:N) {
    if (zi_pred[n] < 0) Y_pred[n] = zero_inflated_binomial_blogit_rng(Y[n], trials[n], mu_pred[n], 0);
    if (zi_pred[n] > 1) Y_pred[n] = zero_inflated_binomial_blogit_rng(Y[n], trials[n], mu_pred[n], 1);
    else
      Y_pred[n] = zero_inflated_binomial_blogit_rng(Y[n], trials[n], mu_pred[n], zi_pred[n]);
    }

But as defined in the function, this is giving me reals rather than binary predictions.

Can anyone suggest how I can create something to give me predicted 1’s and 0’s?

JLC · December 2, 2021, 7:30pm

I looked through how brms handles predictions for these models and found this:

posterior_predict_zero_inflated_binomial <- function(i, prep, ...) {
  # theta is the bernoulii zero-inflation parameter
  theta <- get_dpar(prep, "zi", i = i)
  trials <- prep$data$trials[i]
  prob <- get_dpar(prep, "mu", i = i)
  ndraws <- prep$ndraws
  # compare with theta to incorporate the zero-inflation process
  zi <- runif(ndraws, 0, 1)
  ifelse(zi < theta, 0, rbinom(ndraws, size = trials, prob = prob))
}

So I added the following to generated quantities:

  theta[N] = uniform_rng(0,1);
 for (n in 1:N) {
      // add more terms to the linear predictor
      mu_pred[n] += r_1_1[J_1[n]] * Z_1_1[n] + r_1_2[J_1[n]] * Z_1_2[n] + r_1_3[J_1[n]] * Z_1_3[n] + r_1_4[J_1[n]] * Z_1_4[n] + r_1_5[J_1[n]] * Z_1_5[n];
      // add more terms to the linear predictor
      zi_pred[n] += r_2_zi_1[J_2[n]] * Z_2_zi_1[n] + r_2_zi_2[J_2[n]] * Z_2_zi_2[n] + r_2_zi_3[J_2[n]] * Z_2_zi_3[n] + r_2_zi_4[J_2[n]] * Z_2_zi_4[n] + r_2_zi_5[J_2[n]] * Z_2_zi_5[n];
    if (theta[n] < zi_pred[n]) Y_pred[n] = 0;
    else
    Y_pred[n] = binomial_rng(trials[n], inv_logit(mu_pred[n]));
    }

But, the predictions are very different from using posterior_predict.

Can anyone spot what I’m doing wrong?

ajnafa · December 2, 2021, 10:03pm

Why not just set backend = "cmdstanr" in brm and use brms’ convivence functions to do the predictions after the fact in R?

JLC · December 2, 2021, 10:13pm

I’m trying to get a better understanding of how it’s working by working through the process, so I can make changes to the generated Stan code.

JLC · December 3, 2021, 1:24am

I’m processing the predictions two ways using draws from the fitted model

pred_samples <- draws_of(as_draws_rvars(Fit$draws())$Y_pred)
prob <- draws_of(as_draws_rvars(Fit$draws())$prob_pred)
theta <- draws_of(as_draws_rvars(Fit$draws())$theta_pred)

I’m looking at Y_Pred directly

or processing the prob and theta parameters to calculate a prediction

Preds <- cbind(as_tibble(t(prob)) %>%
        pivot_longer(`1`:`8000`, names_to = ".draw", values_to = "prob") %>% 
        mutate(.draw = as.numeric(.draw),
               prob = plogis(prob)) %>%
        ungroup(),  
        as_tibble(t(theta)) %>%
        pivot_longer(`1`:`8000`, names_to = ".draw", values_to = "theta") %>% 
        mutate(.draw = as.numeric(.draw),
               theta = plogis(theta)) %>%
        ungroup())

Predictions <- Preds  %>%
        mutate(zi = runif(nrow(.), 0, 1),
               Pred = ifelse(zi < theta , 0, rbinom(1, size = 1, prob = prob)))

@paul.buerkner can you see where I’m going wrong?

I’m using the logit link for both mu and zi

functions {
 /* compute correlated group-level effects
  * Args: 
  *   z: matrix of unscaled group-level effects
  *   SD: vector of standard deviation parameters
  *   L: cholesky factor correlation matrix
  * Returns: 
  *   matrix of scaled group-level effects
  */ 
  matrix scale_r_cor(matrix z, vector SD, matrix L) {
    // r is stored in another dimension order than z
    return transpose(diag_pre_multiply(SD, L) * z);
  }
  /* zero-inflated binomial log-PDF of a single response 
   * Args: 
   *   y: the response value 
   *   trials: number of trials of the binomial part
   *   theta: probability parameter of the binomial part
   *   zi: zero-inflation probability
   * Returns:  
   *   a scalar to be added to the log posterior 
   */ 
  real zero_inflated_binomial_lpmf(int y, int trials, 
                                   real theta, real zi) {
    if (y == 0) { 
      return log_sum_exp(bernoulli_lpmf(1 | zi), 
                         bernoulli_lpmf(0 | zi) + 
                         binomial_lpmf(0 | trials, theta)); 
    } else { 
      return bernoulli_lpmf(0 | zi) +  
             binomial_lpmf(y | trials, theta); 
    } 
  }
  /* zero-inflated binomial log-PDF of a single response 
   * logit parameterization of the zero-inflation part
   * Args: 
   *   y: the response value 
   *   trials: number of trials of the binomial part
   *   theta: probability parameter of the binomial part 
   *   zi: linear predictor for zero-inflation part 
   * Returns:  
   *   a scalar to be added to the log posterior 
   */ 
  real zero_inflated_binomial_logit_lpmf(int y, int trials, 
                                         real theta, real zi) {
    if (y == 0) { 
      return log_sum_exp(bernoulli_logit_lpmf(1 | zi), 
                         bernoulli_logit_lpmf(0 | zi) + 
                         binomial_lpmf(0 | trials, theta)); 
    } else { 
      return bernoulli_logit_lpmf(0 | zi) +  
             binomial_lpmf(y | trials, theta); 
    } 
  }
  /* zero-inflated binomial log-PDF of a single response 
   * logit parameterization of the binomial part
   * Args: 
   *   y: the response value 
   *   trials: number of trials of the binomial part
   *   eta: linear predictor for binomial part 
   *   zi: zero-inflation probability
   * Returns:  
   *   a scalar to be added to the log posterior 
   */ 
  real zero_inflated_binomial_blogit_lpmf(int y, int trials, 
                                          real eta, real zi) {
    if (y == 0) { 
      return log_sum_exp(bernoulli_lpmf(1 | zi), 
                         bernoulli_lpmf(0 | zi) + 
                         binomial_logit_lpmf(0 | trials, eta)); 
    } else { 
      return bernoulli_lpmf(0 | zi) +  
             binomial_logit_lpmf(y | trials, eta); 
    } 
                                          }
  /* zero-inflated binomial log-PDF of a single response 
   * logit parameterization of the binomial part
   * logit parameterization of the zero-inflation part
   * Args: 
   *   y: the response value 
   *   trials: number of trials of the binomial part
   *   eta: linear predictor for binomial part 
   *   zi: linear predictor for zero-inflation part 
   * Returns:  
   *   a scalar to be added to the log posterior 
   */ 
  real zero_inflated_binomial_blogit_logit_lpmf(int y, int trials, 
                                                real eta, real zi) {
    if (y == 0) { 
      return log_sum_exp(bernoulli_logit_lpmf(1 | zi), 
                         bernoulli_logit_lpmf(0 | zi) + 
                         binomial_logit_lpmf(0 | trials, eta)); 
    } else { 
      return bernoulli_logit_lpmf(0 | zi) +  
             binomial_logit_lpmf(y | trials, eta); 
    } 
  }
  // zero-inflated binomial log-CCDF and log-CDF functions
  real zero_inflated_binomial_lccdf(int y, int trials, real theta, real zi) { 
    return bernoulli_lpmf(0 | zi) + binomial_lccdf(y | trials, theta); 
  }
  real zero_inflated_binomial_lcdf(int y, int trials, real theta, real zi) { 
    return log1m_exp(zero_inflated_binomial_lccdf(y | trials, theta, zi));
  }
}
data {
  int<lower=1> N;  // total number of observations
  int Y[N];  // response variable
  int trials[N];  // number of trials
  int<lower=1> K;  // number of population-level effects
  matrix[N, K] X;  // population-level design matrix
  int<lower=1> K_zi;  // number of population-level effects
  matrix[N, K_zi] X_zi;  // population-level design matrix
  // data for group-level effects of ID 1
  int<lower=1> N_1;  // number of grouping levels
  int<lower=1> M_1;  // number of coefficients per level
  int<lower=1> J_1[N];  // grouping indicator per observation
  // group-level predictor values
  vector[N] Z_1_1;
  vector[N] Z_1_2;
  vector[N] Z_1_3;
  vector[N] Z_1_4;
  vector[N] Z_1_5;
  int<lower=1> NC_1;  // number of group-level correlations
  // data for group-level effects of ID 2
  int<lower=1> N_2;  // number of grouping levels
  int<lower=1> M_2;  // number of coefficients per level
  int<lower=1> J_2[N];  // grouping indicator per observation
  // group-level predictor values
  vector[N] Z_2_zi_1;
  vector[N] Z_2_zi_2;
  vector[N] Z_2_zi_3;
  vector[N] Z_2_zi_4;
  vector[N] Z_2_zi_5;
  int<lower=1> NC_2;  // number of group-level correlations
  int prior_only;  // should the likelihood be ignored?
}
transformed data {
  int Kc = K - 1;
  matrix[N, Kc] Xc;  // centered version of X without an intercept
  vector[Kc] means_X;  // column means of X before centering
  int Kc_zi = K_zi - 1;
  matrix[N, Kc_zi] Xc_zi;  // centered version of X_zi without an intercept
  vector[Kc_zi] means_X_zi;  // column means of X_zi before centering
  for (i in 2:K) {
    means_X[i - 1] = mean(X[, i]);
    Xc[, i - 1] = X[, i] - means_X[i - 1];
  }
  for (i in 2:K_zi) {
    means_X_zi[i - 1] = mean(X_zi[, i]);
    Xc_zi[, i - 1] = X_zi[, i] - means_X_zi[i - 1];
  }
}
parameters {
  vector[Kc] b;  // population-level effects
  real Intercept;  // temporary intercept for centered predictors
  vector[Kc_zi] b_zi;  // population-level effects
  real Intercept_zi;  // temporary intercept for centered predictors
  vector<lower=0>[M_1] sd_1;  // group-level standard deviations
  matrix[M_1, N_1] z_1;  // standardized group-level effects
  cholesky_factor_corr[M_1] L_1;  // cholesky factor of correlation matrix
  vector<lower=0>[M_2] sd_2;  // group-level standard deviations
  matrix[M_2, N_2] z_2;  // standardized group-level effects
  cholesky_factor_corr[M_2] L_2;  // cholesky factor of correlation matrix
}
transformed parameters {
  matrix[N_1, M_1] r_1;  // actual group-level effects
  // using vectors speeds up indexing in loops
  vector[N_1] r_1_1;
  vector[N_1] r_1_2;
  vector[N_1] r_1_3;
  vector[N_1] r_1_4;
  vector[N_1] r_1_5;
  matrix[N_2, M_2] r_2;  // actual group-level effects
  // using vectors speeds up indexing in loops
  vector[N_2] r_2_zi_1;
  vector[N_2] r_2_zi_2;
  vector[N_2] r_2_zi_3;
  vector[N_2] r_2_zi_4;
  vector[N_2] r_2_zi_5;
  // compute actual group-level effects
  r_1 = scale_r_cor(z_1, sd_1, L_1);
  r_1_1 = r_1[, 1];
  r_1_2 = r_1[, 2];
  r_1_3 = r_1[, 3];
  r_1_4 = r_1[, 4];
  r_1_5 = r_1[, 5];
  // compute actual group-level effects
  r_2 = scale_r_cor(z_2, sd_2, L_2);
  r_2_zi_1 = r_2[, 1];
  r_2_zi_2 = r_2[, 2];
  r_2_zi_3 = r_2[, 3];
  r_2_zi_4 = r_2[, 4];
  r_2_zi_5 = r_2[, 5];
}
model {
  // likelihood not including constants
  if (!prior_only) {
    // initialize linear predictor term
    vector[N] mu = Intercept + Xc * b;
    // initialize linear predictor term
    vector[N] zi = Intercept_zi + Xc_zi * b_zi;
    for (n in 1:N) {
      // add more terms to the linear predictor
      mu[n] += r_1_1[J_1[n]] * Z_1_1[n] + r_1_2[J_1[n]] * Z_1_2[n] + r_1_3[J_1[n]] * Z_1_3[n] + r_1_4[J_1[n]] * Z_1_4[n] + r_1_5[J_1[n]] * Z_1_5[n];
    }
    for (n in 1:N) {
      // add more terms to the linear predictor
      zi[n] += r_2_zi_1[J_2[n]] * Z_2_zi_1[n] + r_2_zi_2[J_2[n]] * Z_2_zi_2[n] + r_2_zi_3[J_2[n]] * Z_2_zi_3[n] + r_2_zi_4[J_2[n]] * Z_2_zi_4[n] + r_2_zi_5[J_2[n]] * Z_2_zi_5[n];
    }
    for (n in 1:N) {
      target += zero_inflated_binomial_blogit_logit_lpmf(Y[n] | trials[n], mu[n], zi[n]);
    }
  }
  // priors not including constants
  target += normal_lupdf(b | 0,1);
  target += normal_lupdf(Intercept | 0,1);
  target += logistic_lupdf(Intercept_zi | 0, 1);
  target += exponential_lupdf(sd_1 | 1);
  target += std_normal_lupdf(to_vector(z_1));
  target += lkj_corr_cholesky_lupdf(L_1 | 1);
  target += student_t_lupdf(sd_2 | 3, 0, 2.5);
  target += std_normal_lupdf(to_vector(z_2));
  target += lkj_corr_cholesky_lupdf(L_2 | 1);
}
generated quantities {
  vector[N] prob_pred = Intercept + Xc * b;
   vector[N] theta_pred = Intercept_zi + Xc_zi * b_zi;
      vector[N] Y_pred; 
   vector[N] zi; 
   zi[N] = uniform_rng(0,1); 
  // actual population-level intercept
  real b_Intercept = Intercept - dot_product(means_X, b);
  // actual population-level intercept
  real b_zi_Intercept = Intercept_zi - dot_product(means_X_zi, b_zi);
  // compute group-level correlations
  corr_matrix[M_1] Cor_1 = multiply_lower_tri_self_transpose(L_1);
  vector<lower=-1,upper=1>[NC_1] cor_1;
  // compute group-level correlations
  corr_matrix[M_2] Cor_2 = multiply_lower_tri_self_transpose(L_2);
  vector<lower=-1,upper=1>[NC_2] cor_2;
  // extract upper diagonal of correlation matrix
  for (k in 1:M_1) {
    for (j in 1:(k - 1)) {
      cor_1[choose(k - 1, 2) + j] = Cor_1[j, k];
    }
  }
  // extract upper diagonal of correlation matrix
  for (k in 1:M_2) {
    for (j in 1:(k - 1)) {
      cor_2[choose(k - 1, 2) + j] = Cor_2[j, k];
    }
  }
  for (n in 1:N) {
      // add more terms to the linear predictor
      prob_pred[n] += r_1_1[J_1[n]] * Z_1_1[n] + r_1_2[J_1[n]] * Z_1_2[n] + r_1_3[J_1[n]] * Z_1_3[n] + r_1_4[J_1[n]] * Z_1_4[n] + r_1_5[J_1[n]] * Z_1_5[n];
      // add more terms to the linear predictor
      theta_pred[n] += r_2_zi_1[J_2[n]] * Z_2_zi_1[n] + r_2_zi_2[J_2[n]] * Z_2_zi_2[n] + r_2_zi_3[J_2[n]] * Z_2_zi_3[n] + r_2_zi_4[J_2[n]] * Z_2_zi_4[n] + r_2_zi_5[J_2[n]] * Z_2_zi_5[n];
    if (zi[n] < theta_pred[n]) Y_pred[n] = 0;
    else
    Y_pred[n] = binomial_rng(trials[n], inv_logit(prob_pred[n]));
    }
}

JLC · December 5, 2021, 12:20am

In my latest attempt, I move the mu and zi parameters to the transformed parameters block, so I could recover those directly:

transformed parameters {
  matrix[N_1, M_1] r_1;  // actual group-level effects
  // using vectors speeds up indexing in loops
  vector[N_1] r_1_1;
  vector[N_1] r_1_2;
  vector[N_1] r_1_3;
  vector[N_1] r_1_4;
  vector[N_1] r_1_5;
  matrix[N_2, M_2] r_2;  // actual group-level effects
  // using vectors speeds up indexing in loops
  vector[N_2] r_2_zi_1;
  vector[N_2] r_2_zi_2;
  vector[N_2] r_2_zi_3;
  vector[N_2] r_2_zi_4;
  vector[N_2] r_2_zi_5;
  // compute actual group-level effects
  r_1 = scale_r_cor(z_1, sd_1, L_1);
  r_1_1 = r_1[, 1];
  r_1_2 = r_1[, 2];
  r_1_3 = r_1[, 3];
  r_1_4 = r_1[, 4];
  r_1_5 = r_1[, 5];
  // compute actual group-level effects
  r_2 = scale_r_cor(z_2, sd_2, L_2);
  r_2_zi_1 = r_2[, 1];
  r_2_zi_2 = r_2[, 2];
  r_2_zi_3 = r_2[, 3];
  r_2_zi_4 = r_2[, 4];
  r_2_zi_5 = r_2[, 5];
  vector[N] mu = X * b;
    // initialize linear predictor term
    vector[N] zi = Intercept_zi + Xc_zi * b_zi;
    for (n in 1:N) {
      // add more terms to the linear predictor
      mu[n] += r_1_1[J_1[n]] * Z_1_1[n] + r_1_2[J_1[n]] * Z_1_2[n] + r_1_3[J_1[n]] * Z_1_3[n] + r_1_4[J_1[n]] * Z_1_4[n] + r_1_5[J_1[n]] * Z_1_5[n];
    }
    for (n in 1:N) {
      // add more terms to the linear predictor
      zi[n] += r_2_zi_1[J_2[n]] * Z_2_zi_1[n] + r_2_zi_2[J_2[n]] * Z_2_zi_2[n] + r_2_zi_3[J_2[n]] * Z_2_zi_3[n] + r_2_zi_4[J_2[n]] * Z_2_zi_4[n] + r_2_zi_5[J_2[n]] * Z_2_zi_5[n];
    }
}

Then I tried calculating the predictions using draws of those parameters:

Preds <- 
        spread_draws(Ideal_Fit, c(mu, zi)[rownum], ndraws = 250) %>% 
        group_by(rownum) %>%
        mutate(mu = plogis(mu),
               zi = plogis(zi),
               theta = runif(250, 0,1)) %>%
        ungroup() %>%
        mutate(Pred = ifelse(theta < zi, 0, rbinom(1, size = 1, prob = mu)))

But still the predictions are way off. Whereas if I fit the same model with brms and use add_predicted_draws, the predictions are much, much better.

JLC · December 5, 2021, 8:27pm

I’m at a loss.

I tried first fitting everything in brms then taking the stancode and standata from the fitted model and putting those back in to an empty brms model:

Mod$fit <- read_stan_csv(cmd_Fit$output_files())

Mod <- rename_pars(Mod)

Then using add_predicted_draws from tidybayes on both the brms model and the one with the outside fit inserted in to the brms object. The results are completely different.

I’m not sure what is happening. Up next I’m trying the rstan backend to see if that gives me reproducible results.

JLC · December 6, 2021, 1:14am

The problem persists with rstan

ajnafa · December 6, 2021, 7:30am

It looks like you’re subsetting draws from the posterior without setting the rng seed to a constant value. You either need to set the seed or use all of the posterior draws from the model otherwise you’ll get different predictions every time you run the code.

Also, be sure to set the seed argument in your call to brm or cmdstanr::model$sample to ensure reproducibility of the model itself.

JLC · December 6, 2021, 4:16pm

Here is a reprex:

Data.csv (38.0 KB)

# List of needed packages
Pkgs <- c("tidyverse",  "parallel", "cmdstanr",
          "brms", "tidybayes", "tidytext")

# Load packages
lapply(Pkgs, require, c = T)

## Set computing options
ncores = detectCores()

# Read in data
Data <- read.csv(file = here("data", "Data.csv")) %>%
        mutate(ID = factor(ID),
               Group = factor(Group),
               Condition = factor(Condition),
               Type = factor(Type))

brms model

## brms zero inflated binomial formula 
zi_fmla <- bf(Response | trials(1) ~ 1 + 
                            Group + 
                            Condition +
                            Type + 
                            Condition:Type +
                            (1 + Condition + Type | ID),
                    center = FALSE,
                    zi ~ Type + (1 + Condition + Type | ID)
)

## Set priors
zi_priors <- c(
        set_prior("normal(0,1)",
                  class = "b",
                  coef = "Intercept"),
        
        set_prior("normal(0,1)",
                  class = "b"),
        
        set_prior("normal(0,1)",
                  class = "sd")
)

# Fit model in brms
zi_Mod <- brm(zi_fmla,
              Data, 
              family = zero_inflated_binomial(),
              prior = zi_priors, 
              inits = "random", 
              iter = 2000, 
              warmup = 1000, 
              chains = 4,
              cores = ncores, 
              backend = "cmdstan",
              normalize = FALSE,
              control = list(adapt_delta = 0.99,
                             max_treedepth = 14)
)

Look at accuracy

Data %>%
        add_predicted_draws(zi_Mod,
                            value = "Pred",
                            re_formula = NULL) %>%
        ungroup() %>%
        mutate(Correct = ifelse(Response == Pred, 1, 0)) %>%
        group_by(ID, Group, Type, .draw) %>%
        summarize(Accuracy = sum(Correct)/n()) %>%
        ungroup() %>%
        group_by(ID, Group, Type) %>%
        
        point_interval(Accuracy,
                       .width = 0.89,
                       .point = median,
                       .interval = hdci,
                       .simple_names = TRUE,
                       na.rm = TRUE) %>%
        ungroup() %>%
        
        ggplot(aes(reorder_within(ID, -Accuracy, Group), Accuracy)) +
        facet_grid(Group ~ Type, scales = "free_x") +
        geom_point(position = position_dodge(width = 0.5)) + 
        geom_errorbar(aes(
                ymin = .lower,
                ymax = .upper),
                width = 0.2,
                position = position_dodge(width = 0.5)) +
        scale_x_reordered("ID") +
        scale_y_continuous("Accuracy", labels = scales::percent_format(accuracy = 1)) +
        theme_bw() +
        theme_bw() +
        theme(
                axis.text.x = element_text(angle = 60, hjust = 1, size = 10),
                axis.text.y = element_text(size = 10),
                axis.title.x = element_text(size = 12),
                axis.title.y = element_text(size = 12),
                plot.title = element_text(hjust = 0.5),
                panel.border = element_rect(fill = NA, colour = "black"),
                panel.grid.major = element_blank(),
                panel.grid.minor = element_blank(),
                axis.line = element_line(colour = "black"),
                strip.text.x = element_text(size = 12),
                strip.text.y = element_text(size = 12),
                strip.background = element_blank(),
                legend.title = element_blank(),
                legend.text = element_text(size = 10),
                legend.position = c(0.95, 0.05),
                legend.key = element_rect(fill = "transparent"),
                legend.direction = "horizontal",
                legend.background = element_rect(fill = "transparent", size = 6) 
        )

Now fit the model with Stan code generated by stancode(zi_Mod)

# Model code 
Stan_Code <- "functions {
 /* compute correlated group-level effects
  * Args: 
  *   z: matrix of unscaled group-level effects
  *   SD: vector of standard deviation parameters
  *   L: cholesky factor correlation matrix
  * Returns: 
  *   matrix of scaled group-level effects
  */ 
  matrix scale_r_cor(matrix z, vector SD, matrix L) {
    // r is stored in another dimension order than z
    return transpose(diag_pre_multiply(SD, L) * z);
  }
  /* zero-inflated binomial log-PDF of a single response 
   * Args: 
   *   y: the response value 
   *   trials: number of trials of the binomial part
   *   theta: probability parameter of the binomial part
   *   zi: zero-inflation probability
   * Returns:  
   *   a scalar to be added to the log posterior 
   */ 
  real zero_inflated_binomial_lpmf(int y, int trials, 
                                   real theta, real zi) {
    if (y == 0) { 
      return log_sum_exp(bernoulli_lpmf(1 | zi), 
                         bernoulli_lpmf(0 | zi) + 
                         binomial_lpmf(0 | trials, theta)); 
    } else { 
      return bernoulli_lpmf(0 | zi) +  
             binomial_lpmf(y | trials, theta); 
    } 
  }
  /* zero-inflated binomial log-PDF of a single response 
   * logit parameterization of the zero-inflation part
   * Args: 
   *   y: the response value 
   *   trials: number of trials of the binomial part
   *   theta: probability parameter of the binomial part 
   *   zi: linear predictor for zero-inflation part 
   * Returns:  
   *   a scalar to be added to the log posterior 
   */ 
  real zero_inflated_binomial_logit_lpmf(int y, int trials, 
                                         real theta, real zi) {
    if (y == 0) { 
      return log_sum_exp(bernoulli_logit_lpmf(1 | zi), 
                         bernoulli_logit_lpmf(0 | zi) + 
                         binomial_lpmf(0 | trials, theta)); 
    } else { 
      return bernoulli_logit_lpmf(0 | zi) +  
             binomial_lpmf(y | trials, theta); 
    } 
  }
  /* zero-inflated binomial log-PDF of a single response 
   * logit parameterization of the binomial part
   * Args: 
   *   y: the response value 
   *   trials: number of trials of the binomial part
   *   eta: linear predictor for binomial part 
   *   zi: zero-inflation probability
   * Returns:  
   *   a scalar to be added to the log posterior 
   */ 
  real zero_inflated_binomial_blogit_lpmf(int y, int trials, 
                                          real eta, real zi) {
    if (y == 0) { 
      return log_sum_exp(bernoulli_lpmf(1 | zi), 
                         bernoulli_lpmf(0 | zi) + 
                         binomial_logit_lpmf(0 | trials, eta)); 
    } else { 
      return bernoulli_lpmf(0 | zi) +  
             binomial_logit_lpmf(y | trials, eta); 
    } 
  }
  /* zero-inflated binomial log-PDF of a single response 
   * logit parameterization of the binomial part
   * logit parameterization of the zero-inflation part
   * Args: 
   *   y: the response value 
   *   trials: number of trials of the binomial part
   *   eta: linear predictor for binomial part 
   *   zi: linear predictor for zero-inflation part 
   * Returns:  
   *   a scalar to be added to the log posterior 
   */ 
  real zero_inflated_binomial_blogit_logit_lpmf(int y, int trials, 
                                                real eta, real zi) {
    if (y == 0) { 
      return log_sum_exp(bernoulli_logit_lpmf(1 | zi), 
                         bernoulli_logit_lpmf(0 | zi) + 
                         binomial_logit_lpmf(0 | trials, eta)); 
    } else { 
      return bernoulli_logit_lpmf(0 | zi) +  
             binomial_logit_lpmf(y | trials, eta); 
    } 
  }
  // zero-inflated binomial log-CCDF and log-CDF functions
  real zero_inflated_binomial_lccdf(int y, int trials, real theta, real zi) { 
    return bernoulli_lpmf(0 | zi) + binomial_lccdf(y | trials, theta); 
  }
  real zero_inflated_binomial_lcdf(int y, int trials, real theta, real zi) { 
    return log1m_exp(zero_inflated_binomial_lccdf(y | trials, theta, zi));
  }
}
data {
  int<lower=1> N;  // total number of observations
  int Y[N];  // response variable
  int trials[N];  // number of trials
  int<lower=1> K;  // number of population-level effects
  matrix[N, K] X;  // population-level design matrix
  int<lower=1> K_zi;  // number of population-level effects
  matrix[N, K_zi] X_zi;  // population-level design matrix
  // data for group-level effects of ID 1
  int<lower=1> N_1;  // number of grouping levels
  int<lower=1> M_1;  // number of coefficients per level
  int<lower=1> J_1[N];  // grouping indicator per observation
  // group-level predictor values
  vector[N] Z_1_1;
  vector[N] Z_1_2;
  vector[N] Z_1_3;
  int<lower=1> NC_1;  // number of group-level correlations
  // data for group-level effects of ID 2
  int<lower=1> N_2;  // number of grouping levels
  int<lower=1> M_2;  // number of coefficients per level
  int<lower=1> J_2[N];  // grouping indicator per observation
  // group-level predictor values
  vector[N] Z_2_zi_1;
  vector[N] Z_2_zi_2;
  vector[N] Z_2_zi_3;
  int<lower=1> NC_2;  // number of group-level correlations
  int prior_only;  // should the likelihood be ignored?
}
transformed data {
  int Kc_zi = K_zi - 1;
  matrix[N, Kc_zi] Xc_zi;  // centered version of X_zi without an intercept
  vector[Kc_zi] means_X_zi;  // column means of X_zi before centering
  for (i in 2:K_zi) {
    means_X_zi[i - 1] = mean(X_zi[, i]);
    Xc_zi[, i - 1] = X_zi[, i] - means_X_zi[i - 1];
  }
}
parameters {
  vector[K] b;  // population-level effects
  vector[Kc_zi] b_zi;  // population-level effects
  real Intercept_zi;  // temporary intercept for centered predictors
  vector<lower=0>[M_1] sd_1;  // group-level standard deviations
  matrix[M_1, N_1] z_1;  // standardized group-level effects
  cholesky_factor_corr[M_1] L_1;  // cholesky factor of correlation matrix
  vector<lower=0>[M_2] sd_2;  // group-level standard deviations
  matrix[M_2, N_2] z_2;  // standardized group-level effects
  cholesky_factor_corr[M_2] L_2;  // cholesky factor of correlation matrix
}
transformed parameters {
  matrix[N_1, M_1] r_1;  // actual group-level effects
  // using vectors speeds up indexing in loops
  vector[N_1] r_1_1;
  vector[N_1] r_1_2;
  vector[N_1] r_1_3;
  matrix[N_2, M_2] r_2;  // actual group-level effects
  // using vectors speeds up indexing in loops
  vector[N_2] r_2_zi_1;
  vector[N_2] r_2_zi_2;
  vector[N_2] r_2_zi_3;
  // compute actual group-level effects
  r_1 = scale_r_cor(z_1, sd_1, L_1);
  r_1_1 = r_1[, 1];
  r_1_2 = r_1[, 2];
  r_1_3 = r_1[, 3];
  // compute actual group-level effects
  r_2 = scale_r_cor(z_2, sd_2, L_2);
  r_2_zi_1 = r_2[, 1];
  r_2_zi_2 = r_2[, 2];
  r_2_zi_3 = r_2[, 3];
}
model {
  // likelihood not including constants
  if (!prior_only) {
    // initialize linear predictor term
    vector[N] mu = X * b;
    // initialize linear predictor term
    vector[N] zi = Intercept_zi + Xc_zi * b_zi;
    for (n in 1:N) {
      // add more terms to the linear predictor
      mu[n] += r_1_1[J_1[n]] * Z_1_1[n] + r_1_2[J_1[n]] * Z_1_2[n] + r_1_3[J_1[n]] * Z_1_3[n];
    }
    for (n in 1:N) {
      // add more terms to the linear predictor
      zi[n] += r_2_zi_1[J_2[n]] * Z_2_zi_1[n] + r_2_zi_2[J_2[n]] * Z_2_zi_2[n] + r_2_zi_3[J_2[n]] * Z_2_zi_3[n];
    }
    for (n in 1:N) {
      target += zero_inflated_binomial_blogit_logit_lpmf(Y[n] | trials[n], mu[n], zi[n]);
    }
  }
  // priors not including constants
  target += normal_lupdf(b[1] | 0,1);
  target += normal_lupdf(b[2] | 0,1);
  target += normal_lupdf(b[3] | 0,1);
  target += normal_lupdf(b[4] | 0,1);
  target += normal_lupdf(b[5] | 0,1);
  target += logistic_lupdf(Intercept_zi | 0, 1);
  target += normal_lupdf(sd_1 | 0,1);
  target += std_normal_lupdf(to_vector(z_1));
  target += lkj_corr_cholesky_lupdf(L_1 | 1);
  target += student_t_lupdf(sd_2 | 3, 0, 2.5);
  target += std_normal_lupdf(to_vector(z_2));
  target += lkj_corr_cholesky_lupdf(L_2 | 1);
}
generated quantities {
  // actual population-level intercept
  real b_zi_Intercept = Intercept_zi - dot_product(means_X_zi, b_zi);
  // compute group-level correlations
  corr_matrix[M_1] Cor_1 = multiply_lower_tri_self_transpose(L_1);
  vector<lower=-1,upper=1>[NC_1] cor_1;
  // compute group-level correlations
  corr_matrix[M_2] Cor_2 = multiply_lower_tri_self_transpose(L_2);
  vector<lower=-1,upper=1>[NC_2] cor_2;
  // extract upper diagonal of correlation matrix
  for (k in 1:M_1) {
    for (j in 1:(k - 1)) {
      cor_1[choose(k - 1, 2) + j] = Cor_1[j, k];
    }
  }
  // extract upper diagonal of correlation matrix
  for (k in 1:M_2) {
    for (j in 1:(k - 1)) {
      cor_2[choose(k - 1, 2) + j] = Cor_2[j, k];
    }
  }
}
"

Compile and fit

# Compile cmdstan program
Cmd_Mod <- cmdstan_model(stan_file = write_stan_file(Stan_Code), 
                         compile = TRUE)


## Sample from model 
Fit <-Cmd_Mod$sample(
        data = standata(zi_Mod),       
        iter_warmup = 1000,            
        iter_sampling = 2000,         
        chains = 4,  
        max_treedepth = 14, 
        adapt_delta = 0.99)

Fit an empty `brms` model

empty_brms_Mod <- brm(bf(Response | trials(1) ~ 1 + 
                                 Group + 
                                 Condition +
                                 Type + 
                                 Condition:Type +
                                 (1 + Condition + Type | ID),
                         center = FALSE,
                         zi ~ Type + (1 + Condition + Type | ID)),
                       Data,
                       family = zero_inflated_binomial(),
                       iter = 2000,
                       warmup = 1000,
                       chains = 4,
                       cores = ncores,
                       empty = TRUE,
                       backend = "cmdstan",
                       normalize = FALSE,
                       control = list(adapt_delta = 0.99,
                                      max_treedepth = 14)
)

Replace the fit slot with the rstan fit

empty_brms_Mod$fit <- read_stan_csv(Fit$output_files())

empty_brms_Mod <- rename_pars(empty_brms_Mod)

Look at accuracy

Data %>%
        add_predicted_draws(empty_Stan_Mod,
                            value = "Pred",
                            re_formula = NULL) %>%
        ungroup() %>%
        mutate(Correct = ifelse(Response == Pred, 1, 0)) %>%
        group_by(ID, Group, Type, .draw) %>%
        summarize(Accuracy = sum(Correct)/n()) %>%
        ungroup() %>%
        group_by(ID, Group, Type) %>%
        
        point_interval(Accuracy,
                       .width = 0.89,
                       .point = median,
                       .interval = hdci,
                       .simple_names = TRUE,
                       na.rm = TRUE) %>%
        ungroup() %>%
        
        ggplot(aes(reorder_within(ID, -Accuracy, Group), Accuracy)) +
        facet_grid(Group ~ Type, scales = "free_x") +
        geom_point(position = position_dodge(width = 0.5)) + 
        geom_errorbar(aes(
                ymin = .lower,
                ymax = .upper),
                width = 0.2,
                position = position_dodge(width = 0.5)) +
        scale_x_reordered("ID") +
        scale_y_continuous("Accuracy", labels = scales::percent_format(accuracy = 1)) +
        theme_bw() +
        theme_bw() +
        theme(
                axis.text.x = element_text(angle = 60, hjust = 1, size = 10),
                axis.text.y = element_text(size = 10),
                axis.title.x = element_text(size = 12),
                axis.title.y = element_text(size = 12),
                plot.title = element_text(hjust = 0.5),
                panel.border = element_rect(fill = NA, colour = "black"),
                panel.grid.major = element_blank(),
                panel.grid.minor = element_blank(),
                axis.line = element_line(colour = "black"),
                strip.text.x = element_text(size = 12),
                strip.text.y = element_text(size = 12),
                strip.background = element_blank(),
                legend.title = element_blank(),
                legend.text = element_text(size = 10),
                legend.position = c(0.95, 0.05),
                legend.key = element_rect(fill = "transparent"),
                legend.direction = "horizontal",
                legend.background = element_rect(fill = "transparent", size = 6) 
        )

I’d like to use generated quantities rather than fitting an empty model and replacing the fit slot.

JLC · December 6, 2021, 4:41pm

When I try with generated quantities, I get a very different set of predictions:

## Model code 
Stan_Code <- "functions {
 /* compute correlated group-level effects
  * Args: 
  *   z: matrix of unscaled group-level effects
  *   SD: vector of standard deviation parameters
  *   L: cholesky factor correlation matrix
  * Returns: 
  *   matrix of scaled group-level effects
  */ 
  matrix scale_r_cor(matrix z, vector SD, matrix L) {
    // r is stored in another dimension order than z
    return transpose(diag_pre_multiply(SD, L) * z);
  }
  /* zero-inflated binomial log-PDF of a single response 
   * Args: 
   *   y: the response value 
   *   trials: number of trials of the binomial part
   *   theta: probability parameter of the binomial part
   *   zi: zero-inflation probability
   * Returns:  
   *   a scalar to be added to the log posterior 
   */ 
  real zero_inflated_binomial_lpmf(int y, int trials, 
                                   real theta, real zi) {
    if (y == 0) { 
      return log_sum_exp(bernoulli_lpmf(1 | zi), 
                         bernoulli_lpmf(0 | zi) + 
                         binomial_lpmf(0 | trials, theta)); 
    } else { 
      return bernoulli_lpmf(0 | zi) +  
             binomial_lpmf(y | trials, theta); 
    } 
  }
  /* zero-inflated binomial log-PDF of a single response 
   * logit parameterization of the zero-inflation part
   * Args: 
   *   y: the response value 
   *   trials: number of trials of the binomial part
   *   theta: probability parameter of the binomial part 
   *   zi: linear predictor for zero-inflation part 
   * Returns:  
   *   a scalar to be added to the log posterior 
   */ 
  real zero_inflated_binomial_logit_lpmf(int y, int trials, 
                                         real theta, real zi) {
    if (y == 0) { 
      return log_sum_exp(bernoulli_logit_lpmf(1 | zi), 
                         bernoulli_logit_lpmf(0 | zi) + 
                         binomial_lpmf(0 | trials, theta)); 
    } else { 
      return bernoulli_logit_lpmf(0 | zi) +  
             binomial_lpmf(y | trials, theta); 
    } 
  }
  /* zero-inflated binomial log-PDF of a single response 
   * logit parameterization of the binomial part
   * Args: 
   *   y: the response value 
   *   trials: number of trials of the binomial part
   *   eta: linear predictor for binomial part 
   *   zi: zero-inflation probability
   * Returns:  
   *   a scalar to be added to the log posterior 
   */ 
  real zero_inflated_binomial_blogit_lpmf(int y, int trials, 
                                          real eta, real zi) {
    if (y == 0) { 
      return log_sum_exp(bernoulli_lpmf(1 | zi), 
                         bernoulli_lpmf(0 | zi) + 
                         binomial_logit_lpmf(0 | trials, eta)); 
    } else { 
      return bernoulli_lpmf(0 | zi) +  
             binomial_logit_lpmf(y | trials, eta); 
    } 
  }
  /* zero-inflated binomial log-PDF of a single response 
   * logit parameterization of the binomial part
   * logit parameterization of the zero-inflation part
   * Args: 
   *   y: the response value 
   *   trials: number of trials of the binomial part
   *   eta: linear predictor for binomial part 
   *   zi: linear predictor for zero-inflation part 
   * Returns:  
   *   a scalar to be added to the log posterior 
   */ 
  real zero_inflated_binomial_blogit_logit_lpmf(int y, int trials, 
                                                real eta, real zi) {
    if (y == 0) { 
      return log_sum_exp(bernoulli_logit_lpmf(1 | zi), 
                         bernoulli_logit_lpmf(0 | zi) + 
                         binomial_logit_lpmf(0 | trials, eta)); 
    } else { 
      return bernoulli_logit_lpmf(0 | zi) +  
             binomial_logit_lpmf(y | trials, eta); 
    } 
  }
  // zero-inflated binomial log-CCDF and log-CDF functions
  real zero_inflated_binomial_lccdf(int y, int trials, real theta, real zi) { 
    return bernoulli_lpmf(0 | zi) + binomial_lccdf(y | trials, theta); 
  }
  real zero_inflated_binomial_lcdf(int y, int trials, real theta, real zi) { 
    return log1m_exp(zero_inflated_binomial_lccdf(y | trials, theta, zi));
  }
}
data {
  int<lower=1> N;  // total number of observations
  int Y[N];  // response variable
  int trials[N];  // number of trials
  int<lower=1> K;  // number of population-level effects
  matrix[N, K] X;  // population-level design matrix
  int<lower=1> K_zi;  // number of population-level effects
  matrix[N, K_zi] X_zi;  // population-level design matrix
  // data for group-level effects of ID 1
  int<lower=1> N_1;  // number of grouping levels
  int<lower=1> M_1;  // number of coefficients per level
  int<lower=1> J_1[N];  // grouping indicator per observation
  // group-level predictor values
  vector[N] Z_1_1;
  vector[N] Z_1_2;
  vector[N] Z_1_3;
  int<lower=1> NC_1;  // number of group-level correlations
  // data for group-level effects of ID 2
  int<lower=1> N_2;  // number of grouping levels
  int<lower=1> M_2;  // number of coefficients per level
  int<lower=1> J_2[N];  // grouping indicator per observation
  // group-level predictor values
  vector[N] Z_2_zi_1;
  vector[N] Z_2_zi_2;
  vector[N] Z_2_zi_3;
  int<lower=1> NC_2;  // number of group-level correlations
  int prior_only;  // should the likelihood be ignored?
}
transformed data {
  int Kc_zi = K_zi - 1;
  matrix[N, Kc_zi] Xc_zi;  // centered version of X_zi without an intercept
  vector[Kc_zi] means_X_zi;  // column means of X_zi before centering
  for (i in 2:K_zi) {
    means_X_zi[i - 1] = mean(X_zi[, i]);
    Xc_zi[, i - 1] = X_zi[, i] - means_X_zi[i - 1];
  }
}
parameters {
  vector[K] b;  // population-level effects
  vector[Kc_zi] b_zi;  // population-level effects
  real Intercept_zi;  // temporary intercept for centered predictors
  vector<lower=0>[M_1] sd_1;  // group-level standard deviations
  matrix[M_1, N_1] z_1;  // standardized group-level effects
  cholesky_factor_corr[M_1] L_1;  // cholesky factor of correlation matrix
  vector<lower=0>[M_2] sd_2;  // group-level standard deviations
  matrix[M_2, N_2] z_2;  // standardized group-level effects
  cholesky_factor_corr[M_2] L_2;  // cholesky factor of correlation matrix
}
transformed parameters {
  matrix[N_1, M_1] r_1;  // actual group-level effects
  // using vectors speeds up indexing in loops
  vector[N_1] r_1_1;
  vector[N_1] r_1_2;
  vector[N_1] r_1_3;
  matrix[N_2, M_2] r_2;  // actual group-level effects
  // using vectors speeds up indexing in loops
  vector[N_2] r_2_zi_1;
  vector[N_2] r_2_zi_2;
  vector[N_2] r_2_zi_3;
  // compute actual group-level effects
  r_1 = scale_r_cor(z_1, sd_1, L_1);
  r_1_1 = r_1[, 1];
  r_1_2 = r_1[, 2];
  r_1_3 = r_1[, 3];
  // compute actual group-level effects
  r_2 = scale_r_cor(z_2, sd_2, L_2);
  r_2_zi_1 = r_2[, 1];
  r_2_zi_2 = r_2[, 2];
  r_2_zi_3 = r_2[, 3];
}
model {
  // likelihood not including constants
  if (!prior_only) {
    // initialize linear predictor term
    vector[N] mu = X * b;
    // initialize linear predictor term
    vector[N] zi = Intercept_zi + Xc_zi * b_zi;
    for (n in 1:N) {
      // add more terms to the linear predictor
      mu[n] += r_1_1[J_1[n]] * Z_1_1[n] + r_1_2[J_1[n]] * Z_1_2[n] + r_1_3[J_1[n]] * Z_1_3[n];
    }
    for (n in 1:N) {
      // add more terms to the linear predictor
      zi[n] += r_2_zi_1[J_2[n]] * Z_2_zi_1[n] + r_2_zi_2[J_2[n]] * Z_2_zi_2[n] + r_2_zi_3[J_2[n]] * Z_2_zi_3[n];
    }
    for (n in 1:N) {
      target += zero_inflated_binomial_blogit_logit_lpmf(Y[n] | trials[n], mu[n], zi[n]);
    }
  }
  // priors not including constants
  target += normal_lupdf(b[1] | 0,1);
  target += normal_lupdf(b[2] | 0,1);
  target += normal_lupdf(b[3] | 0,1);
  target += normal_lupdf(b[4] | 0,1);
  target += normal_lupdf(b[5] | 0,1);
  target += logistic_lupdf(Intercept_zi | 0, 1);
  target += normal_lupdf(sd_1 | 0,1);
  target += std_normal_lupdf(to_vector(z_1));
  target += lkj_corr_cholesky_lupdf(L_1 | 1);
  target += student_t_lupdf(sd_2 | 3, 0, 2.5);
  target += std_normal_lupdf(to_vector(z_2));
  target += lkj_corr_cholesky_lupdf(L_2 | 1);
}
generated quantities {
   vector[N] mu = X * b;
   vector[N] zi = Intercept_zi + Xc_zi * b_zi;
   vector[N] mu_pred;
   vector[N] zi_pred;
   vector[N] Y_pred; 
   vector[N] theta; 
   theta[N] = uniform_rng(0,1); 
  // actual population-level intercept
  real b_zi_Intercept = Intercept_zi - dot_product(means_X_zi, b_zi);
  // compute group-level correlations
  corr_matrix[M_1] Cor_1 = multiply_lower_tri_self_transpose(L_1);
  vector<lower=-1,upper=1>[NC_1] cor_1;
  // compute group-level correlations
  corr_matrix[M_2] Cor_2 = multiply_lower_tri_self_transpose(L_2);
  vector<lower=-1,upper=1>[NC_2] cor_2;
  // extract upper diagonal of correlation matrix
  for (k in 1:M_1) {
    for (j in 1:(k - 1)) {
      cor_1[choose(k - 1, 2) + j] = Cor_1[j, k];
    }
  }
  // extract upper diagonal of correlation matrix
  for (k in 1:M_2) {
    for (j in 1:(k - 1)) {
      cor_2[choose(k - 1, 2) + j] = Cor_2[j, k];
    }
  }
  for (n in 1:N) {
      mu_pred[n] = mu[n] + r_1_1[J_1[n]] * Z_1_1[n] + r_1_2[J_1[n]] * Z_1_2[n] + r_1_3[J_1[n]] * Z_1_3[n];
      
      zi_pred[n] = zi[n] +r_2_zi_1[J_2[n]] * Z_2_zi_1[n] + r_2_zi_2[J_2[n]] * Z_2_zi_2[n] + r_2_zi_3[J_2[n]] * Z_2_zi_3[n];
   if (theta[n] < inv_logit(zi_pred[n])) Y_pred[n] = 0;
     else
    Y_pred[n] = binomial_rng(trials[n], inv_logit(mu_pred[n]));
    }
}
"

Compile and fit

# Compile stan program 
Cmd_Mod <- cmdstan_model(stan_file = write_stan_file(Stan_Code), 
                         compile = TRUE)

# Sample from compiled model 
Fit <-Cmd_Mod$sample(
        data = standata(zi_Mod),       
        iter_warmup = 1000,            
        iter_sampling = 2000,         
        chains = 4,  
        max_treedepth = 14, 
        adapt_delta = 0.99)

Look at accuracy

# Get samples of Y_pred
pred_samples <- draws_of(as_draws_rvars(Fit$draws())$Y_pred)

# Plot
cbind(Data, as_tibble(t(pred_samples))) %>%
        pivot_longer(`1`:`8000`, names_to = ".draw", values_to = "Pred") %>% 
        mutate(.draw = as.numeric(.draw)) %>%
        ungroup() %>%
        mutate(Correct = ifelse(Response == Pred, 1, 0)) %>%
        group_by(ID, Group, Type, .draw) %>%
        summarize(Accuracy = sum(Correct)/n()) %>%
        ungroup() %>%
        group_by(ID, Group, Type) %>%
        
        point_interval(Accuracy,
                       .width = 0.89,
                       .point = median,
                       .interval = hdci,
                       .simple_names = TRUE,
                       na.rm = TRUE) %>%
        ungroup() %>%
        
        ggplot(aes(ID, Accuracy)) +
        facet_grid(Group ~ Type, scales = "free_x") +
        geom_point(position = position_dodge(width = 0.5)) + 
        geom_errorbar(aes(
                ymin = .lower,
                ymax = .upper),
                width = 0.2,
                position = position_dodge(width = 0.5)) +
        scale_x_discrete("ID") +
        scale_y_continuous("Accuracy", labels = scales::percent_format(accuracy = 1)) +
        theme_bw() +
        theme_bw() +
        theme(
                axis.text.x = element_text(angle = 60, hjust = 1, size = 10),
                axis.text.y = element_text(size = 10),
                axis.title.x = element_text(size = 12),
                axis.title.y = element_text(size = 12),
                plot.title = element_text(hjust = 0.5),
                panel.border = element_rect(fill = NA, colour = "black"),
                panel.grid.major = element_blank(),
                panel.grid.minor = element_blank(),
                axis.line = element_line(colour = "black"),
                strip.text.x = element_text(size = 12),
                strip.text.y = element_text(size = 12),
                strip.background = element_blank(),
                legend.title = element_blank(),
                legend.text = element_text(size = 10),
                legend.position = c(0.95, 0.05),
                legend.key = element_rect(fill = "transparent"),
                legend.direction = "horizontal",
                legend.background = element_rect(fill = "transparent", size = 6) 
        )

The right-hand facet looks comparable, but the left-hand facet is way off.

Is the problem in how I’m calculating the predicted responses?

JLC · December 6, 2021, 6:29pm

The problem was in how I was creating theta in generated quantities.

generated quantities {
   vector[N] mu = X * b;
   vector[N] zi = Intercept_zi + Xc_zi * b_zi;
   vector[N] mu_pred;
   vector[N] zi_pred;
   vector[N] Y_pred; 
   vector[N] theta; 
  // theta[N] = uniform_rng(0,1); 
  // actual population-level intercept
  real b_zi_Intercept = Intercept_zi - dot_product(means_X_zi, b_zi);
  // compute group-level correlations
  corr_matrix[M_1] Cor_1 = multiply_lower_tri_self_transpose(L_1);
  vector<lower=-1,upper=1>[NC_1] cor_1;
  // compute group-level correlations
  corr_matrix[M_2] Cor_2 = multiply_lower_tri_self_transpose(L_2);
  vector<lower=-1,upper=1>[NC_2] cor_2;
  // extract upper diagonal of correlation matrix
  for (k in 1:M_1) {
    for (j in 1:(k - 1)) {
      cor_1[choose(k - 1, 2) + j] = Cor_1[j, k];
    }
  }
  // extract upper diagonal of correlation matrix
  for (k in 1:M_2) {
    for (j in 1:(k - 1)) {
      cor_2[choose(k - 1, 2) + j] = Cor_2[j, k];
    }
  }
  for (n in 1:N) {
      theta[n] = uniform_rng(0,1);
      mu_pred[n] = mu[n] + r_1_1[J_1[n]] * Z_1_1[n] + r_1_2[J_1[n]] * Z_1_2[n] + r_1_3[J_1[n]] * Z_1_3[n];
      zi_pred[n] = zi[n] + r_2_zi_1[J_2[n]] * Z_2_zi_1[n] + r_2_zi_2[J_2[n]] * Z_2_zi_2[n] + r_2_zi_3[J_2[n]] * Z_2_zi_3[n];
      
   if (theta[n] < inv_logit(zi_pred[n])) Y_pred[n] = 0;
     else
    Y_pred[n] = binomial_rng(trials[n], inv_logit(mu_pred[n]));
    }
}

Topic		Replies	Views
Mathematical Notation for a zero inflated negative binomial model in brms Modeling math	13	1998	March 26, 2021
Model specification of a zero-inflated negative binomial random effect model using brms brms specification	4	854	July 2, 2021
Predicting from a zero inflated negative binomial model Modeling specification	2	833	May 23, 2023
Question on betabinomial family Modeling brms	25	802	September 6, 2023
Zero inflated beta binomial model under brms brms	1	1616	February 19, 2020

Help predicting from zero-inflated binomial

Fit an empty brms model

Related topics

Fit an empty `brms` model