# Help predicting from zero-inflated binomial

I am working on a zero-inflated binomial model with tremendous assistance from `brms`. My data are single trial response vs no-response, but there are a lot of zeros.

I’m trying to fit the model within `cmdstanr`, which is fine as-is, but I would also like to do some posterior predictive checks.

`brms` uses some custom functions for this model

``````/* zero-inflated binomial log-PDF of a single response
* Args:
*   y: the response value
*   trials: number of trials of the binomial part
*   theta: probability parameter of the binomial part
*   zi: zero-inflation probability
* Returns:
*   a scalar to be added to the log posterior
*/
real zero_inflated_binomial_lpmf(int y, int trials,
real theta, real zi) {
if (y == 0) {
return log_sum_exp(bernoulli_lpmf(1 | zi),
bernoulli_lpmf(0 | zi) +
binomial_lpmf(0 | trials, theta));
} else {
return bernoulli_lpmf(0 | zi) +
binomial_lpmf(y | trials, theta);
}
}
/* zero-inflated binomial log-PDF of a single response
* logit parameterization of the zero-inflation part
* Args:
*   y: the response value
*   trials: number of trials of the binomial part
*   theta: probability parameter of the binomial part
*   zi: linear predictor for zero-inflation part
* Returns:
*   a scalar to be added to the log posterior
*/
real zero_inflated_binomial_logit_lpmf(int y, int trials,
real theta, real zi) {
if (y == 0) {
return log_sum_exp(bernoulli_logit_lpmf(1 | zi),
bernoulli_logit_lpmf(0 | zi) +
binomial_lpmf(0 | trials, theta));
} else {
return bernoulli_logit_lpmf(0 | zi) +
binomial_lpmf(y | trials, theta);
}
}
/* zero-inflated binomial log-PDF of a single response
* logit parameterization of the binomial part
* Args:
*   y: the response value
*   trials: number of trials of the binomial part
*   eta: linear predictor for binomial part
*   zi: zero-inflation probability
* Returns:
*   a scalar to be added to the log posterior
*/
real zero_inflated_binomial_blogit_lpmf(int y, int trials,
real eta, real zi) {
if (y == 0) {
return log_sum_exp(bernoulli_lpmf(1 | zi),
bernoulli_lpmf(0 | zi) +
binomial_logit_lpmf(0 | trials, eta));
} else {
return bernoulli_lpmf(0 | zi) +
binomial_logit_lpmf(y | trials, eta);
}
}
/* zero-inflated binomial log-PDF of a single response
* logit parameterization of the binomial part
* logit parameterization of the zero-inflation part
* Args:
*   y: the response value
*   trials: number of trials of the binomial part
*   eta: linear predictor for binomial part
*   zi: linear predictor for zero-inflation part
* Returns:
*   a scalar to be added to the log posterior
*/
real zero_inflated_binomial_blogit_logit_lpmf(int y, int trials,
real eta, real zi) {
if (y == 0) {
return log_sum_exp(bernoulli_logit_lpmf(1 | zi),
bernoulli_logit_lpmf(0 | zi) +
binomial_logit_lpmf(0 | trials, eta));
} else {
return bernoulli_logit_lpmf(0 | zi) +
binomial_logit_lpmf(y | trials, eta);
}
}
// zero-inflated binomial log-CCDF and log-CDF functions
real zero_inflated_binomial_lccdf(int y, int trials, real theta, real zi) {
return bernoulli_lpmf(0 | zi) + binomial_lccdf(y | trials, theta);
}
real zero_inflated_binomial_lcdf(int y, int trials, real theta, real zi) {
return log1m_exp(zero_inflated_binomial_lccdf(y | trials, theta, zi));
}

``````

I want to add some predictions into the generated quantities block, so I’m trying to define a `zero_inflated_binomial_blogit_rng` function, but what I’m doing doesn’t seem to be working.

`````` real zero_inflated_binomial_blogit_rng(int y, int trials,
real eta, real zi) {
if (y == 0) {
return log_sum_exp(bernoulli_logit_lpmf(1 | zi),
bernoulli_logit_lpmf(0 | zi) +
binomial_logit_lpmf(0 | trials, eta));
}
if (zi < 0 ) {
return bernoulli_rng(0) +
binomial_rng(trials, inv_logit(eta));
}
if (zi > 1 ) {
return bernoulli_rng(1) +
binomial_rng(trials, inv_logit(eta));
}
else {
return bernoulli_rng(zi) +
binomial_rng(trials, inv_logit(eta));
}
}
``````

And this to generated quantities:

``````for (n in 1:N) {
if (zi_pred[n] < 0) Y_pred[n] = zero_inflated_binomial_blogit_rng(Y[n], trials[n], mu_pred[n], 0);
if (zi_pred[n] > 1) Y_pred[n] = zero_inflated_binomial_blogit_rng(Y[n], trials[n], mu_pred[n], 1);
else
Y_pred[n] = zero_inflated_binomial_blogit_rng(Y[n], trials[n], mu_pred[n], zi_pred[n]);
}

``````

But as defined in the function, this is giving me reals rather than binary predictions.

Can anyone suggest how I can create something to give me predicted 1’s and 0’s?

I looked through how `brms` handles predictions for these models and found this:

``````posterior_predict_zero_inflated_binomial <- function(i, prep, ...) {
# theta is the bernoulii zero-inflation parameter
theta <- get_dpar(prep, "zi", i = i)
trials <- prep\$data\$trials[i]
prob <- get_dpar(prep, "mu", i = i)
ndraws <- prep\$ndraws
# compare with theta to incorporate the zero-inflation process
zi <- runif(ndraws, 0, 1)
ifelse(zi < theta, 0, rbinom(ndraws, size = trials, prob = prob))
}
``````

So I added the following to generated quantities:

``````  theta[N] = uniform_rng(0,1);
for (n in 1:N) {
// add more terms to the linear predictor
mu_pred[n] += r_1_1[J_1[n]] * Z_1_1[n] + r_1_2[J_1[n]] * Z_1_2[n] + r_1_3[J_1[n]] * Z_1_3[n] + r_1_4[J_1[n]] * Z_1_4[n] + r_1_5[J_1[n]] * Z_1_5[n];
// add more terms to the linear predictor
zi_pred[n] += r_2_zi_1[J_2[n]] * Z_2_zi_1[n] + r_2_zi_2[J_2[n]] * Z_2_zi_2[n] + r_2_zi_3[J_2[n]] * Z_2_zi_3[n] + r_2_zi_4[J_2[n]] * Z_2_zi_4[n] + r_2_zi_5[J_2[n]] * Z_2_zi_5[n];
if (theta[n] < zi_pred[n]) Y_pred[n] = 0;
else
Y_pred[n] = binomial_rng(trials[n], inv_logit(mu_pred[n]));
}
``````

But, the predictions are very different from using `posterior_predict`.

Can anyone spot what I’m doing wrong?

Why not just set `backend = "cmdstanr"` in `brm` and use brms’ convivence functions to do the predictions after the fact in R?

I’m trying to get a better understanding of how it’s working by working through the process, so I can make changes to the generated Stan code.

1 Like

I’m processing the predictions two ways using draws from the fitted model

``````pred_samples <- draws_of(as_draws_rvars(Fit\$draws())\$Y_pred)
prob <- draws_of(as_draws_rvars(Fit\$draws())\$prob_pred)
theta <- draws_of(as_draws_rvars(Fit\$draws())\$theta_pred)
``````

I’m looking at Y_Pred directly

or processing the `prob` and `theta` parameters to calculate a prediction

``````Preds <- cbind(as_tibble(t(prob)) %>%
pivot_longer(`1`:`8000`, names_to = ".draw", values_to = "prob") %>%
mutate(.draw = as.numeric(.draw),
prob = plogis(prob)) %>%
ungroup(),
as_tibble(t(theta)) %>%
pivot_longer(`1`:`8000`, names_to = ".draw", values_to = "theta") %>%
mutate(.draw = as.numeric(.draw),
theta = plogis(theta)) %>%
ungroup())
``````
``````Predictions <- Preds  %>%
mutate(zi = runif(nrow(.), 0, 1),
Pred = ifelse(zi < theta , 0, rbinom(1, size = 1, prob = prob)))
``````

@paul.buerkner can you see where I’m going wrong?

I’m using the logit link for both mu and zi

``````functions {
/* compute correlated group-level effects
* Args:
*   z: matrix of unscaled group-level effects
*   SD: vector of standard deviation parameters
*   L: cholesky factor correlation matrix
* Returns:
*   matrix of scaled group-level effects
*/
matrix scale_r_cor(matrix z, vector SD, matrix L) {
// r is stored in another dimension order than z
return transpose(diag_pre_multiply(SD, L) * z);
}
/* zero-inflated binomial log-PDF of a single response
* Args:
*   y: the response value
*   trials: number of trials of the binomial part
*   theta: probability parameter of the binomial part
*   zi: zero-inflation probability
* Returns:
*   a scalar to be added to the log posterior
*/
real zero_inflated_binomial_lpmf(int y, int trials,
real theta, real zi) {
if (y == 0) {
return log_sum_exp(bernoulli_lpmf(1 | zi),
bernoulli_lpmf(0 | zi) +
binomial_lpmf(0 | trials, theta));
} else {
return bernoulli_lpmf(0 | zi) +
binomial_lpmf(y | trials, theta);
}
}
/* zero-inflated binomial log-PDF of a single response
* logit parameterization of the zero-inflation part
* Args:
*   y: the response value
*   trials: number of trials of the binomial part
*   theta: probability parameter of the binomial part
*   zi: linear predictor for zero-inflation part
* Returns:
*   a scalar to be added to the log posterior
*/
real zero_inflated_binomial_logit_lpmf(int y, int trials,
real theta, real zi) {
if (y == 0) {
return log_sum_exp(bernoulli_logit_lpmf(1 | zi),
bernoulli_logit_lpmf(0 | zi) +
binomial_lpmf(0 | trials, theta));
} else {
return bernoulli_logit_lpmf(0 | zi) +
binomial_lpmf(y | trials, theta);
}
}
/* zero-inflated binomial log-PDF of a single response
* logit parameterization of the binomial part
* Args:
*   y: the response value
*   trials: number of trials of the binomial part
*   eta: linear predictor for binomial part
*   zi: zero-inflation probability
* Returns:
*   a scalar to be added to the log posterior
*/
real zero_inflated_binomial_blogit_lpmf(int y, int trials,
real eta, real zi) {
if (y == 0) {
return log_sum_exp(bernoulli_lpmf(1 | zi),
bernoulli_lpmf(0 | zi) +
binomial_logit_lpmf(0 | trials, eta));
} else {
return bernoulli_lpmf(0 | zi) +
binomial_logit_lpmf(y | trials, eta);
}
}
/* zero-inflated binomial log-PDF of a single response
* logit parameterization of the binomial part
* logit parameterization of the zero-inflation part
* Args:
*   y: the response value
*   trials: number of trials of the binomial part
*   eta: linear predictor for binomial part
*   zi: linear predictor for zero-inflation part
* Returns:
*   a scalar to be added to the log posterior
*/
real zero_inflated_binomial_blogit_logit_lpmf(int y, int trials,
real eta, real zi) {
if (y == 0) {
return log_sum_exp(bernoulli_logit_lpmf(1 | zi),
bernoulli_logit_lpmf(0 | zi) +
binomial_logit_lpmf(0 | trials, eta));
} else {
return bernoulli_logit_lpmf(0 | zi) +
binomial_logit_lpmf(y | trials, eta);
}
}
// zero-inflated binomial log-CCDF and log-CDF functions
real zero_inflated_binomial_lccdf(int y, int trials, real theta, real zi) {
return bernoulli_lpmf(0 | zi) + binomial_lccdf(y | trials, theta);
}
real zero_inflated_binomial_lcdf(int y, int trials, real theta, real zi) {
return log1m_exp(zero_inflated_binomial_lccdf(y | trials, theta, zi));
}
}
data {
int<lower=1> N;  // total number of observations
int Y[N];  // response variable
int trials[N];  // number of trials
int<lower=1> K;  // number of population-level effects
matrix[N, K] X;  // population-level design matrix
int<lower=1> K_zi;  // number of population-level effects
matrix[N, K_zi] X_zi;  // population-level design matrix
// data for group-level effects of ID 1
int<lower=1> N_1;  // number of grouping levels
int<lower=1> M_1;  // number of coefficients per level
int<lower=1> J_1[N];  // grouping indicator per observation
// group-level predictor values
vector[N] Z_1_1;
vector[N] Z_1_2;
vector[N] Z_1_3;
vector[N] Z_1_4;
vector[N] Z_1_5;
int<lower=1> NC_1;  // number of group-level correlations
// data for group-level effects of ID 2
int<lower=1> N_2;  // number of grouping levels
int<lower=1> M_2;  // number of coefficients per level
int<lower=1> J_2[N];  // grouping indicator per observation
// group-level predictor values
vector[N] Z_2_zi_1;
vector[N] Z_2_zi_2;
vector[N] Z_2_zi_3;
vector[N] Z_2_zi_4;
vector[N] Z_2_zi_5;
int<lower=1> NC_2;  // number of group-level correlations
int prior_only;  // should the likelihood be ignored?
}
transformed data {
int Kc = K - 1;
matrix[N, Kc] Xc;  // centered version of X without an intercept
vector[Kc] means_X;  // column means of X before centering
int Kc_zi = K_zi - 1;
matrix[N, Kc_zi] Xc_zi;  // centered version of X_zi without an intercept
vector[Kc_zi] means_X_zi;  // column means of X_zi before centering
for (i in 2:K) {
means_X[i - 1] = mean(X[, i]);
Xc[, i - 1] = X[, i] - means_X[i - 1];
}
for (i in 2:K_zi) {
means_X_zi[i - 1] = mean(X_zi[, i]);
Xc_zi[, i - 1] = X_zi[, i] - means_X_zi[i - 1];
}
}
parameters {
vector[Kc] b;  // population-level effects
real Intercept;  // temporary intercept for centered predictors
vector[Kc_zi] b_zi;  // population-level effects
real Intercept_zi;  // temporary intercept for centered predictors
vector<lower=0>[M_1] sd_1;  // group-level standard deviations
matrix[M_1, N_1] z_1;  // standardized group-level effects
cholesky_factor_corr[M_1] L_1;  // cholesky factor of correlation matrix
vector<lower=0>[M_2] sd_2;  // group-level standard deviations
matrix[M_2, N_2] z_2;  // standardized group-level effects
cholesky_factor_corr[M_2] L_2;  // cholesky factor of correlation matrix
}
transformed parameters {
matrix[N_1, M_1] r_1;  // actual group-level effects
// using vectors speeds up indexing in loops
vector[N_1] r_1_1;
vector[N_1] r_1_2;
vector[N_1] r_1_3;
vector[N_1] r_1_4;
vector[N_1] r_1_5;
matrix[N_2, M_2] r_2;  // actual group-level effects
// using vectors speeds up indexing in loops
vector[N_2] r_2_zi_1;
vector[N_2] r_2_zi_2;
vector[N_2] r_2_zi_3;
vector[N_2] r_2_zi_4;
vector[N_2] r_2_zi_5;
// compute actual group-level effects
r_1 = scale_r_cor(z_1, sd_1, L_1);
r_1_1 = r_1[, 1];
r_1_2 = r_1[, 2];
r_1_3 = r_1[, 3];
r_1_4 = r_1[, 4];
r_1_5 = r_1[, 5];
// compute actual group-level effects
r_2 = scale_r_cor(z_2, sd_2, L_2);
r_2_zi_1 = r_2[, 1];
r_2_zi_2 = r_2[, 2];
r_2_zi_3 = r_2[, 3];
r_2_zi_4 = r_2[, 4];
r_2_zi_5 = r_2[, 5];
}
model {
// likelihood not including constants
if (!prior_only) {
// initialize linear predictor term
vector[N] mu = Intercept + Xc * b;
// initialize linear predictor term
vector[N] zi = Intercept_zi + Xc_zi * b_zi;
for (n in 1:N) {
// add more terms to the linear predictor
mu[n] += r_1_1[J_1[n]] * Z_1_1[n] + r_1_2[J_1[n]] * Z_1_2[n] + r_1_3[J_1[n]] * Z_1_3[n] + r_1_4[J_1[n]] * Z_1_4[n] + r_1_5[J_1[n]] * Z_1_5[n];
}
for (n in 1:N) {
// add more terms to the linear predictor
zi[n] += r_2_zi_1[J_2[n]] * Z_2_zi_1[n] + r_2_zi_2[J_2[n]] * Z_2_zi_2[n] + r_2_zi_3[J_2[n]] * Z_2_zi_3[n] + r_2_zi_4[J_2[n]] * Z_2_zi_4[n] + r_2_zi_5[J_2[n]] * Z_2_zi_5[n];
}
for (n in 1:N) {
target += zero_inflated_binomial_blogit_logit_lpmf(Y[n] | trials[n], mu[n], zi[n]);
}
}
// priors not including constants
target += normal_lupdf(b | 0,1);
target += normal_lupdf(Intercept | 0,1);
target += logistic_lupdf(Intercept_zi | 0, 1);
target += exponential_lupdf(sd_1 | 1);
target += std_normal_lupdf(to_vector(z_1));
target += lkj_corr_cholesky_lupdf(L_1 | 1);
target += student_t_lupdf(sd_2 | 3, 0, 2.5);
target += std_normal_lupdf(to_vector(z_2));
target += lkj_corr_cholesky_lupdf(L_2 | 1);
}
generated quantities {
vector[N] prob_pred = Intercept + Xc * b;
vector[N] theta_pred = Intercept_zi + Xc_zi * b_zi;
vector[N] Y_pred;
vector[N] zi;
zi[N] = uniform_rng(0,1);
// actual population-level intercept
real b_Intercept = Intercept - dot_product(means_X, b);
// actual population-level intercept
real b_zi_Intercept = Intercept_zi - dot_product(means_X_zi, b_zi);
// compute group-level correlations
corr_matrix[M_1] Cor_1 = multiply_lower_tri_self_transpose(L_1);
vector<lower=-1,upper=1>[NC_1] cor_1;
// compute group-level correlations
corr_matrix[M_2] Cor_2 = multiply_lower_tri_self_transpose(L_2);
vector<lower=-1,upper=1>[NC_2] cor_2;
// extract upper diagonal of correlation matrix
for (k in 1:M_1) {
for (j in 1:(k - 1)) {
cor_1[choose(k - 1, 2) + j] = Cor_1[j, k];
}
}
// extract upper diagonal of correlation matrix
for (k in 1:M_2) {
for (j in 1:(k - 1)) {
cor_2[choose(k - 1, 2) + j] = Cor_2[j, k];
}
}
for (n in 1:N) {
// add more terms to the linear predictor
prob_pred[n] += r_1_1[J_1[n]] * Z_1_1[n] + r_1_2[J_1[n]] * Z_1_2[n] + r_1_3[J_1[n]] * Z_1_3[n] + r_1_4[J_1[n]] * Z_1_4[n] + r_1_5[J_1[n]] * Z_1_5[n];
// add more terms to the linear predictor
theta_pred[n] += r_2_zi_1[J_2[n]] * Z_2_zi_1[n] + r_2_zi_2[J_2[n]] * Z_2_zi_2[n] + r_2_zi_3[J_2[n]] * Z_2_zi_3[n] + r_2_zi_4[J_2[n]] * Z_2_zi_4[n] + r_2_zi_5[J_2[n]] * Z_2_zi_5[n];
if (zi[n] < theta_pred[n]) Y_pred[n] = 0;
else
Y_pred[n] = binomial_rng(trials[n], inv_logit(prob_pred[n]));
}
}

``````

In my latest attempt, I move the `mu` and `zi` parameters to the transformed parameters block, so I could recover those directly:

``````transformed parameters {
matrix[N_1, M_1] r_1;  // actual group-level effects
// using vectors speeds up indexing in loops
vector[N_1] r_1_1;
vector[N_1] r_1_2;
vector[N_1] r_1_3;
vector[N_1] r_1_4;
vector[N_1] r_1_5;
matrix[N_2, M_2] r_2;  // actual group-level effects
// using vectors speeds up indexing in loops
vector[N_2] r_2_zi_1;
vector[N_2] r_2_zi_2;
vector[N_2] r_2_zi_3;
vector[N_2] r_2_zi_4;
vector[N_2] r_2_zi_5;
// compute actual group-level effects
r_1 = scale_r_cor(z_1, sd_1, L_1);
r_1_1 = r_1[, 1];
r_1_2 = r_1[, 2];
r_1_3 = r_1[, 3];
r_1_4 = r_1[, 4];
r_1_5 = r_1[, 5];
// compute actual group-level effects
r_2 = scale_r_cor(z_2, sd_2, L_2);
r_2_zi_1 = r_2[, 1];
r_2_zi_2 = r_2[, 2];
r_2_zi_3 = r_2[, 3];
r_2_zi_4 = r_2[, 4];
r_2_zi_5 = r_2[, 5];
vector[N] mu = X * b;
// initialize linear predictor term
vector[N] zi = Intercept_zi + Xc_zi * b_zi;
for (n in 1:N) {
// add more terms to the linear predictor
mu[n] += r_1_1[J_1[n]] * Z_1_1[n] + r_1_2[J_1[n]] * Z_1_2[n] + r_1_3[J_1[n]] * Z_1_3[n] + r_1_4[J_1[n]] * Z_1_4[n] + r_1_5[J_1[n]] * Z_1_5[n];
}
for (n in 1:N) {
// add more terms to the linear predictor
zi[n] += r_2_zi_1[J_2[n]] * Z_2_zi_1[n] + r_2_zi_2[J_2[n]] * Z_2_zi_2[n] + r_2_zi_3[J_2[n]] * Z_2_zi_3[n] + r_2_zi_4[J_2[n]] * Z_2_zi_4[n] + r_2_zi_5[J_2[n]] * Z_2_zi_5[n];
}
}

``````

Then I tried calculating the predictions using draws of those parameters:

``````Preds <-
spread_draws(Ideal_Fit, c(mu, zi)[rownum], ndraws = 250) %>%
group_by(rownum) %>%
mutate(mu = plogis(mu),
zi = plogis(zi),
theta = runif(250, 0,1)) %>%
ungroup() %>%
mutate(Pred = ifelse(theta < zi, 0, rbinom(1, size = 1, prob = mu)))

``````

But still the predictions are way off. Whereas if I fit the same model with `brms` and use `add_predicted_draws`, the predictions are much, much better.

I’m at a loss.

I tried first fitting everything in `brms` then taking the `stancode` and `standata` from the fitted model and putting those back in to an empty `brms` model:

``````Mod\$fit <- read_stan_csv(cmd_Fit\$output_files())

Mod <- rename_pars(Mod)

``````

Then using `add_predicted_draws` from `tidybayes` on both the `brms` model and the one with the outside fit inserted in to the `brms` object. The results are completely different.

I’m not sure what is happening. Up next I’m trying the `rstan` backend to see if that gives me reproducible results.

The problem persists with `rstan`

It looks like you’re subsetting draws from the posterior without setting the rng seed to a constant value. You either need to set the seed or use all of the posterior draws from the model otherwise you’ll get different predictions every time you run the code.

Also, be sure to set the `seed` argument in your call to `brm` or `cmdstanr::model\$sample` to ensure reproducibility of the model itself.

1 Like

Here is a reprex:

Data.csv (38.0 KB)

``````# List of needed packages
Pkgs <- c("tidyverse",  "parallel", "cmdstanr",
"brms", "tidybayes", "tidytext")

lapply(Pkgs, require, c = T)

## Set computing options
ncores = detectCores()

Data <- read.csv(file = here("data", "Data.csv")) %>%
mutate(ID = factor(ID),
Group = factor(Group),
Condition = factor(Condition),
Type = factor(Type))
``````

`brms` model

``````## brms zero inflated binomial formula
zi_fmla <- bf(Response | trials(1) ~ 1 +
Group +
Condition +
Type +
Condition:Type +
(1 + Condition + Type | ID),
center = FALSE,
zi ~ Type + (1 + Condition + Type | ID)
)

## Set priors
zi_priors <- c(
set_prior("normal(0,1)",
class = "b",
coef = "Intercept"),

set_prior("normal(0,1)",
class = "b"),

set_prior("normal(0,1)",
class = "sd")
)

# Fit model in brms
zi_Mod <- brm(zi_fmla,
Data,
family = zero_inflated_binomial(),
prior = zi_priors,
inits = "random",
iter = 2000,
warmup = 1000,
chains = 4,
cores = ncores,
backend = "cmdstan",
normalize = FALSE,
max_treedepth = 14)
)
``````

Look at accuracy

``````Data %>%
value = "Pred",
re_formula = NULL) %>%
ungroup() %>%
mutate(Correct = ifelse(Response == Pred, 1, 0)) %>%
group_by(ID, Group, Type, .draw) %>%
summarize(Accuracy = sum(Correct)/n()) %>%
ungroup() %>%
group_by(ID, Group, Type) %>%

point_interval(Accuracy,
.width = 0.89,
.point = median,
.interval = hdci,
.simple_names = TRUE,
na.rm = TRUE) %>%
ungroup() %>%

ggplot(aes(reorder_within(ID, -Accuracy, Group), Accuracy)) +
facet_grid(Group ~ Type, scales = "free_x") +
geom_point(position = position_dodge(width = 0.5)) +
geom_errorbar(aes(
ymin = .lower,
ymax = .upper),
width = 0.2,
position = position_dodge(width = 0.5)) +
scale_x_reordered("ID") +
scale_y_continuous("Accuracy", labels = scales::percent_format(accuracy = 1)) +
theme_bw() +
theme_bw() +
theme(
axis.text.x = element_text(angle = 60, hjust = 1, size = 10),
axis.text.y = element_text(size = 10),
axis.title.x = element_text(size = 12),
axis.title.y = element_text(size = 12),
plot.title = element_text(hjust = 0.5),
panel.border = element_rect(fill = NA, colour = "black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.line = element_line(colour = "black"),
strip.text.x = element_text(size = 12),
strip.text.y = element_text(size = 12),
strip.background = element_blank(),
legend.title = element_blank(),
legend.text = element_text(size = 10),
legend.position = c(0.95, 0.05),
legend.key = element_rect(fill = "transparent"),
legend.direction = "horizontal",
legend.background = element_rect(fill = "transparent", size = 6)
)

``````

Now fit the model with Stan code generated by `stancode(zi_Mod)`

``````# Model code
Stan_Code <- "functions {
/* compute correlated group-level effects
* Args:
*   z: matrix of unscaled group-level effects
*   SD: vector of standard deviation parameters
*   L: cholesky factor correlation matrix
* Returns:
*   matrix of scaled group-level effects
*/
matrix scale_r_cor(matrix z, vector SD, matrix L) {
// r is stored in another dimension order than z
return transpose(diag_pre_multiply(SD, L) * z);
}
/* zero-inflated binomial log-PDF of a single response
* Args:
*   y: the response value
*   trials: number of trials of the binomial part
*   theta: probability parameter of the binomial part
*   zi: zero-inflation probability
* Returns:
*   a scalar to be added to the log posterior
*/
real zero_inflated_binomial_lpmf(int y, int trials,
real theta, real zi) {
if (y == 0) {
return log_sum_exp(bernoulli_lpmf(1 | zi),
bernoulli_lpmf(0 | zi) +
binomial_lpmf(0 | trials, theta));
} else {
return bernoulli_lpmf(0 | zi) +
binomial_lpmf(y | trials, theta);
}
}
/* zero-inflated binomial log-PDF of a single response
* logit parameterization of the zero-inflation part
* Args:
*   y: the response value
*   trials: number of trials of the binomial part
*   theta: probability parameter of the binomial part
*   zi: linear predictor for zero-inflation part
* Returns:
*   a scalar to be added to the log posterior
*/
real zero_inflated_binomial_logit_lpmf(int y, int trials,
real theta, real zi) {
if (y == 0) {
return log_sum_exp(bernoulli_logit_lpmf(1 | zi),
bernoulli_logit_lpmf(0 | zi) +
binomial_lpmf(0 | trials, theta));
} else {
return bernoulli_logit_lpmf(0 | zi) +
binomial_lpmf(y | trials, theta);
}
}
/* zero-inflated binomial log-PDF of a single response
* logit parameterization of the binomial part
* Args:
*   y: the response value
*   trials: number of trials of the binomial part
*   eta: linear predictor for binomial part
*   zi: zero-inflation probability
* Returns:
*   a scalar to be added to the log posterior
*/
real zero_inflated_binomial_blogit_lpmf(int y, int trials,
real eta, real zi) {
if (y == 0) {
return log_sum_exp(bernoulli_lpmf(1 | zi),
bernoulli_lpmf(0 | zi) +
binomial_logit_lpmf(0 | trials, eta));
} else {
return bernoulli_lpmf(0 | zi) +
binomial_logit_lpmf(y | trials, eta);
}
}
/* zero-inflated binomial log-PDF of a single response
* logit parameterization of the binomial part
* logit parameterization of the zero-inflation part
* Args:
*   y: the response value
*   trials: number of trials of the binomial part
*   eta: linear predictor for binomial part
*   zi: linear predictor for zero-inflation part
* Returns:
*   a scalar to be added to the log posterior
*/
real zero_inflated_binomial_blogit_logit_lpmf(int y, int trials,
real eta, real zi) {
if (y == 0) {
return log_sum_exp(bernoulli_logit_lpmf(1 | zi),
bernoulli_logit_lpmf(0 | zi) +
binomial_logit_lpmf(0 | trials, eta));
} else {
return bernoulli_logit_lpmf(0 | zi) +
binomial_logit_lpmf(y | trials, eta);
}
}
// zero-inflated binomial log-CCDF and log-CDF functions
real zero_inflated_binomial_lccdf(int y, int trials, real theta, real zi) {
return bernoulli_lpmf(0 | zi) + binomial_lccdf(y | trials, theta);
}
real zero_inflated_binomial_lcdf(int y, int trials, real theta, real zi) {
return log1m_exp(zero_inflated_binomial_lccdf(y | trials, theta, zi));
}
}
data {
int<lower=1> N;  // total number of observations
int Y[N];  // response variable
int trials[N];  // number of trials
int<lower=1> K;  // number of population-level effects
matrix[N, K] X;  // population-level design matrix
int<lower=1> K_zi;  // number of population-level effects
matrix[N, K_zi] X_zi;  // population-level design matrix
// data for group-level effects of ID 1
int<lower=1> N_1;  // number of grouping levels
int<lower=1> M_1;  // number of coefficients per level
int<lower=1> J_1[N];  // grouping indicator per observation
// group-level predictor values
vector[N] Z_1_1;
vector[N] Z_1_2;
vector[N] Z_1_3;
int<lower=1> NC_1;  // number of group-level correlations
// data for group-level effects of ID 2
int<lower=1> N_2;  // number of grouping levels
int<lower=1> M_2;  // number of coefficients per level
int<lower=1> J_2[N];  // grouping indicator per observation
// group-level predictor values
vector[N] Z_2_zi_1;
vector[N] Z_2_zi_2;
vector[N] Z_2_zi_3;
int<lower=1> NC_2;  // number of group-level correlations
int prior_only;  // should the likelihood be ignored?
}
transformed data {
int Kc_zi = K_zi - 1;
matrix[N, Kc_zi] Xc_zi;  // centered version of X_zi without an intercept
vector[Kc_zi] means_X_zi;  // column means of X_zi before centering
for (i in 2:K_zi) {
means_X_zi[i - 1] = mean(X_zi[, i]);
Xc_zi[, i - 1] = X_zi[, i] - means_X_zi[i - 1];
}
}
parameters {
vector[K] b;  // population-level effects
vector[Kc_zi] b_zi;  // population-level effects
real Intercept_zi;  // temporary intercept for centered predictors
vector<lower=0>[M_1] sd_1;  // group-level standard deviations
matrix[M_1, N_1] z_1;  // standardized group-level effects
cholesky_factor_corr[M_1] L_1;  // cholesky factor of correlation matrix
vector<lower=0>[M_2] sd_2;  // group-level standard deviations
matrix[M_2, N_2] z_2;  // standardized group-level effects
cholesky_factor_corr[M_2] L_2;  // cholesky factor of correlation matrix
}
transformed parameters {
matrix[N_1, M_1] r_1;  // actual group-level effects
// using vectors speeds up indexing in loops
vector[N_1] r_1_1;
vector[N_1] r_1_2;
vector[N_1] r_1_3;
matrix[N_2, M_2] r_2;  // actual group-level effects
// using vectors speeds up indexing in loops
vector[N_2] r_2_zi_1;
vector[N_2] r_2_zi_2;
vector[N_2] r_2_zi_3;
// compute actual group-level effects
r_1 = scale_r_cor(z_1, sd_1, L_1);
r_1_1 = r_1[, 1];
r_1_2 = r_1[, 2];
r_1_3 = r_1[, 3];
// compute actual group-level effects
r_2 = scale_r_cor(z_2, sd_2, L_2);
r_2_zi_1 = r_2[, 1];
r_2_zi_2 = r_2[, 2];
r_2_zi_3 = r_2[, 3];
}
model {
// likelihood not including constants
if (!prior_only) {
// initialize linear predictor term
vector[N] mu = X * b;
// initialize linear predictor term
vector[N] zi = Intercept_zi + Xc_zi * b_zi;
for (n in 1:N) {
// add more terms to the linear predictor
mu[n] += r_1_1[J_1[n]] * Z_1_1[n] + r_1_2[J_1[n]] * Z_1_2[n] + r_1_3[J_1[n]] * Z_1_3[n];
}
for (n in 1:N) {
// add more terms to the linear predictor
zi[n] += r_2_zi_1[J_2[n]] * Z_2_zi_1[n] + r_2_zi_2[J_2[n]] * Z_2_zi_2[n] + r_2_zi_3[J_2[n]] * Z_2_zi_3[n];
}
for (n in 1:N) {
target += zero_inflated_binomial_blogit_logit_lpmf(Y[n] | trials[n], mu[n], zi[n]);
}
}
// priors not including constants
target += normal_lupdf(b[1] | 0,1);
target += normal_lupdf(b[2] | 0,1);
target += normal_lupdf(b[3] | 0,1);
target += normal_lupdf(b[4] | 0,1);
target += normal_lupdf(b[5] | 0,1);
target += logistic_lupdf(Intercept_zi | 0, 1);
target += normal_lupdf(sd_1 | 0,1);
target += std_normal_lupdf(to_vector(z_1));
target += lkj_corr_cholesky_lupdf(L_1 | 1);
target += student_t_lupdf(sd_2 | 3, 0, 2.5);
target += std_normal_lupdf(to_vector(z_2));
target += lkj_corr_cholesky_lupdf(L_2 | 1);
}
generated quantities {
// actual population-level intercept
real b_zi_Intercept = Intercept_zi - dot_product(means_X_zi, b_zi);
// compute group-level correlations
corr_matrix[M_1] Cor_1 = multiply_lower_tri_self_transpose(L_1);
vector<lower=-1,upper=1>[NC_1] cor_1;
// compute group-level correlations
corr_matrix[M_2] Cor_2 = multiply_lower_tri_self_transpose(L_2);
vector<lower=-1,upper=1>[NC_2] cor_2;
// extract upper diagonal of correlation matrix
for (k in 1:M_1) {
for (j in 1:(k - 1)) {
cor_1[choose(k - 1, 2) + j] = Cor_1[j, k];
}
}
// extract upper diagonal of correlation matrix
for (k in 1:M_2) {
for (j in 1:(k - 1)) {
cor_2[choose(k - 1, 2) + j] = Cor_2[j, k];
}
}
}
"

``````

Compile and fit

``````# Compile cmdstan program
Cmd_Mod <- cmdstan_model(stan_file = write_stan_file(Stan_Code),
compile = TRUE)

## Sample from model
Fit <-Cmd_Mod\$sample(
data = standata(zi_Mod),
iter_warmup = 1000,
iter_sampling = 2000,
chains = 4,
max_treedepth = 14,

``````

# Fit an empty `brms` model

``````empty_brms_Mod <- brm(bf(Response | trials(1) ~ 1 +
Group +
Condition +
Type +
Condition:Type +
(1 + Condition + Type | ID),
center = FALSE,
zi ~ Type + (1 + Condition + Type | ID)),
Data,
family = zero_inflated_binomial(),
iter = 2000,
warmup = 1000,
chains = 4,
cores = ncores,
empty = TRUE,
backend = "cmdstan",
normalize = FALSE,
max_treedepth = 14)
)
``````

Replace the `fit` slot with the rstan fit

``````empty_brms_Mod\$fit <- read_stan_csv(Fit\$output_files())

empty_brms_Mod <- rename_pars(empty_brms_Mod)

``````

Look at accuracy

``````Data %>%
value = "Pred",
re_formula = NULL) %>%
ungroup() %>%
mutate(Correct = ifelse(Response == Pred, 1, 0)) %>%
group_by(ID, Group, Type, .draw) %>%
summarize(Accuracy = sum(Correct)/n()) %>%
ungroup() %>%
group_by(ID, Group, Type) %>%

point_interval(Accuracy,
.width = 0.89,
.point = median,
.interval = hdci,
.simple_names = TRUE,
na.rm = TRUE) %>%
ungroup() %>%

ggplot(aes(reorder_within(ID, -Accuracy, Group), Accuracy)) +
facet_grid(Group ~ Type, scales = "free_x") +
geom_point(position = position_dodge(width = 0.5)) +
geom_errorbar(aes(
ymin = .lower,
ymax = .upper),
width = 0.2,
position = position_dodge(width = 0.5)) +
scale_x_reordered("ID") +
scale_y_continuous("Accuracy", labels = scales::percent_format(accuracy = 1)) +
theme_bw() +
theme_bw() +
theme(
axis.text.x = element_text(angle = 60, hjust = 1, size = 10),
axis.text.y = element_text(size = 10),
axis.title.x = element_text(size = 12),
axis.title.y = element_text(size = 12),
plot.title = element_text(hjust = 0.5),
panel.border = element_rect(fill = NA, colour = "black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.line = element_line(colour = "black"),
strip.text.x = element_text(size = 12),
strip.text.y = element_text(size = 12),
strip.background = element_blank(),
legend.title = element_blank(),
legend.text = element_text(size = 10),
legend.position = c(0.95, 0.05),
legend.key = element_rect(fill = "transparent"),
legend.direction = "horizontal",
legend.background = element_rect(fill = "transparent", size = 6)
)
``````

I’d like to use generated quantities rather than fitting an empty model and replacing the fit slot.

When I try with generated quantities, I get a very different set of predictions:

``````## Model code
Stan_Code <- "functions {
/* compute correlated group-level effects
* Args:
*   z: matrix of unscaled group-level effects
*   SD: vector of standard deviation parameters
*   L: cholesky factor correlation matrix
* Returns:
*   matrix of scaled group-level effects
*/
matrix scale_r_cor(matrix z, vector SD, matrix L) {
// r is stored in another dimension order than z
return transpose(diag_pre_multiply(SD, L) * z);
}
/* zero-inflated binomial log-PDF of a single response
* Args:
*   y: the response value
*   trials: number of trials of the binomial part
*   theta: probability parameter of the binomial part
*   zi: zero-inflation probability
* Returns:
*   a scalar to be added to the log posterior
*/
real zero_inflated_binomial_lpmf(int y, int trials,
real theta, real zi) {
if (y == 0) {
return log_sum_exp(bernoulli_lpmf(1 | zi),
bernoulli_lpmf(0 | zi) +
binomial_lpmf(0 | trials, theta));
} else {
return bernoulli_lpmf(0 | zi) +
binomial_lpmf(y | trials, theta);
}
}
/* zero-inflated binomial log-PDF of a single response
* logit parameterization of the zero-inflation part
* Args:
*   y: the response value
*   trials: number of trials of the binomial part
*   theta: probability parameter of the binomial part
*   zi: linear predictor for zero-inflation part
* Returns:
*   a scalar to be added to the log posterior
*/
real zero_inflated_binomial_logit_lpmf(int y, int trials,
real theta, real zi) {
if (y == 0) {
return log_sum_exp(bernoulli_logit_lpmf(1 | zi),
bernoulli_logit_lpmf(0 | zi) +
binomial_lpmf(0 | trials, theta));
} else {
return bernoulli_logit_lpmf(0 | zi) +
binomial_lpmf(y | trials, theta);
}
}
/* zero-inflated binomial log-PDF of a single response
* logit parameterization of the binomial part
* Args:
*   y: the response value
*   trials: number of trials of the binomial part
*   eta: linear predictor for binomial part
*   zi: zero-inflation probability
* Returns:
*   a scalar to be added to the log posterior
*/
real zero_inflated_binomial_blogit_lpmf(int y, int trials,
real eta, real zi) {
if (y == 0) {
return log_sum_exp(bernoulli_lpmf(1 | zi),
bernoulli_lpmf(0 | zi) +
binomial_logit_lpmf(0 | trials, eta));
} else {
return bernoulli_lpmf(0 | zi) +
binomial_logit_lpmf(y | trials, eta);
}
}
/* zero-inflated binomial log-PDF of a single response
* logit parameterization of the binomial part
* logit parameterization of the zero-inflation part
* Args:
*   y: the response value
*   trials: number of trials of the binomial part
*   eta: linear predictor for binomial part
*   zi: linear predictor for zero-inflation part
* Returns:
*   a scalar to be added to the log posterior
*/
real zero_inflated_binomial_blogit_logit_lpmf(int y, int trials,
real eta, real zi) {
if (y == 0) {
return log_sum_exp(bernoulli_logit_lpmf(1 | zi),
bernoulli_logit_lpmf(0 | zi) +
binomial_logit_lpmf(0 | trials, eta));
} else {
return bernoulli_logit_lpmf(0 | zi) +
binomial_logit_lpmf(y | trials, eta);
}
}
// zero-inflated binomial log-CCDF and log-CDF functions
real zero_inflated_binomial_lccdf(int y, int trials, real theta, real zi) {
return bernoulli_lpmf(0 | zi) + binomial_lccdf(y | trials, theta);
}
real zero_inflated_binomial_lcdf(int y, int trials, real theta, real zi) {
return log1m_exp(zero_inflated_binomial_lccdf(y | trials, theta, zi));
}
}
data {
int<lower=1> N;  // total number of observations
int Y[N];  // response variable
int trials[N];  // number of trials
int<lower=1> K;  // number of population-level effects
matrix[N, K] X;  // population-level design matrix
int<lower=1> K_zi;  // number of population-level effects
matrix[N, K_zi] X_zi;  // population-level design matrix
// data for group-level effects of ID 1
int<lower=1> N_1;  // number of grouping levels
int<lower=1> M_1;  // number of coefficients per level
int<lower=1> J_1[N];  // grouping indicator per observation
// group-level predictor values
vector[N] Z_1_1;
vector[N] Z_1_2;
vector[N] Z_1_3;
int<lower=1> NC_1;  // number of group-level correlations
// data for group-level effects of ID 2
int<lower=1> N_2;  // number of grouping levels
int<lower=1> M_2;  // number of coefficients per level
int<lower=1> J_2[N];  // grouping indicator per observation
// group-level predictor values
vector[N] Z_2_zi_1;
vector[N] Z_2_zi_2;
vector[N] Z_2_zi_3;
int<lower=1> NC_2;  // number of group-level correlations
int prior_only;  // should the likelihood be ignored?
}
transformed data {
int Kc_zi = K_zi - 1;
matrix[N, Kc_zi] Xc_zi;  // centered version of X_zi without an intercept
vector[Kc_zi] means_X_zi;  // column means of X_zi before centering
for (i in 2:K_zi) {
means_X_zi[i - 1] = mean(X_zi[, i]);
Xc_zi[, i - 1] = X_zi[, i] - means_X_zi[i - 1];
}
}
parameters {
vector[K] b;  // population-level effects
vector[Kc_zi] b_zi;  // population-level effects
real Intercept_zi;  // temporary intercept for centered predictors
vector<lower=0>[M_1] sd_1;  // group-level standard deviations
matrix[M_1, N_1] z_1;  // standardized group-level effects
cholesky_factor_corr[M_1] L_1;  // cholesky factor of correlation matrix
vector<lower=0>[M_2] sd_2;  // group-level standard deviations
matrix[M_2, N_2] z_2;  // standardized group-level effects
cholesky_factor_corr[M_2] L_2;  // cholesky factor of correlation matrix
}
transformed parameters {
matrix[N_1, M_1] r_1;  // actual group-level effects
// using vectors speeds up indexing in loops
vector[N_1] r_1_1;
vector[N_1] r_1_2;
vector[N_1] r_1_3;
matrix[N_2, M_2] r_2;  // actual group-level effects
// using vectors speeds up indexing in loops
vector[N_2] r_2_zi_1;
vector[N_2] r_2_zi_2;
vector[N_2] r_2_zi_3;
// compute actual group-level effects
r_1 = scale_r_cor(z_1, sd_1, L_1);
r_1_1 = r_1[, 1];
r_1_2 = r_1[, 2];
r_1_3 = r_1[, 3];
// compute actual group-level effects
r_2 = scale_r_cor(z_2, sd_2, L_2);
r_2_zi_1 = r_2[, 1];
r_2_zi_2 = r_2[, 2];
r_2_zi_3 = r_2[, 3];
}
model {
// likelihood not including constants
if (!prior_only) {
// initialize linear predictor term
vector[N] mu = X * b;
// initialize linear predictor term
vector[N] zi = Intercept_zi + Xc_zi * b_zi;
for (n in 1:N) {
// add more terms to the linear predictor
mu[n] += r_1_1[J_1[n]] * Z_1_1[n] + r_1_2[J_1[n]] * Z_1_2[n] + r_1_3[J_1[n]] * Z_1_3[n];
}
for (n in 1:N) {
// add more terms to the linear predictor
zi[n] += r_2_zi_1[J_2[n]] * Z_2_zi_1[n] + r_2_zi_2[J_2[n]] * Z_2_zi_2[n] + r_2_zi_3[J_2[n]] * Z_2_zi_3[n];
}
for (n in 1:N) {
target += zero_inflated_binomial_blogit_logit_lpmf(Y[n] | trials[n], mu[n], zi[n]);
}
}
// priors not including constants
target += normal_lupdf(b[1] | 0,1);
target += normal_lupdf(b[2] | 0,1);
target += normal_lupdf(b[3] | 0,1);
target += normal_lupdf(b[4] | 0,1);
target += normal_lupdf(b[5] | 0,1);
target += logistic_lupdf(Intercept_zi | 0, 1);
target += normal_lupdf(sd_1 | 0,1);
target += std_normal_lupdf(to_vector(z_1));
target += lkj_corr_cholesky_lupdf(L_1 | 1);
target += student_t_lupdf(sd_2 | 3, 0, 2.5);
target += std_normal_lupdf(to_vector(z_2));
target += lkj_corr_cholesky_lupdf(L_2 | 1);
}
generated quantities {
vector[N] mu = X * b;
vector[N] zi = Intercept_zi + Xc_zi * b_zi;
vector[N] mu_pred;
vector[N] zi_pred;
vector[N] Y_pred;
vector[N] theta;
theta[N] = uniform_rng(0,1);
// actual population-level intercept
real b_zi_Intercept = Intercept_zi - dot_product(means_X_zi, b_zi);
// compute group-level correlations
corr_matrix[M_1] Cor_1 = multiply_lower_tri_self_transpose(L_1);
vector<lower=-1,upper=1>[NC_1] cor_1;
// compute group-level correlations
corr_matrix[M_2] Cor_2 = multiply_lower_tri_self_transpose(L_2);
vector<lower=-1,upper=1>[NC_2] cor_2;
// extract upper diagonal of correlation matrix
for (k in 1:M_1) {
for (j in 1:(k - 1)) {
cor_1[choose(k - 1, 2) + j] = Cor_1[j, k];
}
}
// extract upper diagonal of correlation matrix
for (k in 1:M_2) {
for (j in 1:(k - 1)) {
cor_2[choose(k - 1, 2) + j] = Cor_2[j, k];
}
}
for (n in 1:N) {
mu_pred[n] = mu[n] + r_1_1[J_1[n]] * Z_1_1[n] + r_1_2[J_1[n]] * Z_1_2[n] + r_1_3[J_1[n]] * Z_1_3[n];

zi_pred[n] = zi[n] +r_2_zi_1[J_2[n]] * Z_2_zi_1[n] + r_2_zi_2[J_2[n]] * Z_2_zi_2[n] + r_2_zi_3[J_2[n]] * Z_2_zi_3[n];
if (theta[n] < inv_logit(zi_pred[n])) Y_pred[n] = 0;
else
Y_pred[n] = binomial_rng(trials[n], inv_logit(mu_pred[n]));
}
}
"

``````

Compile and fit

``````# Compile stan program
Cmd_Mod <- cmdstan_model(stan_file = write_stan_file(Stan_Code),
compile = TRUE)

# Sample from compiled model
Fit <-Cmd_Mod\$sample(
data = standata(zi_Mod),
iter_warmup = 1000,
iter_sampling = 2000,
chains = 4,
max_treedepth = 14,

``````

Look at accuracy

``````# Get samples of Y_pred
pred_samples <- draws_of(as_draws_rvars(Fit\$draws())\$Y_pred)

# Plot
cbind(Data, as_tibble(t(pred_samples))) %>%
pivot_longer(`1`:`8000`, names_to = ".draw", values_to = "Pred") %>%
mutate(.draw = as.numeric(.draw)) %>%
ungroup() %>%
mutate(Correct = ifelse(Response == Pred, 1, 0)) %>%
group_by(ID, Group, Type, .draw) %>%
summarize(Accuracy = sum(Correct)/n()) %>%
ungroup() %>%
group_by(ID, Group, Type) %>%

point_interval(Accuracy,
.width = 0.89,
.point = median,
.interval = hdci,
.simple_names = TRUE,
na.rm = TRUE) %>%
ungroup() %>%

ggplot(aes(ID, Accuracy)) +
facet_grid(Group ~ Type, scales = "free_x") +
geom_point(position = position_dodge(width = 0.5)) +
geom_errorbar(aes(
ymin = .lower,
ymax = .upper),
width = 0.2,
position = position_dodge(width = 0.5)) +
scale_x_discrete("ID") +
scale_y_continuous("Accuracy", labels = scales::percent_format(accuracy = 1)) +
theme_bw() +
theme_bw() +
theme(
axis.text.x = element_text(angle = 60, hjust = 1, size = 10),
axis.text.y = element_text(size = 10),
axis.title.x = element_text(size = 12),
axis.title.y = element_text(size = 12),
plot.title = element_text(hjust = 0.5),
panel.border = element_rect(fill = NA, colour = "black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.line = element_line(colour = "black"),
strip.text.x = element_text(size = 12),
strip.text.y = element_text(size = 12),
strip.background = element_blank(),
legend.title = element_blank(),
legend.text = element_text(size = 10),
legend.position = c(0.95, 0.05),
legend.key = element_rect(fill = "transparent"),
legend.direction = "horizontal",
legend.background = element_rect(fill = "transparent", size = 6)
)

``````

The right-hand facet looks comparable, but the left-hand facet is way off.

Is the problem in how I’m calculating the predicted responses?

The problem was in how I was creating `theta` in generated quantities.

``````generated quantities {
vector[N] mu = X * b;
vector[N] zi = Intercept_zi + Xc_zi * b_zi;
vector[N] mu_pred;
vector[N] zi_pred;
vector[N] Y_pred;
vector[N] theta;
// theta[N] = uniform_rng(0,1);
// actual population-level intercept
real b_zi_Intercept = Intercept_zi - dot_product(means_X_zi, b_zi);
// compute group-level correlations
corr_matrix[M_1] Cor_1 = multiply_lower_tri_self_transpose(L_1);
vector<lower=-1,upper=1>[NC_1] cor_1;
// compute group-level correlations
corr_matrix[M_2] Cor_2 = multiply_lower_tri_self_transpose(L_2);
vector<lower=-1,upper=1>[NC_2] cor_2;
// extract upper diagonal of correlation matrix
for (k in 1:M_1) {
for (j in 1:(k - 1)) {
cor_1[choose(k - 1, 2) + j] = Cor_1[j, k];
}
}
// extract upper diagonal of correlation matrix
for (k in 1:M_2) {
for (j in 1:(k - 1)) {
cor_2[choose(k - 1, 2) + j] = Cor_2[j, k];
}
}
for (n in 1:N) {
theta[n] = uniform_rng(0,1);
mu_pred[n] = mu[n] + r_1_1[J_1[n]] * Z_1_1[n] + r_1_2[J_1[n]] * Z_1_2[n] + r_1_3[J_1[n]] * Z_1_3[n];
zi_pred[n] = zi[n] + r_2_zi_1[J_2[n]] * Z_2_zi_1[n] + r_2_zi_2[J_2[n]] * Z_2_zi_2[n] + r_2_zi_3[J_2[n]] * Z_2_zi_3[n];

if (theta[n] < inv_logit(zi_pred[n])) Y_pred[n] = 0;
else
Y_pred[n] = binomial_rng(trials[n], inv_logit(mu_pred[n]));
}
}
``````

1 Like