PPC for simple logistic regression in Stan

Dear all,

Can anyone out there share a snippet of code that computes a posterior predictive check for logistic regression? Nothing fancy: I have a binary outcome (of course) and some predictors.

Thanks in advance,


generated quantities {
  // Replicated outcomes; newer Stan versions spell this `array[N] int y_rep;`
  int y_rep[N];
  for (n in 1:N)
    y_rep[n] = bernoulli_logit_rng(alpha + X[n, ] * beta);
}

Then compare the y_rep draws with the observed outcomes. The bayesplot package for R has, for example, a ppc_error_binned function to plot such a comparison.
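If you would rather compute a comparison by hand, here is a rough sketch of a binned-error summary in plain Python. It is not a reimplementation of ppc_error_binned; the function name `binned_errors`, the equal-size binning, and the assumption that y_rep has already been extracted from the fit as a list of draws (each a list of N zeros and ones) are all mine.

```python
def binned_errors(y, y_rep, n_bins=2):
    """Average (observed - predicted) within bins of predicted probability.

    y      : observed 0/1 outcomes, length N
    y_rep  : posterior draws, each a list of N replicated 0/1 outcomes
    Returns a list of (mean predicted probability, mean error) per bin.
    """
    n_obs = len(y)
    n_draws = len(y_rep)

    # Predicted probability per observation: average of y_rep over draws.
    p_pred = [sum(draw[n] for draw in y_rep) / n_draws for n in range(n_obs)]

    # Sort observations by predicted probability, then cut into equal-size bins.
    order = sorted(range(n_obs), key=lambda n: p_pred[n])
    bins = [order[b * n_obs // n_bins:(b + 1) * n_obs // n_bins]
            for b in range(n_bins)]

    result = []
    for idx in bins:
        if not idx:  # skip empty bins when n_bins > N
            continue
        mean_pred = sum(p_pred[n] for n in idx) / len(idx)
        mean_err = sum(y[n] - p_pred[n] for n in idx) / len(idx)
        result.append((mean_pred, mean_err))
    return result
```

Mean errors that stay close to zero in every bin suggest the model is reasonably calibrated; a systematic drift across bins points to misfit.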

A key feature of PPCs that is often overlooked is a summary statistic that maps the observations to something amenable to visualization. The raw 0s and 1s are hard to visualize effectively, so I prefer to build the PPC around the empirical probability of success over the observations:

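generated quantities {
  // Proportion of 1s in one replicated data set
  real p_hat_ppc = 0;

  // Note: X is indexed here as M predictors by N observations
  for (n in 1:N) {
    int y_ppc = bernoulli_logit_rng(X[1:M, n]' * beta + alpha);
    p_hat_ppc = p_hat_ppc + y_ppc;
  }
  p_hat_ppc = p_hat_ppc / N;
}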
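Once the model has run, the draws of p_hat_ppc can be set against the observed proportion. A minimal sketch in plain Python, assuming the draws have already been pulled out of the fit into a list; the function name `ppc_tail_probability` is hypothetical and the extraction step is not shown.

```python
def ppc_tail_probability(y, p_hat_ppc):
    """Compare the observed proportion of 1s with its posterior predictive draws.

    y          : observed 0/1 outcomes
    p_hat_ppc  : one replicated proportion per posterior draw
    Returns (observed proportion, share of draws >= observed proportion).
    """
    p_hat_obs = sum(y) / len(y)
    n_at_or_above = sum(1 for p in p_hat_ppc if p >= p_hat_obs)
    return p_hat_obs, n_at_or_above / len(p_hat_ppc)


# Toy numbers only, to show the call; real draws come from the fitted model.
p_obs, tail = ppc_tail_probability([0, 1, 1, 0], [0.25, 0.5, 0.75, 1.0])
```

A tail share near 0 or 1 means the observed proportion sits in the extremes of the predictive distribution. In practice a histogram of the draws with the observed value marked is usually more informative than the single number.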