# Bernoulli variables in generated quantities causing tail ESS warning?

I may have stumbled upon a scenario that falsely triggers the Tail ESS warning. Here’s some R code to create fake data for a logistic regression:

``````
library(rstan)

set.seed(0)
B0 <- 1
B1 <- 0.5
N <- 100

x <- rnorm(N)
y <- rbinom(N, 1, exp(B0 + B1*x)/(exp(B0 + B1*x) + 1))
``````

I use the following model, which includes simulated Bernoulli draws (`y_rep`) for posterior predictive checks:

``````
data {
  int<lower = 0> N;
  vector[N] x;
  int<lower = 0> y[N];
}
parameters {
  real B0;
  real B1;
}
model {
  y ~ bernoulli_logit(B0 + B1 * x);
}
generated quantities {
  int<lower = 0> y_rep[N];
  for (n in 1:N) {
    y_rep[n] = bernoulli_logit_rng(B0 + B1 * x[n]);
  }
}
``````

And I run the model:

``````
# stanmodelcode: character string containing the Stan program above
dat <- list(N = N, x = x, y = y)
fit <- stan(model_code = stanmodelcode, data = dat, seed = 0)
``````

I get the familiar warning message about Tail ESS:

Warning message:
Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.

When I check `n_eff` in the summary of `fit` and `Tail_ESS` from `monitor`, all of the reported ESS and Rhat values look healthy. And when I remove the generated quantities block, the model runs with no warnings.

I’m aware that a constant value, either as a parameter or in the generated quantities block, will mess up the Rhat and ESS calculations, triggering a warning. But none of the `y_rep` variables are all 0s or 1s.

My guess is that since `y_rep` is binary, certain diagnostics can’t be calculated, such as `MCSE_Q50` or the `MCSE_Q75`, which are `NA` in the `monitor` function. Note that running `ess_tail` on any of the `y_rep` indices also produces `NA`.


Hi,
Thanks for the clear code example. I don’t get any warnings with rstan_2.21.2.
What version are you using?

Yes, this is correct. Tail-ESS diagnoses the indicator sequences

``````
I <- theta <= quantile(theta, 0.05)
``````

and

``````
I <- theta <= quantile(theta, 0.95)
``````

For binary data, it’s likely that `quantile(theta, 0.95)` is 1, in which case the corresponding `I` is constant and ESS returns NA. The `ess_tail` function returns the minimum of the 0.05 and 0.95 tail-ESS values, which is then NA.
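To see this concretely, here is a minimal base-R sketch. It is not rstan's internal code; `theta` is a hypothetical stand-in for the draws of a single `y_rep[n]`, simulated here with an assumed success probability of 0.7, and the names `I05` and `I95` are just labels for the two indicator sequences.

```r
set.seed(1)

# Hypothetical stand-in for the posterior draws of one y_rep[n]
# (assumption: 4000 draws, success probability 0.7)
theta <- rbinom(4000, 1, 0.7)

# The two indicator sequences that tail-ESS is based on
I05 <- theta <= quantile(theta, 0.05)
I95 <- theta <= quantile(theta, 0.95)

table(I05)            # mixes TRUE and FALSE
table(I95)            # every entry is TRUE
var(as.numeric(I95))  # 0: a constant sequence, so ESS is undefined
```

With these draws the 0.95 quantile is 1, so `I95` carries no information and any ESS estimate on it is undefined, which is where the NA comes from.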

Quantiles are not very useful for a binary variable (there is no tail), so I would think that NA for tail-ESS is fine. We may want to think about what to do for other discrete variables with a small number of observed states.


I’m using rstan 2.19.3

That explains the difference in warnings, as we fixed the later version to not warn about NAs.
