# Posterior predictive checking

I am doing on Bayesian analysis and want to check posterior predictive using ppc_dens_overlay(y, y_hat[1:200,]) and got the following figure. So, do you think that our model is reasonable good?

Without knowing more, I would say no. The model does not seem to recover y appropriately.

2 Likes

If you’re trying to model count data then try the negative-Binomial instead (a Poisson needs the mean to equal the variance).

3 Likes

According to your comment I have tried the negative binomial instead of Poisson using the following command:

data {
int<lower=1> N;
int <lower=2> y[N];
matrix[N,3] x;
vector[N] zi;
real<lower=0> zmi;
}

parameters {
real<lower=0> ci;
vector[3] beta;
real alpha;
vector<lower=0, upper=1>[N] pi;
real<lower=0> phi[N];
}

transformed parameters{
vector[N] mu[2];
mu[1] = exp(alpha + x*beta);
mu[2] = exp((alpha + x*beta)+ci);
}

model {
beta ~ normal(0,3);
alpha ~ normal(0,3);
ci ~ normal(0,3);
pi ~ beta(zi,zmi);
phi ~ cauchy(0,5);

for(i in 1:N)
target += log_mix(pi[i],
neg_binomial_2_lpmf(y[i]|mu[1,i], phi[i]) - neg_binomial_2_lccdf(1|mu[1,i], phi[i]),
neg_binomial_2_lpmf(y[i]|mu[2,i], phi[i]) - neg_binomial_2_lccdf(1|mu[2,i], phi[i]));
}

generated quantities{
int y_hat[N];
for(i in 1:N) {
int which = bernoulli_rng(pi[i]);
y_hat[i] = neg_binomial_2_rng(mu[which + 1, i], phi[i]);
}
}

It is very slow and got the following error.

Chain 1: Exception: grad_2F1: k (internal counter) exceeded 100000 iterations, hypergeometric function gradient did not converge.  (in 'model41e4d3310fe7e_ae6acbde7c6fdc5a09596af5fc24ee41' at line 55)

[1] "Error in sampler\$call_sampler(args_list[[i]]) : "
[2] "  Exception: grad_2F1: k (internal counter) exceeded 100000 iterations, hypergeometric function gradient did not converge.  (in 'model41e4d3310fe7e_ae6acbde7c6fdc5a09596af5fc24ee41' at line 55)"

FYI: The model is truncated at (y>1)

Hmm, never seen that error msg. I guess it might have to do with you truncating a negbinomial to >1. Perhaps @Max_Mantei could clarify.

1 Like

I’ve never seen this error before. I guess it’s coming from neg_binomial_2_lccdf, which apparently has not-so-nice gradients.

While being a nice distribution for modeling, the negative binomial can sometimes be a pain to fit in my experience. Maybe it’s a stupid idea, but what if you define y_ast = y - 2 and then model y^*?

Also, I think it’s no big deal, but \mu_2 = \exp((\alpha + X\beta)+ci) = \exp(\alpha + X\beta)\exp(ci) = \mu_1 \cdot \exp(ci), right?

Putting these two ideas together you could try

data {
...
}
transformed data{
int y_ast[N]
for (n in 1:N)
y_ast[n] = y[n] - 2;
}
parameters {
...
}
transformed parameters{
vector[N] mu = exp(alpha + x*beta);
}
model {

...

for(i in 1:N)
target += log_mix(pi[i],
neg_binomial_2_lpmf(y_ast[i]|mu[i], phi[i]),
neg_binomial_2_lpmf(y_ast[i]|mu[i]*exp(ci), phi[i]));
}

generated quantities{
int y_hat[N];
for(i in 1:N) {
int which = bernoulli_rng(pi[i]);
real mu_hat = which == 1 ? mu[i]*exp(ci) : mu[i];
y_hat[i] = 2 + neg_binomial_2_rng(mu_hat, phi[i]);
}
}

Just some quick thoughts…

1 Like

@torkar @Max_Mantei I incorporated the comments and the errors are gone. But, still I got lot of mu's with NA value. I think some values are over estimated. What are the possible reasons of it?

The result of Poisson and Negative Binomial are posted here. Therefore, do you think that the model recovers y?

If you have \mu with NAs you have a problem. Hmm… I have to think about this. And let’s see what @Max_Mantei has to say.

1 Like

Hm, it’s a bit weird that you get NAs and I agree with @torkar that that’s a problem. However, the results you posted seem reasonable. Both models are fairly similar: The \beta s are very similar for both models. For the NegBinom ci is centered around 1 and in the Poisson model it is considerably larger (mean around 2.4), which indicates that in the Poisson case ci acts as a kind of overdispersion parameter.

What are the summaries for phi and, more interestingly, for pi?

Also, I think you can use bayesplot to check on more that just the mean of y_rep. The variance would be interesting to look at. You can also simulate data for specific (representative or otherwise interesting) values of x and see if they line up with the true y.

As for the NA s… Hm, did you get error messages or warnings? The Poisson and NegBinom are known to overlow sometimes. It’s a bit puzzling that you got NAs, since inference on y_rep looks fine… Can you post the full model codes again (for both models) with the data that you’ve used?

Cheers!

1 Like

in my understanding, it inflated the value of mu and the _rng value become very large and it is above the Stan boundary. Also, getting such kind of value is not logical in my study (in the data the maximum value is 11). And therefore, when I compile the results of mu's as a mean of iteration, it will give NA. In short, the problems are in the large value of mu's.

when I run the command, I got the following error message in Poisson and Negative Binomial respectively.

Chain 1: Exception: poisson_rng: Rate parameter is 3.93004e+009, but must be less than 1.07374e+009 (in 'model1ccc7b1634d9_4777964a4776825856bcc2b668400d73' at line 62)

Exception: neg_binomial_2_rng: Random number that came from gamma distribution is 2.42065e+09, but must be less than 1.07374e+09 (in 'model374b74903a9de_c40e9cc628eaff01ac4aa8574b302991' at line 63)

Hey! A few thoughts…

### Warm-up?

Hm, I guess these values only occur during warm-up, right? Because in the plot, that you posted, y_rep looks fine (and those come from post-warmup-up iterations). You could try running stan with the save_warmup = FALSE option and see if you still get that error.

### Priors?

Also, you could tweak your priors a bit, especially those of the NegBinom. For example, cauchy(0,5) can have extremely large values and having this as a prior for overdispersion is perhaps overkill. You could try

...
parameters {
...
real log_phi[N];
}
...
model{
...
log_phi ~ normal(0, 1);
...
... neg_binomial_2_lpmf(y[i]|mu[1,i], exp(log_phi[i])) ...
}

this way you have the prior median for \phi at 1, which implies “no over- or under-dispersion”, but have some wiggle-room to go above or below, putting equal prior probability on over- and under-dispersion.

(Also, just as an aside… Letting \phi vary by observation strikes me as unusual. It’s not that it couldn’t be reasonable, but I think normally you’d just have one parameter \phi that measures the dispersion of the whole distribution—like the \sigma in linear regression. Maybe having just one common \phi will give you a bit more stable inference?)

In general, you’d want to do proper prior predictive checks. There’s a lot of good resources on that for you to read, if your not familiar with it. It boils down to do the same as with posterior predictive checks, but using only the prior.

### RNGs?

If you still get the error, then you could try something like this (look at this thread for reference):

real mu_hat = which == 1 ? mu[i]*exp(ci) : mu[i];
if (log(mu_hat) > 20){ // set some reasonable threshold here.. like a really big number
y_hat[i] = 9999999; // set something here, that you want to put into y_hat in case mu is really large. Or some value that you can "filter" out
} else {
y_hat[i] = 2 + neg_binomial_2_rng(mu_hat, phi[i]);
}

… this is super hacky (and for the values I’ve put in probably even a bit ridiculous), but I think you get the idea.

### Likelihood?

However, you have said, that the maximum you’d expect y to be 11. So, basically the outcome is an integer y \in [2,11]? Are values beyond 11 possible? If no, then I think you should have a look into modeling your data using a Binomial distribution. To get random integer y between (and including) 2 and 11, you could try in the model block binomial_logit(y_ast | 9, alpha + x*beta) and so on. Further down, in generated quantities it would be something like y_hat[i] = 2 + binomial_rng(9, inv_logit(alpha + x[i]*beta)). Note, that now your coefficients would be on the logit-scale, so you’d have to change your priors accordingly (like I said above, look into prior predictive checks if you haven’t already, it’s a great way of coming up with reasonable priors).

In case you want to incorporate under- or over-dispersion, you could try the Beta-Binomial distribution. But here, you’d have to re-parametrize yourself.

I hope the suggestions are helpful!

Cheers! :)

1 Like