Before providing the details of the actual model, I want to first collect some general ideas about how to adjust a model's specification based on posterior predictive check results when the model seems to be misspecified. One thing I do know: if, for example, I am modeling count data with a Poisson distribution and the observed data show more 0s than the predicted data, it is better to switch to a negative binomial distribution. But beyond that, I am not sure what specific guidance a posterior predictive check can give.
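To make that example concrete, here is a minimal sketch (not from my model; the mean and size values are made up) showing why excess observed zeros point away from Poisson: for the same mean, a negative binomial puts more mass at zero than a Poisson does.

```python
from scipy import stats

mu = 2.0                          # common mean for both distributions (arbitrary choice)
p0_pois = stats.poisson(mu).pmf(0)

# Negative binomial with size r and success prob p, mean = r*(1-p)/p = mu.
r = 0.5                           # small size -> strong overdispersion (arbitrary choice)
p = r / (r + mu)
p0_nb = stats.nbinom(r, p).pmf(0)

print(f"P(0) under Poisson: {p0_pois:.3f}")   # ~0.135
print(f"P(0) under NegBin:  {p0_nb:.3f}")     # larger, despite the same mean
```

So when the observed zero frequency exceeds what the fitted Poisson predicts, moving to a negative binomial (or a zero-inflated model) is one standard response.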
In my case, I am modeling count data with a zero-inflated negative binomial distribution and random effects. The draws from the posterior predictive distribution are much larger than the observed data. Below is the QQ plot, with the x-axis representing the observed data and the y-axis the predicted data. The line is the x = y line.
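For reference, this is roughly how the QQ plot is constructed (a sketch with synthetic stand-in data, not my actual observations or draws): matching quantiles of the observed counts against quantiles of the pooled posterior predictive draws.

```python
import numpy as np

rng = np.random.default_rng(1)
y_obs = rng.negative_binomial(1, 0.4, size=300)         # stand-in for observed counts
y_rep = rng.negative_binomial(1, 0.3, size=(500, 300))  # stand-in for predictive draws

qs = np.linspace(0.01, 0.99, 99)
q_obs = np.quantile(y_obs, qs)          # x-axis: observed quantiles
q_rep = np.quantile(y_rep, qs)          # y-axis: quantiles of the pooled draws

# Plotting q_obs against q_rep with an x = y reference line gives the QQ
# plot; points consistently above the line mean the predictive draws are
# systematically larger than the observed data.
print(np.column_stack([q_obs, q_rep])[:5])
```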
In addition, when comparing the proportion of 0s in the observed data vs. the predicted data, I found that the proportion of 0s in the predicted data is actually higher than in the observed data. So I switched to a plain negative binomial distribution, but it didn't help at all. Is there any general advice for this type of situation? I can provide more details of the model if someone is interested in taking a closer look.
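The zero-proportion comparison I describe can be framed as a posterior predictive test statistic. A minimal sketch (again with hypothetical stand-in data, not my model's draws):

```python
import numpy as np

rng = np.random.default_rng(0)
y_obs = rng.poisson(1.0, size=200)           # stand-in for observed counts
y_rep = rng.poisson(1.5, size=(1000, 200))   # stand-in: one row per predictive draw

t_obs = np.mean(y_obs == 0)                  # observed proportion of zeros
t_rep = np.mean(y_rep == 0, axis=1)          # the same statistic, per draw

# Posterior predictive p-value for this statistic; values near 0 or 1
# flag that the model gets the zero frequency wrong.
ppp = np.mean(t_rep >= t_obs)
print(f"observed: {t_obs:.3f}, predictive mean: {t_rep.mean():.3f}, ppp: {ppp:.3f}")
```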
Some additional quick questions:
- Will model reparameterization affect the posterior predictive check? Put differently: can reparameterization turn a misspecified model into a more correctly specified one?
- I have also seen high autocorrelation during model fitting. Is this related to the model specification, or is it an independent issue related to other things like the sampler and the number of iterations?