Happy New Year! This question could be filed under the category of ‘Stan makes even people with questionable stats knowledge very powerful/dangerous’.

I have a linear logistic predictive probability model that seems to work well (with major caveats).

I’m interested in the probabilities themselves, so I calculate them from the generated y_tildas samples.
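To make that step concrete, here is a minimal sketch of how I turn the draws into probabilities. It assumes the generated samples are stored as a list of draws, each holding one 0/1 value per observation; the function name `predictive_probabilities` and the `draws[s][n]` layout are placeholders for illustration, not my actual code.

```python
def predictive_probabilities(draws):
    """Estimate P(y = 1) for each observation from 0/1 predictive draws.

    Assumed layout: draws[s][n] is the simulated outcome (0 or 1)
    for observation n in posterior draw s.
    """
    n_draws = len(draws)
    n_obs = len(draws[0])
    # The share of draws equal to 1 is the Monte Carlo estimate of the probability.
    return [
        sum(draws[s][n] for s in range(n_draws)) / n_draws
        for n in range(n_obs)
    ]


# Toy example: 4 draws for 3 observations.
toy_draws = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 0, 1],
    [1, 0, 1],
]
toy_probs = predictive_probabilities(toy_draws)
```

With only a few draws these estimates are coarse (multiples of 1/number of draws), so the resolution at the extreme ends depends on how many draws are kept.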

I then compare the actual outcome for xx with the predicted probabilities implied by the samples (e.g.) …

When I bin the predictive probabilities and look at the actual outcome, I get the following Z scores (assuming a binomial process/variance at the average probability of the samples in that bin).
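For reference, the z-score calculation looks roughly like the sketch below. It assumes equal-width bins on the probability scale and a binomial variance evaluated at the mean predicted probability of each bin; the function name `binned_z_scores` and the bin count are illustrative choices, not necessarily what I ran.

```python
import math


def binned_z_scores(probs, outcomes, n_bins=10):
    """Compare observed successes with the binomial expectation in each bin.

    probs:    predicted probabilities, one per observation (0..1)
    outcomes: actual 0/1 outcomes, same order as probs
    Returns one entry per bin: None for an empty bin, otherwise
    (n, mean_p, observed, expected, z).
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        # Equal-width bins (an assumption); p == 1.0 goes in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    rows = []
    for members in bins:
        n = len(members)
        if n == 0:
            rows.append(None)
            continue
        mean_p = sum(p for p, _ in members) / n
        observed = sum(y for _, y in members)
        expected = n * mean_p
        # Binomial standard deviation at the bin's average probability.
        sd = math.sqrt(n * mean_p * (1.0 - mean_p))
        z = (observed - expected) / sd if sd > 0 else float("nan")
        rows.append((n, mean_p, observed, expected, z))
    return rows


# Toy example: four observations, all predicted at 0.5, three successes.
toy_rows = binned_z_scores([0.5, 0.5, 0.5, 0.5], [1, 1, 1, 0], n_bins=2)
```

One caveat with this approximation: when the probabilities inside a bin differ, the binomial variance at the average probability is at least as large as the sum of the individual p(1 - p) terms, so these z-scores lean slightly conservative.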

Lastly, here are the sample sizes for each percentile bin:

So basically, at very high and very low predicted probabilities, the actual outcomes do not look like draws from a binomial process with the probabilities implied by the generated samples. I’ve tried nonlinear terms, but they are not significant.

I think the process (even at tremendously high/low levels of input) has an inherently random quality that is not captured by the model or the type of model. Any advice is deeply appreciated.

Thank you!!