This question is primarily about the interpretation of diagnostics.
Here’s my situation: I fit a logistic regression model to predict the outcome of a binary event. The model fits without issue and I’m currently in the process of model checking.
Part of my model checking has included using arviz to create loo_pit plots. The first is the default and the second is “the difference between the LOO-PIT Empirical Cumulative Distribution Function (ECDF) and the uniform CDF”.
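To make the quoted description of the second plot concrete, here is a minimal pure-Python sketch of "ECDF minus uniform CDF" for a set of PIT values. It is only an illustration of the quantity being plotted, not arviz's implementation; the function name `ecdf_minus_uniform` and the example PIT values are hypothetical and do not come from my model.

```python
from bisect import bisect_right


def ecdf_minus_uniform(pit_values, grid):
    """Return ECDF(u) - u for each u in grid, given PIT values in [0, 1]."""
    ordered = sorted(pit_values)
    n = len(ordered)
    # ECDF(u) is the fraction of PIT values <= u.
    # The Uniform(0, 1) CDF evaluated at u is simply u.
    return [bisect_right(ordered, u) / n - u for u in grid]


# Hypothetical PIT values, for illustration only.
pit = [0.05, 0.20, 0.30, 0.32, 0.35, 0.60, 0.90]
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
diff = ecdf_minus_uniform(pit, grid)

for u, d in zip(grid, diff):
    print(f"u={u:.2f}  ECDF(u) - u = {d:+.3f}")
```

By construction the difference is zero at u = 1, and also at u = 0 when no PIT value is exactly 0, so the curve always starts and ends on the horizontal axis.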
Here are the plots:
In this first plot, I feel like the fit is mostly okay with an issue around 0.3.
The ECDF plot is below:
Again, I believe I see the issue around 0.3, but here I’m not quite sure what the plot straying out of the credible interval around 0.7-0.8 tells me. The first plot looks fine in that region.
My questions are as follows:
- In the first plot, what kind of conclusions can I draw from the visual deviations from the Uniform distribution other than “my model has some issues in certain areas”?
a) I’ve taken a look at posts such as this, as well as the relevant section in BDA3, to get a better idea of what I’m looking at, but I still don’t quite know how to interpret these plots.
- How does the second graph differ in the information it’s giving me compared to the first? I feel like I’m seeing some contradictory information: the second graph strays out of the 94% credible interval around 0.7-0.8, but that region looks perfectly in line in the first graph.
- If you were to see these diagnostic graphs, what would your next step in finding issues be? I certainly have some posterior predictive checks in mind, but what I’m interested in is: do these LOO-PIT graphs inspire particular posterior predictive checks?
I apologize if the questions aren’t particularly informed; LOO-PIT is new to me. I appreciate any responses!