Thanks for tagging me on this @jgoldberg! And I think that @mbc hits the nail on the head when they say that the choice of prior/posterior predictive check really depends on the features of the ecological system that you’re most interested in capturing.

I’ve got two examples from my own work. In this paper, we included the workflow for doing both prior and posterior predictive checks in the supplementary materials.

We used multiple quantities, such as the number never captured again after initial release, the number captured in each temporal stratum at each monitoring location, and the total number captured.

For prior predictive checks, we were concerned with making sure that the priors produced a broad but reasonable range of values. In particular, we wanted to make sure that the simulated fish were somewhat evenly distributed across the temporal strata. The concern here was that, because the monitoring period was limited, we had a situation somewhat akin to a logit or similar transform, where a too-diffuse prior would result in a strange U-shaped distribution with peaks in the tails. And because all of the prior pieces were moving together, we wanted to make sure that we didn’t accidentally produce a strong prior on the product of multiple parameters that went unexamined. (I think this is somewhat unavoidable though. Consider the distribution of the product of two Uniform(0,1) random variables. It’s not Uniform.)
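To make those two points concrete, here is a minimal simulation sketch using only the Python standard library. The prior scales (10 and 1.5 on the logit scale) and the tail cut-offs are illustrative assumptions on my part, not the priors from the paper.

```python
import math
import random

def inv_logit(x):
    """Map a value on the logit scale back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def tail_fraction(draws, lo=0.05, hi=0.95):
    """Fraction of draws that land in the extreme tails of (0, 1)."""
    return sum(d < lo or d > hi for d in draws) / len(draws)

rng = random.Random(1)
n = 20000

# Diffuse vs. moderate normal priors on the logit scale (illustrative scales).
diffuse = [inv_logit(rng.gauss(0.0, 10.0)) for _ in range(n)]
moderate = [inv_logit(rng.gauss(0.0, 1.5)) for _ in range(n)]

# Implied prior on the product of two independent Uniform(0, 1) parameters.
product = [rng.random() * rng.random() for _ in range(n)]

frac_diffuse_tails = tail_fraction(diffuse)    # most mass piles up near 0 and 1
frac_moderate_tails = tail_fraction(moderate)  # far less mass in the tails
frac_product_below_half = sum(p < 0.5 for p in product) / n  # well above 0.5

print(frac_diffuse_tails, frac_moderate_tails, frac_product_below_half)
```

The diffuse prior puts most of its mass in the tails once transformed, and the product of the two uniforms is clearly skewed toward zero, which is the kind of thing that is easy to miss without simulating from the priors.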

We just had a paper come out last month (author’s link) where we used posterior predictive checks with a bit of prolepsis in mind. We suspected that some readers might question whether it was appropriate to pool hatchery-raised and wild fish and assume that they had similar capture probabilities. To address that, we separated the two groups of fish in the posterior predictive checks (even though they were modelled as pooled in many respects). The reasoning was that if one group of fish had posterior predictive intervals that systematically under- or over-predicted, that might give us a reason to go back in and add a term to the model to account for the difference due to origin.
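A rough sketch of that kind of group-wise check is below. Everything in it is made up for illustration: the release sizes, the observed counts, and the Beta draws that stand in for posterior draws of a single pooled capture probability. It is not the model from the paper.

```python
import random

rng = random.Random(42)

def binomial(n, p):
    """Draw one Binomial(n, p) count using only the standard library."""
    return sum(rng.random() < p for _ in range(n))

def central_interval(draws, level=0.9):
    """Approximate central interval taken from the sorted draws."""
    s = sorted(draws)
    lo = s[int((1 - level) / 2 * (len(s) - 1))]
    hi = s[int((1 + level) / 2 * (len(s) - 1))]
    return lo, hi

# Hypothetical data: 500 fish released per origin group, different recaptures.
released = {"hatchery": 500, "wild": 500}
observed = {"hatchery": 90, "wild": 160}

# Stand-in for posterior draws of one pooled capture probability
# (Beta posterior from 250 recaptures out of 1000 under a flat prior).
p_draws = [rng.betavariate(251, 751) for _ in range(1000)]

results = {}
for group, n in released.items():
    # The model is pooled, but the replicates are summarised per group.
    reps = [binomial(n, p) for p in p_draws]
    lo, hi = central_interval(reps)
    # Share of replicates at or above the observed count (a tail probability).
    tail = sum(r >= observed[group] for r in reps) / len(reps)
    results[group] = {"interval": (lo, hi), "tail_prob": tail}

print(results)
```

In this toy case the pooled model over-predicts one group and under-predicts the other, so the observed counts fall outside the intervals in opposite directions. That pattern is what would have prompted us to add an origin term.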

We also observed something cool (well, I think it’s cool) where it looked like a higher-level summary statistic (the sum of detected PIT-tagged individuals in a release group over all temporal strata) was poorly replicated (in the sense that fewer than 90% of the posterior predictive intervals (PPIs) contained the observed value). On the other hand, PPIs for our more granular summary statistic had much better coverage. We think the explanation for this has to do with the precision of the estimates at the more granular level (which comes from choices in our parameterization). The consequence of high precision was that if the PPI for one stratum of a release group missed the observed value, then it was very likely that the PPI for the sum over all the strata for that release group would also miss. Once we realized that, we decided not to worry about that summary statistic performing less well, especially because the data for the PIT-tagged fish were in some ways only ancillary to our objective of estimating abundance.
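For anyone wanting to tally coverage at both levels, a minimal sketch follows. The data are simulated, the per-stratum detection probabilities are hypothetical, and the replicates come from fixed probabilities with independent strata, which is a simplification that ignores posterior uncertainty. With real output you would swap in your observed counts and posterior predictive draws.

```python
import random

rng = random.Random(7)

def binomial(n, p):
    """Draw one Binomial(n, p) count using only the standard library."""
    return sum(rng.random() < p for _ in range(n))

def covers(obs, draws, level=0.9):
    """True if the central interval of the draws contains the observed value."""
    s = sorted(draws)
    lo = s[int((1 - level) / 2 * (len(s) - 1))]
    hi = s[int((1 + level) / 2 * (len(s) - 1))]
    return lo <= obs <= hi

n_groups, n_draws, n_tagged = 30, 300, 40
p_strata = [0.05, 0.10, 0.20, 0.10, 0.05]  # hypothetical detection probabilities
n_strata = len(p_strata)

# observed[g][s]: detections for release group g in stratum s (simulated).
observed = [[binomial(n_tagged, p) for p in p_strata] for _ in range(n_groups)]

# replicated[g][d][s]: replicate d for group g, stratum s.
replicated = [
    [[binomial(n_tagged, p) for p in p_strata] for _ in range(n_draws)]
    for _ in range(n_groups)
]

# Granular statistic: one interval per group-by-stratum cell.
stratum_hits = [
    covers(observed[g][s], [rep[s] for rep in replicated[g]])
    for g in range(n_groups)
    for s in range(n_strata)
]

# Higher-level statistic: one interval per group, summed over strata.
sum_hits = [
    covers(sum(observed[g]), [sum(rep) for rep in replicated[g]])
    for g in range(n_groups)
]

cov_stratum = sum(stratum_hits) / len(stratum_hits)
cov_sum = sum(sum_hits) / len(sum_hits)
print(cov_stratum, cov_sum)
```

Because this toy example is well specified, both coverages should land near the nominal level. A gap between the two, like the one we saw, is a cue to look at how the model and the aggregation interact rather than proof that something is broken.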

So those are some specific examples, but in general I think @jgoldberg and @mbc had it right: you should probably think about what summary statistic(s) is most revealing of whichever features you’re trying to best capture. It’s good to look at several, but they don’t all have to “hit”, especially if they’re less important features for your objective.

That said, for LOOIC, it really is worth thinking about the fact that the unit of replication (the “one” in leave-one-out) is an individual animal - which means that the log-likelihood of an individual’s capture history is an important quantity to consider.
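As a sketch of what that pointwise quantity looks like, the code below computes the log-likelihood of each individual’s capture history under a deliberately simple assumption: independent Bernoulli captures on each occasion with occasion-specific capture probabilities. The histories and the "posterior draws" are invented, and this is not the likelihood from either paper. The point is only the shape of the result, one value per draw per individual.

```python
import math

def capture_history_loglik(history, p):
    """Log-likelihood of one 0/1 capture history given per-occasion probabilities."""
    return sum(
        math.log(p_t) if y else math.log(1.0 - p_t)
        for y, p_t in zip(history, p)
    )

def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))

# Hypothetical capture histories: one row per individual, one column per occasion.
histories = [[1, 0, 1], [0, 0, 1], [1, 1, 0]]

# Hypothetical posterior draws of the per-occasion capture probabilities.
p_draws = [[0.5, 0.5, 0.5], [0.4, 0.3, 0.6], [0.6, 0.4, 0.5]]

# log_lik[d][i]: draw d, individual i. The individual is the unit left out.
log_lik = [[capture_history_loglik(h, p) for h in histories] for p in p_draws]

# Pointwise log predictive density per individual (not yet leave-one-out).
lppd_i = [
    log_mean_exp([row[i] for row in log_lik]) for i in range(len(histories))
]
print(lppd_i)
```

A draws-by-individuals matrix like `log_lik` is the kind of input that LOO software typically expects, so it is worth checking that each column really corresponds to one animal’s full capture history.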

I’ll also drop these slides for you. They’re from a talk that was heavily inspired by (and liberally borrowed from) @betanalpha.

An Introduction to the Bayesian Workflow for Mark-Recapture.pdf (2.1 MB)

It’s more than a few years old now, but might be of use to you.