I think I have this working now. I'll outline what I ended up doing (I used a combination of ideas suggested by you both, @jsocolar and @maxbiostat):
- I created pointwise log-likelihoods for each unique count in the `generated quantities` block whilst fitting the model. These only needed to be calculated once per unique count, so this was not memory intensive.
- (In R) I chose a subset of the iterations on which to perform `loo`. I expanded the pointwise LLs by their observed frequencies in the original data, whilst making sure that the order of the expanded LLs matched that of the original (uncompressed) count data.
- I realised that I can't do LOO-PIT with discrete data (after staring at terrible LOO-PIT plots for some time... I eventually found this useful thread: Worst LOO-PIT plots ever. PP Checks great).
- I still simulated new data from the posterior draws in my sub-sample of iterations by calculating the vector of probabilities \alpha_{1}, ..., \alpha_{K} up to some 'generous' upper count c_{K}, and used these as probabilities in a multinomial distribution.
- I used these simulated values for graphical posterior predictive checks, as per Graphical posterior predictive checks using the bayesplot package • bayesplot
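For anyone following along, the expansion step (second bullet) can be sketched as follows. This is an illustrative Python/NumPy translation of logic I did in R, with toy values throughout: the unique counts, frequencies, and log-likelihoods below are all made up for the example.

```python
import numpy as np

# Hypothetical compressed data: 3 unique counts and the order in which
# the counts appear in the original (uncompressed) dataset.
unique_counts = np.array([0, 1, 2])
y = np.array([0, 1, 0, 2, 0, 1, 0])  # original observations, original order

# Pointwise log-likelihood computed only once per unique count:
# rows = posterior draws, columns = unique counts (toy values).
ll_unique = np.log(np.array([[0.5, 0.3, 0.2],
                             [0.6, 0.3, 0.1]]))

# Map each original observation to its unique-count column, expanding the
# compressed matrix to one column per observation, in the original order.
col_idx = np.searchsorted(unique_counts, y)
ll_full = ll_unique[:, col_idx]

# ll_full now has one column per observation, ready to pass to loo.
assert ll_full.shape == (ll_unique.shape[0], y.size)
```

The key point is that the column order of `ll_full` follows `y`, not the unique counts, so the expanded matrix lines up with the raw data.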
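The simulation step (computing \alpha_{1}, ..., \alpha_{K} up to a generous upper count and drawing replicated data from a multinomial) can be sketched like this. Again a hedged Python illustration, not my actual R code: the Poisson pmf stands in for whatever per-draw probabilities the fitted model implies, and `c_K`, `lam`, and `n_obs` are made-up values.

```python
import numpy as np
from math import factorial

rng = np.random.default_rng(1234)

c_K = 20                       # 'generous' upper count (assumed value)
counts = np.arange(c_K + 1)
lam = 3.0                      # toy stand-in for one posterior draw's rate

# Truncated pmf over counts 0..c_K, renormalised so it sums to 1.
alpha = np.array([np.exp(-lam) * lam**k / factorial(k) for k in counts])
alpha = alpha / alpha.sum()

# One replicated dataset: multinomial frequencies over the count grid,
# then expand the frequencies back into n_obs simulated observations.
n_obs = 7
freqs = rng.multinomial(n_obs, alpha)
y_rep = np.repeat(counts, freqs)
```

One `y_rep` per retained posterior draw gives the `yrep` matrix that bayesplot's `ppc_*` functions expect for the graphical checks.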