Posterior predictive: one dataset per sample, or per block of samples?

Let’s say we have a given dataset with 20 observations and we’ve fit a mixed-effects model, producing 2000 posterior samples. We want to generate 100 datasets of the same size as the given dataset for comparison.

Should we

  1. take 100 of the 2000 samples (discarding the remaining 1900) and generate a single 20-obs dataset per sample, including drawing multiple random-effect levels from that single posterior sample, or

  2. draw a single observation from each of the 2000 samples, drawing only a single level per sample, and assemble each 20-obs dataset from a block of 20 separate samples, resulting in 100 datasets?
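To make the two options concrete, here is a minimal NumPy sketch. It assumes a simple random-intercept model, y_ij = mu + u_j + e_ij, and uses made-up arrays as a stand-in for the 2000 posterior draws. The names `mu`, `sigma_u`, `sigma_e`, `cluster_of_obs`, `scheme_1`, and `scheme_2`, and the grouping of 4 clusters with 5 observations each, are all hypothetical and not taken from the actual fit.

```python
import numpy as np

rng = np.random.default_rng(0)

n_draws, n_obs, n_datasets = 2000, 20, 100

# Stand-in for posterior draws (hypothetical values, not from a real fit).
mu = rng.normal(1.0, 0.2, n_draws)               # population mean
sigma_u = np.abs(rng.normal(0.5, 0.1, n_draws))  # random-effect sd
sigma_e = np.abs(rng.normal(1.0, 0.1, n_draws))  # residual sd

# Assumed grouping of the 20 observations: 4 new clusters, 5 obs each.
cluster_of_obs = np.repeat(np.arange(4), 5)

def scheme_1():
    """Option 1: one whole 20-obs dataset per retained posterior draw."""
    keep = rng.choice(n_draws, size=n_datasets, replace=False)
    out = np.empty((n_datasets, n_obs))
    for d, s in enumerate(keep):
        # Several new random-effect levels, all from the same draw s.
        u = rng.normal(0.0, sigma_u[s], size=4)
        out[d] = rng.normal(mu[s] + u[cluster_of_obs], sigma_e[s])
    return out

def scheme_2():
    """Option 2: one observation per draw, then blocks of 20 draws."""
    u = rng.normal(0.0, sigma_u)     # one new level per posterior draw
    y = rng.normal(mu + u, sigma_e)  # one observation per posterior draw
    return y.reshape(n_datasets, n_obs)

datasets_1 = scheme_1()
datasets_2 = scheme_2()
```

Both functions return a 100 x 20 array. In `scheme_1` the observations in a row share one set of parameters and may share a cluster level, whereas in `scheme_2` every observation in a row comes from its own draw and its own new level.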

If you have access, you might want to take a look at chapter 12 of @richard_mcelreath’s book, Statistical Rethinking. The last section of the chapter covers posterior predictions for multilevel / mixed-effects models. If you don’t have access, you can also check out this lecture on the author’s YouTube channel, which covers chapter 12. The part on posterior predictions starts at around 50 minutes.

Edit: it looks like the book sample includes chapter 12.


I think my question was too ambiguous. I want to reduce the given dataset down to a single number (summary statistic) by applying some function to the responses, such as mean, max, min, median, etc. I also want to apply the same function to the responses of each of the 100 datasets drawn from the posterior. I want to do this by sampling new random-effects levels (or in McElreath’s words “clusters”). Then I want to compare the summary statistic obtained from the given dataset to the summary statistics obtained from the 100 datasets drawn from the posterior.
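The comparison I have in mind could be sketched roughly as follows. This is only an illustration: `compare_statistic` is a hypothetical helper, and `y_obs` and `y_rep` are filled with random placeholder numbers rather than the given dataset or real posterior predictive draws.

```python
import numpy as np

def compare_statistic(y_obs, y_rep, stat=np.mean):
    """Apply `stat` to the observed responses and to each replicated dataset.

    Returns the observed statistic, the array of replicated statistics,
    and the fraction of replicated statistics that are >= the observed one.
    """
    t_obs = stat(y_obs)
    t_rep = np.array([stat(row) for row in y_rep])
    return t_obs, t_rep, float(np.mean(t_rep >= t_obs))

# Placeholder data: 20 observed responses and 100 replicated datasets.
rng = np.random.default_rng(1)
y_obs = rng.normal(1.0, 1.2, size=20)
y_rep = rng.normal(1.0, 1.2, size=(100, 20))

for stat in (np.mean, np.max, np.min, np.median):
    t_obs, t_rep, frac = compare_statistic(y_obs, y_rep, stat)
    print(f"{stat.__name__}: observed = {t_obs:.3f}, "
          f"share of replicates >= observed = {frac:.2f}")
```

The same function would be applied to the 100 datasets however they were generated, so the only thing that changes between the two approaches is how the rows of `y_rep` are built.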

Neither McElreath’s lecture nor Chapter 12 of his book addresses the question I have. I’m not asking about the difference between sampling the average cluster or sampling a new cluster (i.e. not just the difference between sections 12.4.1 and 12.4.2). My question relates to sampling a new cluster, and the difference between two ways of doing that. That is, can a dataset drawn using multiple posterior samples replace a dataset drawn using only a single posterior sample, if we draw 100 such datasets to compare against a given dataset via their summary statistics?