Brms / tidybayes predicted values

I think I have a decent grasp of the difference between fitted and predicted values. However, I am slightly confused about how the uncertainty in the posterior is incorporated into the posterior predicted values.

If I use the function add_predicted_draws(), I can specify ‘n’ for the number of predictions I want it to make - so if I am doing a posterior predictive check using my original data as input, I can set it to 1 to get one prediction per participant in the original data set.

n = The number of draws per prediction / fit to return, or NULL to return all draws.
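For concreteness, here is a minimal sketch of what I mean, with `fit` standing in for a fitted brms model and `df` for the original data (both just placeholder names):

```r
# Minimal sketch, not a definitive recipe: `fit` is a placeholder for a
# fitted brms model and `df` for the original data used to fit it.
# Note: newer tidybayes versions call this argument `ndraws` instead of `n`.
library(brms)
library(tidybayes)
library(dplyr)

one_draw <- df %>%
  add_predicted_draws(fit, n = 1)

# One prediction per row of the original data, all taken from a single
# posterior iteration:
nrow(one_draw) == nrow(df)
```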

If I set it to NULL, it makes ‘all’ possible predictions - which seems to be one prediction for every iteration in the posterior. If I set the number to 1 or 10, for example, does the function randomly select 1 or 10 iterations to make predictions from, does it just use the first 10 iterations, or does it do something else?

Finally, if I set it to NULL, can I consider the resulting predictions to be a full ‘posterior distribution of predicted values’?

Hi @JimBob, hopefully this will help:

If I set the number to 1, or 10, for example, does the function randomly select 1 or 10 iterations to make predictions from, or for example just use the first 10 iterations, or something else?

Randomly. Try running the function twice with n = 1 to verify. The function’s documentation also hints at this, since it lets you set a seed for the random draws.
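For example, a quick check (placeholder names `fit` and `df`, as in the sketch above):

```r
# Without a seed the selected draw differs between calls; with `seed` it
# is reproducible. `fit` and `df` are placeholder names for a fitted brms
# model and its data.
library(tidybayes)
library(dplyr)

df %>% add_predicted_draws(fit, n = 1) %>% head()
df %>% add_predicted_draws(fit, n = 1) %>% head()   # typically different

df %>% add_predicted_draws(fit, n = 1, seed = 123) %>% head()
df %>% add_predicted_draws(fit, n = 1, seed = 123) %>% head()  # identical
```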

Finally, if I set it to NULL, can I consider the resulting predictions to be a full ‘posterior distribution of predicted values’?

There will be more samples of predicted values, but whether that constitutes the “full” distribution is debatable and potentially confusing.

Thanks for your response - that’s very helpful. Do you know whether the sampling is with or without replacement? If it is without replacement, then although ‘full’ distribution may be the wrong term, what I mean is one set of predictions for every iteration in the posterior samples. For example, when my model has 24000 posterior samples (24000 iterations), leaving n at NULL produces 24000 sets of predictions. So my sense is that, if the sampling is without replacement, these predictions would reflect all of the posterior samples.
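A sketch of the check I have in mind (again with `fit` and `df` as placeholder names for the model and data):

```r
# If every posterior iteration is used exactly once per row, the number of
# distinct .draw indices should equal the total number of posterior samples,
# and no .draw should repeat within a row.
library(tidybayes)
library(dplyr)

all_draws <- df %>%
  add_predicted_draws(fit)          # n left at NULL: use all posterior draws

n_distinct(all_draws$.draw)         # should equal the number of posterior
                                    # samples, e.g. 24000

all_draws %>%
  group_by(.row) %>%
  summarise(any_repeated_draw = any(duplicated(.draw)))  # all FALSE if each
                                                         # iteration is used once
```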