Confusion on difference between posterior_epred() and posterior_predict() in a mixed effects modelling context

The distinction between epred and predict is always whether you are talking about the distribution (uncertainty) of individual cases (predict) or of the average/expectation (epred). E.g., in the context of a single-level normal regression model, does the distribution include just the uncertainty in the mean/average/expectation (epred), or also the individual-level variation (sigma) and its uncertainty (predict)?

In a mixed effects model, there is the complication that we have both individuals and groups. The choice of epred vs predict only concerns “individuals vs average/expectation”. It has nothing to do with groups or group effects. predict there concerns the distribution of individuals within a group. epred concerns uncertainty in the average of a group.

How you handle group-level parameters doesn’t involve a choice between these functions. That is controlled by the re_formula argument. If you want to know uncertainty in existing-group-A’s mean, then epred with re_formula = NULL (include all random effects estimated from group A’s data). If you want to know uncertainty in existing-group-A individuals, then predict with re_formula = NULL (again including all random effects estimated from group A’s data). If you want to know uncertainty in the average of a new group you don’t have any data on, then set the group variables to new values and use epred with re_formula = NULL (in brms you also need allow_new_levels = TRUE), which will draw random values for the group parameters and make the distribution wider, since we don’t have any data on the group mean. If you want to know uncertainty in individuals in that new group, predict with a new group value and re_formula = NULL. Again, re_formula controls which group-level parameters we condition on to get group-level uncertainty; epred vs predict determines whether we want uncertainty in the group average or in individual cases.

If we want uncertainty in the population average, epred with re_formula = NA. This fixes all of the group parameters to 0. predict with re_formula = NA doesn’t make much sense to me. That would describe uncertainty in individuals within a group whose parameters are all exactly zero. That’s not a useful estimate IMO.
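
To make the re_formula cases concrete, here is a rough Python sketch that mimics them by hand for a random-intercept model. It does not call brms; the "posterior draws" are made-up numbers, every variable name is hypothetical, and drawing a new group's offset from a normal with the group-level sd is just one way software can handle unseen groups.

```python
import random
import statistics

rng = random.Random(42)
n_draws = 4000

# Hypothetical posterior draws for y ~ 1 + (1 | group); all numbers are made up.
b0 = [rng.gauss(10.0, 0.5) for _ in range(n_draws)]          # population intercept
u_a = [rng.gauss(1.5, 0.3) for _ in range(n_draws)]          # group A's estimated offset
tau = [abs(rng.gauss(2.0, 0.3)) for _ in range(n_draws)]     # sd of group offsets
sigma = [abs(rng.gauss(1.0, 0.1)) for _ in range(n_draws)]   # residual sd

# Existing group A, group effects included: use A's own offset draws.
epred_a = [b + u for b, u in zip(b0, u_a)]
predict_a = [rng.gauss(m, s) for m, s in zip(epred_a, sigma)]

# New group, group effects included: draw an unseen offset for each posterior draw.
u_new = [rng.gauss(0.0, t) for t in tau]
epred_new = [b + u for b, u in zip(b0, u_new)]
predict_new = [rng.gauss(m, s) for m, s in zip(epred_new, sigma)]

# Group effects excluded (offsets fixed at 0): population-level expectation.
epred_pop = list(b0)

results = {
    "epred A": epred_a,
    "predict A": predict_a,
    "epred new": epred_new,
    "predict new": predict_new,
    "epred pop": epred_pop,
}
for name, draws in results.items():
    print(f"{name:12s} mean = {statistics.mean(draws):6.2f}  sd = {statistics.stdev(draws):.2f}")
```

In every case the predict version is the epred version plus residual noise; what changes between cases is only which group offset goes into the expectation.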

At a purely computational level, with a normal likelihood, epred vs predict is just a difference in whether sigma and its uncertainty are included in the distribution or not. Random effects handling doesn’t differ between the two. That’s the re_formula argument.
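
As a minimal sketch of that computational point, assuming an intercept-only normal model and made-up posterior draws (nothing here comes from a real fit, and the variable names are hypothetical):

```python
import random
import statistics

rng = random.Random(1)
n_draws = 4000

# Hypothetical posterior draws standing in for what a fitted model would return.
mu_draws = [rng.gauss(10.0, 0.5) for _ in range(n_draws)]
sigma_draws = [abs(rng.gauss(2.0, 0.2)) for _ in range(n_draws)]

# epred-style draws: the expectation only, so just the draws of mu.
epred = mu_draws

# predict-style draws: one simulated individual per posterior draw,
# adding residual noise using that draw's sigma.
predict = [rng.gauss(m, s) for m, s in zip(mu_draws, sigma_draws)]

print("sd of epred draws:  ", round(statistics.stdev(epred), 2))
print("sd of predict draws:", round(statistics.stdev(predict), 2))
```

Both sets of draws are centered in about the same place; the predict draws are simply wider because sigma is layered on top of the uncertainty in the mean.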

“predict” has little if anything to do with “new observations” vs “observed data”. Both epred and predict describe post-data, model-implied uncertainty: in the average (epred) or in individual cases (predict). The same applies in frequentist and Bayesian modeling.
