Given a hierarchical model, I would like to get posterior predictions for *new* subjects based on how much they deviate from the central tendency.

For example, I fitted a hierarchical model to the `sleepstudy`

dataset:

```
library(tidyverse)
library(brms)
data = read.csv('https://raw.githubusercontent.com/vincentarelbundock/Rdatasets/master/csv/lme4/sleepstudy.csv') %>%
as_tibble(data) %>%
mutate(Subject = factor(Subject))
m <- brm(Reaction ~ Days + (Days | Subject),
data = data)
```

The model summary is:

```
Family: gaussian
Links: mu = identity; sigma = identity
Formula: Reaction ~ Days + (Days | Subject)
Data: data (Number of observations: 180)
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup samples = 4000
Group-Level Effects:
~Subject (Number of levels: 18)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 26.86 6.77 15.31 41.80 1.00 1842 2492
sd(Days) 6.58 1.53 4.17 9.98 1.00 1394 1715
cor(Intercept,Days) 0.08 0.30 -0.47 0.68 1.01 988 1534
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept 251.38 7.43 236.79 266.35 1.00 1827 2620
Days 10.36 1.69 6.86 13.67 1.00 1502 1389
Family Specific Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma 25.93 1.52 23.17 29.11 1.00 3481 2594
Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
```

Now, I would like to get posterior predictions for three, *new* prototypical subjects: the average subject and the two subjects that deviate the most (e.g., 2 standard deviations) from the central tendency in the negative and positive direction. For example, in the figure below, I color-coded the posterior prediction for each subject based on the average effect of `Days`

(the slope of the lines). The dashed line is the central tendency of the population. The average subject could be someone like `332`

, and the extreme subjects could be someone resembling `308`

and `335`

.

Is it possible to sample new subjects based on the random effect magnitude? If I understand things correctly, `brms`

already offers posterior predictions for the average subject (with `re_formula = NA`

) and I think a combination of `re_formula=NULL`

, `allow_new_levels=TRUE`

, and `sample_new_levels="gaussian"`

(for an explanation of these option see also Understanding sample_new_levels = "uncertainty", "gaussian", and "old_levels") could get me close to what I am trying to achieve. I am thinking something like

```
posterior_epred(m,
newdata = data.frame(Subject = "extreme_subject"),
allow_new_levels = TRUE,
sample_new_levels = "gaussian",
scale = 6.58*2 # sample new subject at 2 standard deviations (sd(Days) = 6.58)
)
```

where `scale`

is a hypothetical new option to control where to sample the new subject

First, does it make sense? And second, would you be able to help :)? I am sure there are already some established procedures I am not aware of yet. Thanks

- Operating System: Win 10
- brms Version: 2.13.3