Question regarding posterior_predict used on new observations with new covariate levels

I am trying to understand what posterior_predict does for a fitted brms model when the new data contain new levels of a group-level effect.

Consider the following dataset:


library(brms)
library(carData) # provides the Salaries dataset

dat = Salaries[, c("salary", "discipline", "sex", "rank")]

dat$rank = as.character(dat$rank)
dat$discipline = as.character(dat$discipline)
dat$sex = as.character(dat$sex)

# the test data with a new level for the covariate rank
test_dat = expand.grid(rank = paste0(1, "_A"),
                       discipline = "B",
                       sex = "Female")

The new data point test_dat looks like this:

  rank discipline    sex
1  1_A          B Female

I fit the model below and then call posterior_predict with allow_new_levels = TRUE.

brms_model = brm(formula = salary ~ discipline + sex + (1|rank),
                 data = dat,
                 prior = c(
                   prior(normal(0, 100000), class=b),
                   prior(normal(0,100000), class=Intercept),
                   prior(normal(0, 100000), class = sd),
                   prior(normal(0, 100000), class = sigma)
                 ),
                 chains = 1,
                 cores = 1,
                 control = list(adapt_delta = 0.99,max_treedepth=15))

# get posterior predictive samples for test_dat
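# returns one row per posterior draw
# (1000 here: 1 chain with the default iter = 2000 and warmup = 1000)
test_dat_ppd = posterior_predict(brms_model,
                                 newdata = test_dat,
                                 allow_new_levels = TRUE) # permit levels of rank that are not in dat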

How are the 1000 posterior predictive samples actually generated for test_dat when the level 1_A does not occur in dat? From reading the documentation of extract_draws.brmsfit, I think it proceeds as follows:

Let p_asstprof, p_assocprof, p_prof be the empirical proportions of the three levels of dat$rank:

for (i in 1:1000) {
   sample r ~ Categorical(p_asstprof, p_assocprof, p_prof)
   get 1 posterior predictive sample for c(r, B, Female)
}

Is this correct?
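
To make the guess concrete, here is a minimal base R sketch of the procedure hypothesized above. It only illustrates the question; it is not brms's actual implementation. The simulated "posterior draws", the counts in rank_counts, and the helper name predict_new_level are all made up for this example, and the fixed effects for discipline and sex are left out for brevity.

```r
set.seed(1)

n_draws = 1000

# Illustrative counts of each existing rank level (assumed, not read from dat)
rank_counts = c(AsstProf = 67, AssocProf = 64, Prof = 266)
p_rank = rank_counts / sum(rank_counts)  # empirical proportions

# Simulated stand-ins for posterior draws, one row/element per draw
intercept = rnorm(n_draws, mean = 100000, sd = 2000)
sigma = abs(rnorm(n_draws, mean = 20000, sd = 1000))
r_rank = cbind(AsstProf  = rnorm(n_draws, -20000, 1500),
               AssocProf = rnorm(n_draws,  -8000, 1500),
               Prof      = rnorm(n_draws,  25000, 1500))

# Hypothesized procedure: for each draw, pick an existing level with
# probability equal to its empirical proportion, then predict as if the
# new level were that existing level.
predict_new_level = function(intercept, r_rank, sigma, p_rank) {
  n = length(intercept)
  picked = sample(colnames(r_rank), size = n, replace = TRUE, prob = p_rank)
  # matrix indexing: element (i, column of picked[i]) for each draw i
  r_new = r_rank[cbind(seq_len(n), match(picked, colnames(r_rank)))]
  rnorm(n, mean = intercept + r_new, sd = sigma)
}

ppd = predict_new_level(intercept, r_rank, sigma, p_rank)
length(ppd)
```

Note that brms also has a sample_new_levels argument that is used together with allow_new_levels, so the real behavior may differ from this sketch, for example in how the existing levels are weighted.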

  • Operating System: macOS 10.14.6
  • brms Version: 2.12.0

@paul.buerkner will know.
