I am trying to understand what posterior_predict does for a fitted brms object when the new data contain new levels of a group-level effect.
Consider the following dataset:
library(car)
library(brms)
dat = Salaries[,c("salary","discipline","sex","rank")]
dat$rank = as.character(dat$rank)
dat$discipline = as.character(dat$discipline)
dat$sex = as.character(dat$sex)
# the test data with new levels for the covariate rank
test_dat = expand.grid(rank = paste0(1, "_A"),
                       discipline = "B",
                       sex = "Female")
The new data point test_dat looks like this:
  rank discipline    sex
1  1_A          B Female
I fit the model below and then call posterior_predict with allow_new_levels=TRUE.
brms_model = brm(formula = salary ~ discipline + sex + (1|rank),
                 data = dat,
                 prior = c(
                   prior(normal(0, 100000), class = b),
                   prior(normal(0, 100000), class = Intercept),
                   prior(normal(0, 100000), class = sd),
                   prior(normal(0, 100000), class = sigma)
                 ),
                 chains = 1,
                 cores = 1,
                 control = list(adapt_delta = 0.99, max_treedepth = 15))
# get posterior predictive samples for test_dat
test_dat_ppd = posterior_predict(brms_model,
                                 newdata = test_dat,
                                 summary = FALSE,
                                 nsamples = 1000, # the number of posterior predictive samples
                                 allow_new_levels = TRUE) # rank "1_A" is not in dat
How are the 1000 posterior predictive samples actually generated for test_dat when the level 1_A does not occur in dat? From reading the documentation of extract_draws.brmsfit, I think it proceeds as follows.
Let p_asstprof, p_assocprof, p_prof be the empirical proportions of the three levels of dat$rank:
for (i in 1:1000) {
  sample r ~ Categorical(p_asstprof, p_assocprof, p_prof)
  get 1 posterior predictive sample for c(r, B, Female)
}
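To make this guess concrete, here is a minimal base R sketch of the procedure I have in mind. It does not call brms and is not taken from its source. The objects rank_obs, r_draws, picked and r_new are made-up stand-ins (toy rank counts and simulated "posterior draws" of the rank intercepts), and the assumption that levels are picked with their empirical proportions is exactly the part I am unsure about.

```r
set.seed(1)

# Toy stand-in for dat$rank (hypothetical counts, not the real Salaries data)
rank_obs <- rep(c("AsstProf", "AssocProf", "Prof"), times = c(2, 3, 5))

# Empirical proportions of the three existing levels
p_rank <- prop.table(table(rank_obs))

n_draws <- 1000

# Hypothetical posterior draws of the rank intercepts:
# one row per draw, one column per existing level (simulated numbers)
r_draws <- matrix(rnorm(n_draws * length(p_rank)),
                  nrow = n_draws,
                  dimnames = list(NULL, names(p_rank)))

# For each draw, pick one existing level with probability p_rank ...
picked <- sample(names(p_rank), size = n_draws, replace = TRUE,
                 prob = as.numeric(p_rank))

# ... and use that level's intercept from the same draw for the new level
r_new <- r_draws[cbind(seq_len(n_draws), match(picked, colnames(r_draws)))]
```

Under this guess, r_new would then be added to the fixed-effects part of the linear predictor for discipline B and sex Female, and one salary would be simulated per draw using that draw's sigma.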
Is this correct?
- Operating System: macOS 10.14.6
- brms Version: 2.12.0