As part of understanding better the (Bayesian) stats I would like to replicate the outcome of the conditional_effects by hand (ggplot)
By using the following model
job::job({
fit3 <- brm(time | cens(censored) ~ age * sex + disease + (1|patient),
data = kidney, family = lognormal(),
cores = 4, chains = 4, seed = 60224)
})
I can get this plot by conditioning on disease == other
cond <- make_conditions(kidney, vars = c("disease")) %>% filter(disease == 'other')
p_null <- conditional_effects(fit3,
effects = 'age:sex',
conditions = cond,
re_formula = ~(1|patient))
Now I’m trying to replicate this plot by
- Creating a grid of data I want to plot (age and sex keeping constant disease == ‘orther’),
- Adding the epred drwas using re_formula = ~(1|patient) (as before),
- Ploting it
The data grid:
## Function to create a sequence from min to max based on a variable in a data.frame
seqf <- function(x, var , length.out = 21){
seq(min(x[, var]),
max(x[, var]),
length.out = 21
)
}
dat_grid1 <- expand_grid(
age = seqf(x=kidney, var= 'age'),
sex = kidney$sex %>% unique %>% sort,
disease = 'other',
patient = kidney$patient %>% unique
)
The epred draws and the plot
dat_grid1 %>%
add_epred_draws(fit3, re_formula = ~(1|patient), allow_new_levels = F) %>%
ggplot(aes(x = age ,
y = .epred,
color = sex, group = sex, fill = sex)) +
stat_lineribbon(.width = 0.0, alpha = 1) +
stat_ribbon(.width = 0.75, alpha = .3) +
theme_bw()
Now, something is wrong, the outcomes are different and “more” annoyingly my plot take 5 times more time to be generated.
Of course,
My data has ~1600 rows,
The data from conditional_effects is like 300 rows - plot(p_null, plot = F)[[1]]$data
but I don’t undestand how it can accept the argument re_formula = ~(1|patient)
when in the data the patient level has NA