Just following up with my currently hacky and ugly solution that I’m using for now. I’ll probably need to fix it up at a later point to accommodate more complicated scenarios (at the moment, it only handles one random effect for example). But hopefully this helps someone else out there, or at least makes it easier to point to a more established and robust solution to this task.
I’m modelling a pharmacokinetic curve over time, so I have about 20 values within each ID over time. So I first create modelmat
, which contains one row for each ID, from my modeldat
with all the data. This is because all my fixed effects are at an ID level - i.e. they don’t differ within any given ID.
modelmat <- modeldat %>%
filter(!duplicated(ID)) %>%
mutate(IDno = 1:n())
Then I want to extract summaries of my parameters for each random ID, after accounting for the fixed effects. I do so with a series of functions. First, the helper functions:
# Get the fixed effects formula
formula_fix <- function(object, parno) {
formula <- paste0("~ ", as.character(object$formula$pforms[[parno]][3]))
as.formula(stringr::str_remove(formula, " ?\\+ ?\\(.*\\)"))
}
# Get the random effect identifier
formula_ran <- function(object, parno) {
formula <- paste0("~ ", as.character(object$formula$pforms[[parno]][3]))
stringr::str_match(formula, "\\(.*\\| ?(.*)\\)")[2]
}
# Get the name of a fixed effect parameter from its number
parname_get <- function(object, parno) {
as.character(object$formula$pforms[[parno]][2])
}
# Get the number of a fixed effect parameter from its number
parno_get <- function(object, parname) {
which(names(object$formula$pforms)==parname)
}
# Get the samples for an individual parameter from an individual curve
samples_par_i <- function(drawsdf_fixpar, drawsdf_ranpar, modelmatrix, IDno, parname) {
covar <- modelmatrix[IDno,]
covar_mat <- matrix(rep(covar, each=nrow(drawsdf_fixpar)), ncol=length(covar))
fsamples <- apply(covar_mat * drawsdf_fixpar, 1, sum)
rsamples <- select(drawsdf_ranpar, contains(paste0(parname, "[", IDno, ",")))
as.numeric(unname(unlist(fsamples+rsamples)))
}
# Summarise the samples for a single individual and a single parameter
summarise_par_indiv <- function(drawsdf_fixpar, drawsdf_ranpar, modelmatrix, IDno, parname) {
samples <- samples_par_i(drawsdf_fixpar, drawsdf_ranpar, modelmatrix, IDno, parname)
tibble(
Estimate = mean(samples),
Est.Error = sd(samples),
"l-95% CI" = quantile(samples, probs = 0.025),
"u-95% CI" = quantile(samples, probs = 0.975),
)
}
# Summarise the samples for all individuals for a particular parameter
summarise_par_group <- function(object, modelmat, parname) {
parno <- parno_get(object, parname)
modelmatrix <- model.matrix(formula_fix(object, parno), modelmat)
ran_ef <- formula_ran(object, parno)
draws <- posterior::as_draws(object$fit)
drawsdf <- posterior::as_draws_df(draws)
drawsdf_fixpar <- drawsdf %>%
select(contains(paste0("b_", parname)))
drawsdf_ranpar <- drawsdf %>%
select(starts_with(paste0("r_",ran_ef,"__", parname)))
as_tibble(modelmatrix) %>%
mutate(IDno = 1:n(),
Parameter = parname) %>%
select(IDno, Parameter, everything()) %>%
group_by(IDno) %>%
mutate(Summary = map(IDno, ~summarise_par_indiv(drawsdf_fixpar, drawsdf_ranpar,
modelmatrix, .x, parname))) %>%
ungroup() %>%
unnest(Summary)
}
And then the main function using the helper functions, where object is the brms output, and modelmat is described above.
ranef_preds <- function(object, modelmat) {
draws <- as_draws(object$fit)
drawsdf <- posterior::as_draws_df(draws)
predprep <- brms::prepare_predictions(object)
parnames <- names(predprep$nlpars)
parnos <- map_dbl(parnames, ~parno_get(object, .x))
bind_rows( map(parnames, ~summarise_par_group(object, modelmat, .x)) )
}
And this gives me a summary that looks like this
> summarydat[245:255,]
# A tibble: 11 x 8
IDno Parameter `(Intercept)` GroupGroup2 Estimate Est.Error `l-95% CI`
<int> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 45 logbpnd 1 0 0.904 0.0250 0.850
2 46 logbpnd 1 0 0.892 0.0257 0.833
3 47 logbpnd 1 0 0.906 0.0214 0.863
4 48 logbpnd 1 0 0.908 0.0252 0.860
5 49 logbpnd 1 0 0.888 0.0281 0.823
6 50 logbpnd 1 0 0.904 0.0246 0.855
7 51 logbpnd 1 1 1.00 0.0255 0.958
8 52 logbpnd 1 1 1.01 0.0281 0.961
9 53 logbpnd 1 1 1.00 0.0229 0.961
10 54 logbpnd 1 1 0.987 0.0255 0.930
11 55 logbpnd 1 1 1.00 0.0243 0.956
# … with 1 more variable: `u-95% CI` <dbl>
where logbpnd is one of my parameters (like A,B,C etc in the example above) that I’ve simulated to show group differences. So this gives me the outcomes to compare to what I would have got from the output of the nls() model, after accounting for the effects of my fixed effects in my hierarchical model.
So, about this implementation: it’s slow, and it’s probably quite fragile, and it only handles one random effect at a time. So it’s currently pretty hacky, but if it helps anyone, then I’d be delighted. Also, if this makes it easier to point to an existing function which does the same thing better, that’d be a great outcome too :).