Getting all comparisons without releveling factors and refitting the model

In categorical predictors of more than two levels, the models takes one level as a reference and it compares the rest of the levels with the reference (thus showing N_of_levels - 1 posterior estimates in the end). Nevertheless, the output of the model does not show all possible comparisons of the levels within the categorical variable. One way around it is to relevel() the factor and refit the model. But if the models takes an hour to fit for example, this comes at computational cost. Is there another way around it which would not require refitting the model or re-leveling a factor?

Here is a toy example: I have one variable (group) with three levels (adults, adolescents and children) and I am interested in comparisons among them.

adults = rnorm(100, 0, 1)
adolescents = rnorm(100, 10, 1)
children = rnorm(100, 20, 1)

data = data.frame(values = c(adults, adolescents, children), group = rep(c("adults", "adolescents", "children"), each = 100))

brm1 = brm(values ~ group, data = data, family = gaussian)

When I run brm1, I get the following population-level effects:

              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept         9.82      0.10     9.63    10.02 1.00     3247     2374
groupadults      -9.67      0.15    -9.95    -9.39 1.00     3349     2755
groupchildren    10.08      0.15     9.79    10.37 1.00     3449     2857

Here the reference level is “adolescents” (because it’s alphabetically first). What’s missing from this table is the comparison (coefficient) between children and adults. To get this, I can relevel data$values in order to change the reference level and refit the model. Here, the model is super fast, so that’s not a problem. But if the model takes an hour to fit, is there a way to see the children/adults coefficient of comparison without refitting the model?

Thank you!

  • Operating System: macOS Mojave 10.14.4
  • brms Version: 2.12.0

Hello! You could look extract the posterior samples of the parameters of interest, and take the difference between them yourself:

adult_children_diff <- samples$b_groupadults - samples$b_groupchildren

some_summary <- function(x) {
    posterior_mean = mean(x),
    posterior_variance = var(x),
    lower_95_ci = quantile(x, 0.025, names = FALSE),
    upper_95_ci = quantile(x, 0.025, names = FALSE)


but I can’t remember if this is a valid thing to do without changing the reference level (I think it is in this instance?). Depending on the complexity of your actual example, and the desired outputs (plots / summary statistics) this might be a bit manual.

Here are a few ways to handle this issue. The examples are taken from chapter 5 of the first edition of McElreath’s text.