Specify priors with consistent implications for multiple factors using brms

I’m working on analyzing some experimental ecological data, where some demographic rate (y) is expected to vary by species and experimental treatment. Let’s say that I have some prior knowledge about the range of y that I would like to use to specify weakly informative priors, but that this knowledge is agnostic to species or treatment. As such, I would like to set priors that yield the same (or similar) distribution of prior expectations for any given combination of species and treatment. Perhaps I’m missing something obvious, but I’ve been struggling to achieve this using the R formula syntax in brms.

Here is some R code to generate synthetic data and simulate prior expectations:

library(brms)
library(tidybayes)
library(ggplot2)

set.seed(505)

# Specify study design
n <- 1000
n_species <- 8
n_treatment <- 2

# Specify coefficients
Intercept <- 0
b_species <- c(0, rnorm(n_species - 1, 0, 1))
b_treatment <- c(0, rnorm(n_treatment - 1, -0.3, 0.5))

# Simulate data
species <- sample(seq(1:n_species), n, replace = TRUE)
treatment <- sample(seq(1:n_treatment), n, replace = TRUE)
df <- data.frame(species, treatment)
logit_mu <- apply(df, 1, function(x) Intercept + b_species[x[1]] + b_treatment[x[2]])
mu <- inv_logit_scaled(logit_mu)
trials <- 30
y <- rbinom(n, trials, mu)
df$species <- as.factor(df$species)
df$treatment <- as.factor(df$treatment)
df$y <- y
df$trials <- trials

# Inspect simulated data
ggplot(df, aes(x = species, y = y, fill = treatment)) +
  geom_boxplot() +
  theme_bw()

# Specify model formula
form <- bf(
  y | trials(trials) ~ species*treatment,
  family = binomial()
)

# Specify priors
priors <- c(
  set_prior("normal(0, 1.5)", class = "Intercept"),
  set_prior("normal(0, 1.5)", class = "b")
)

# Sample from priors
fit_prior <- brm(
  form,
  data = df,
  prior = priors,
  sample_prior = "only"
)

# Prepare new data for prediction
newdf <- expand.grid(
  species = levels(df$species),
  treatment = levels(df$treatment),
  trials = 1
)

# Plot prior expectations
newdf |>
  add_epred_draws(fit_prior) |>
  ggplot(aes(x = .epred, color = species)) +
  geom_density() +
  facet_wrap(vars(treatment), labeller = "label_both") +
  xlab("y") +
  theme_bw()

Simulated data:

Prior expectations:

Although the prior means are the same for all species and treatments (because that is how I specified the priors), the distributions of prior expectations are quite different. I take it that this is a consequence of how species 1 with treatment 1 is being used as the reference level, with each successive coefficient adding further spread around the prior mean. This might be especially stark here because of the logit link function.

Removing the intercept (y ~ 0 + species*treatment) can make it so that prior expectations are the same for all species in treatment 1, but the deviations will still persist in treatment 2. I could achieve the desired consistency of prior implications by making the priors for b really narrow, but this doesn’t seem right considering the meaning of those coefficients.

I think my main issue is that these factor levels are essentially exchangeable indices (?), so I would expect them to yield the same expectations. Does what I’m trying to achieve here make sense? Does anyone have recommendations for how to specify priors for multiple categorical predictors to this end? Or perhaps how to better think about the Intercept and b terms in brms?

System info:

  • Operating System: macOS Ventura 13.4
  • brms Version: 2.20.4

The issue your are describing arises because you are adding up a different number of effects to obtain the posterior means of the different conditions. Likely the factors your are entering as predictor are coded as treatment contrasts. You can check these by evaluating contrasts(species)and contrasts(treatment). If one level, likely the first factor level, is used as the reference, that is coded with 0 for all contrasts these are treatment contrasts.

The package ‘bayestestR’ has a designated functions to address this problem ‘contr.equalprior’: contr.equalprior function - RDocumentation By assigning these contrasts to all your factors you should circumvent the problem.

# set priors for consistent prior distributions across all factor levels
contrasts(species) <- bayestestR::contr.equalprior
contrasts(treatment) <- bayestestR::contr.equalprior

Then reestimate the models and see if the problem is resolved.

Thank you - I wasn’t aware of the contr.equalprior option in “bayestestR”. This does indeed allow me to set equivalent priors on all of the factor levels/combinations! I have a couple follow-up questions/observations though:

TLDR:

  1. How might I avoid setting narrow priors for b when I’m already setting a relatively wide prior for Intercept?

  2. How does one interpret model coefficients when using contr.equalprior? These coefficients seem difficult to interpret upon some preliminary exploration, and I’m uncertain about how to think about the priors for them.

  3. Is there a way to directly set priors on the means of multiple categorical predictors in “brms”?


(1) Even with contrasts set to contr.equalprior, with the population-level Intercept present, I still need to tighten the prior for b more than I feel makes sense in order to get what I’m aiming for. That is, with Intercept ~ normal(0, 1.5) and b ~ normal(0, 1.5), the prior expectations come out looking similar to what I had for species 2-8 in treatment 1 from the figure from above, as opposed to the flatter prior expectation for species 1. Does this also arises from adding together Intercept and b? I know I can adjust the priors, but I’m trying to understand what causes this behavior.

(2) With contr.equalprior, how exactly do I interpret the coefficients that I’m estimating/setting priors for? I fit the following model to the simulated data and tried digging into the summary table, but I’m finding it difficult to interpret the coefficients.

Model:

form <- bf(
  y | trials(trials) ~ species*treatment,
  family = binomial()
)

priors <- c(
  set_prior("normal(0, 1.5)", class = "Intercept"),
  set_prior("normal(0, 0.5)", class = "b")
)

fit <- brm(form, data = df, prior = priors)

Summary table:

summary(fit)
                    Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept              -1.04      0.01    -1.06    -1.01 1.00     7471     3321
species1                0.41      0.05     0.31     0.50 1.00     6689     3259
species2                0.69      0.04     0.61     0.78 1.00     7461     3115
species3                0.96      0.03     0.90     1.03 1.00     8740     3172
species4               -0.55      0.04    -0.62    -0.48 1.00     8949     2898
species5               -0.99      0.04    -1.07    -0.91 1.00     6582     3539
species6               -0.09      0.04    -0.15    -0.02 1.00     9460     2772
species7               -0.04      0.04    -0.11     0.04 1.00     7644     2999
treatment1             -0.16      0.02    -0.21    -0.12 1.00     7560     3122
species1:treatment1     0.02      0.07    -0.11     0.16 1.00     7209     3451
species2:treatment1    -0.02      0.06    -0.14     0.09 1.00     8887     3448
species3:treatment1    -0.05      0.05    -0.14     0.04 1.00     9065     3089
species4:treatment1     0.05      0.05    -0.06     0.15 1.00     9894     3322
species5:treatment1    -0.01      0.06    -0.12     0.11 1.00     6566     3274
species6:treatment1    -0.01      0.05    -0.11     0.08 1.00     8721     3429
species7:treatment1    -0.04      0.05    -0.14     0.06 1.00     8195     2730

Compare this to the point estimate table:

bayestestR::point_estimate(emmeans::emmeans(fit, ~ treatment | species))
Parameter | Median |  Mean |   MAP
----------------------------------
1 1       |   0.02 |  0.02 |  0.02
2 1       |  -0.29 | -0.29 | -0.29
1 2       |  -1.13 | -1.13 | -1.15
2 2       |  -1.35 | -1.35 | -1.36
1 3       |  -1.34 | -1.34 | -1.34
2 3       |  -1.52 | -1.52 | -1.53
1 4       |  -2.09 | -2.09 | -2.08
2 4       |  -2.33 | -2.33 | -2.34
1 5       |  -1.07 | -1.07 | -1.07
2 5       |  -1.23 | -1.23 | -1.24
1 6       |  -0.48 | -0.48 | -0.48
2 6       |  -0.77 | -0.77 | -0.77
1 7       |  -0.38 | -0.38 | -0.38
2 7       |  -0.61 | -0.61 | -0.61
1 8       |  -0.88 | -0.88 | -0.89
2 8       |  -1.12 | -1.12 | -1.12

I can see that the point estimate table aligns well with the simulated data, but how are the coefficients used to get these point estimates? In an attempt to understand the mechanics, I tried coding up a function that (I think) manually does this:

point_estimate_manual <- function(fit, species, treatment) {
  # Contrast matrices
  c_sp <- contrasts(fit$data$species)
  c_trt <- contrasts(fit$data$treatment)
  # Coefficients
  post <- as.data.frame(as_draws_df(fit))
  b_Intercept <- mean(post$b_Intercept)
  b_sp <- sapply(post[,paste0("b_species", 1:ncol(c_sp))], mean)
  b_trt <- mean(post[,paste0("b_treatment", 1:ncol(c_trt))])
  b_sp_trt <- sapply(post[,paste0("b_species", 1:ncol(c_sp), ":treatment", 1:ncol(c_trt))], mean)
  # Compute expectation
  E <-
    b_Intercept +
    c_sp[species,] %*% b_sp +
    c_trt[treatment,] %*% b_trt +
    (c_sp[species,] * c_trt[treatment,]) %*% b_sp_trt
  E <- round(E, 2)
}

newdf2 <- expand.grid(treatment = 1:2, species = 1:8)

E <- apply(newdf2, 1, function(x) point_estimate_manual(fit, x[2], x[1]))

cbind(newdf2, Mean = E)
   treatment species  Mean
1          1       1  0.02
2          2       1 -0.29
3          1       2 -1.13
4          2       2 -1.35
5          1       3 -1.34
6          2       3 -1.52
7          1       4 -2.09
8          2       4 -2.33
9          1       5 -1.07
10         2       5 -1.23
11         1       6 -0.48
12         2       6 -0.77
13         1       7 -0.38
14         2       7 -0.61
15         1       8 -0.88
16         2       8 -1.12

This seems to do the job, but having gone through all of that, it feels exceedingly difficult to interpret the coefficients when contrasts are coded in this way. If so, should I adjust my expectations and rely on things like posterior predictions and “emmeans” for my inferential needs? Or am I vastly overcomplicating things here?

(3) I think what I would like to do is directly set priors on and estimate the means of all species by treatment combinations (i.e. the values shown in the point estimate tables). I suppose I could do this with Stan code by indexing an array of parameters for categorical predictors (e.g. array[8, 2] int b; to_array_1d(b) ~ normal(0, 1.5)), but I currently have a workflow that uses “brms”, so was hoping for a way to do this within the package.

Apologies for the lengthy post - answers or recommendations for any of the questions I listed here would be much appreciated.

Regarding your follow up questions:

  1. As long as you are using the combination of an Intercept (i.e. the estimated values for all predictors at level 0) and effects (i.e. the contrasts coded in the contrast matrix) the priors of the means of conditions that are computed as the sum of the intercept and effect and effect will always have wider priors. Specifically, this arises because of adding up normally distributed random variables: Sum of normally distributed random variables - Wikipedia

In your case the SD of the Intercept prior is 1.5 and the SD of the b priors is 0.5 to the prior SD for any mean that is computed via the sum of the intercept and one b will be sqrt(1.5 + 0.5). If some means are sums of the Intercept plus 2 effects b the SD would be sqrt(1.5 + 0.5 + 0.5) This is likely what you observed in the simulation of your original post.

  1. The interpretation of the coefficients is indeed less straightforward with contr.equalprior. The Intercept basically represents the grand mean across all factor combinations. So, if you would just compute the mean of all data irrespective of the condition they belon to this should be matching ?inv_logit_scaled(Intercept). The b` coefficients can be interpreted as weighted differences between different conditions, for which the weights are the values in the contrast matrix.

If you were interested in directly estimating all means (also as a response to 3) you can set the formula to 0 + species:treatment. As you described in your original post this supresses the intercept, but insted of the formula using the * operator that essentially translates to 0 + species + species:treatment, you are estimating the means of all factorial combinations of species and treatment. The formula you specfieid using the * operator estimated the means for all species levels and then estimated and effect that captures the differences for each species between treatment 1 and 2. By using the : you “only” estimate the interaction and can obtain the means for all combinations directly. This way you should be able to directly specify a prior for all estimated means.

Essentially all these specification are re-parameterizations of the same models. So, the estimated parameters should practically be identically, if you have sufficient data to overwhelm the priors. So, the priors would mainly play a role in estimating BayesFactors.

I hope these clarifications help.

1 Like

These clarifications have been immensely helpful - thank you so much for the detailed walk-through. I’m especially glad to have learned about the behavior of the 0 + x1:x2 formula (clearly I need to do some more reading about R formulas), as I often find myself thinking about multiple interacting factors in this way. I suspect this might also be helpful for some folks coming from Statistical Rethinking, which I think uses this sort of index/mean coding extensively.