Categorical interaction effects while using index notation in {brms} models

The problem

The popular book ‘Statistical Rethinking’ by Richard McElreath uses index variables (instead of indicator/dummy variables) during specification of Bayesian models. Solomon Kurz kindly translated the code for use with {brms}.

Although the aforementioned books cover how to use index variables for interactions with continuous terms, it is less clear how this would be achieved with two categorical variables. I know I should use the non-linear {brms} syntax, and I attempted three models below (both should be easily reproducible).

I am uncertain if it is reasonable to simply add the interaction term; furthermore, the additional terms in the output become a bit trickier to understand. I am not sure where to begin except by plotting.

Setup

Below is some fake simulated data creating a positive association between outcome by group, but a reversed relationship appears when examined by outcome, group and covar.

set.seed(1295)

# Libraries
library(dplyr)
library(ggplot2)
library(brms)

# Fake data where overall 1 group is + associated but by covar they have opposite
covar_vec <- c('Problem', 'NoProblem')
fake_data <- data.frame(
  group = factor(c(rep(1, 10), rep(2, 10))),
  covar = factor(c(rep(covar_vec[1], 5), rep(covar_vec[2], 5),
                   rep(covar_vec[1], 5), rep(covar_vec[2], 5))),
  outcome = factor(c(1,1,1,1,0,
                     0,0,0,1,1,
                     0,0,0,1,1,
                     1,1,1,0,0))
  )

Examine data relationship

Data shows the positive and negative assocations

# Plot to confirm relationship
fake_data_sum1 <- fake_data |> dplyr::group_by(covar, outcome) |> dplyr::summarize(n = dplyr::n())
fake_data_sum2 <- fake_data |> dplyr::group_by(group, outcome) |> dplyr::summarize(n = dplyr::n())
fake_data_sum3 <- fake_data |> dplyr::group_by(group, covar, outcome) |> dplyr::summarize(n = dplyr::n())

ggplot(fake_data_sum3) +
  geom_col(aes(x = group, y = n, fill = outcome)) +
  facet_wrap(~covar, nrow = 2) +
  theme_bw()

Models

Created models with no interaction and with an interaction using index variables. Priors are assumed to be flat/very weak on the outcome scale. The model without the interaction terms appears to only discover the linear relationship. Once group:covar is included, the reverse direction is seen. The second method for including interactive terms is very similar, though perhaps not identical. I am just unclear as to if this is an acceptable approach; the summary table becomes increasingly difficult to understand without plots (which to my understanding, is common for models like these…)

noInt <- brm(bf(outcome ~ 0 + a + b,
                a ~ 0 + group,
                b ~ 0 + covar,
                nl = TRUE),
             prior = c(prior(normal(0, 1.75), nlpar = a),
                       prior(normal(0, 1.75), nlpar = b)),
             data = fake_data, 
             family = bernoulli(link = 'logit'), 
             chains = 4, cores = 4)

wInt <- brm(bf(outcome ~ 0 + a + b + c,
               a ~ 0 + group,
               b ~ 0 + covar,
               c ~ 0 + group:covar,
               nl = TRUE),
            prior = c(prior(normal(0, 1.75), nlpar = a),
                      prior(normal(0, 1.75), nlpar = b),
                      prior(normal(0, 1.75), nlpar = c)),
            data = fake_data, 
            family = bernoulli(link = 'logit'), 
            chains = 4, cores = 4)

# Syntax similar to books... looks same as `:` terms
wInt2 <- brm(bf(outcome ~ 0 + a + b * covar,
               a ~ 0 + group,
               b ~ 0 + group,
               nl = TRUE),
            prior = c(prior(normal(0, 1.75), nlpar = a),
                      prior(normal(0, 1.75), nlpar = b)),
            data = fake_data, 
            family = bernoulli(link = 'logit'), 
            chains = 4, cores = 4)

# Plot outcomes
gridExtra::grid.arrange(
  plot(conditional_effects(wInt, 'group:covar'), plot = FALSE)[[1]] + ggtitle('w/Interaction'),
  plot(conditional_effects(wInt2, 'group:covar'), plot = FALSE)[[1]] + ggtitle('w/Interaction2'),
  plot(conditional_effects(noInt, 'group:covar'), plot = FALSE)[[1]] + ggtitle('w/oInteraction')
)

Summary tables


image

Conditional plots

System Info

  • Operating System: Win10
  • brms Version: 2.20.4

Hi @aobrien and welcome to the Stan forums!

What exactly would you want to do?

For example, if you want the estimated parameters to reflect the four means you can do outcome ~ 0 + group %in% covar. You can get quite far with just variations of the basic R model syntax including +, :, |, / etc. Take a look at ?formula.

Does that help?

Thanks @matti!

What I am trying to do is:

  1. Create a crossed interaction between two categorical index variables, that allow flexible direction in slopes, in a model that has as few easily interpretable parameters as possible. I understand how to do this with dummy variables, but it is trickier with index variables while using {brms} as I have to delve into nonlinear syntax to ensure the intercepts are treated appropriately.
  2. Understand differences in parameters among the options in {brms} for creating these terms, even though they seem to end up with similar conditional plots. Determine what approach is best in terms of easy of interpretation for crossed factor interactions.

The %in% notation appears to be a nesting (a + a:b), not the full crossing (a + b + a:b), for the interaction. Is there a certain reason you’d recommended using %in% over * in this case?

The two methods (and the %in% version) had similar results but different parameters, but I suspect the second one is easier to understand (see below):?

# Model 1
outcome ~ 0 + a + b + c,
 a ~ 0 + group,
 b ~ 0 + covar,
 c ~ 0 + group:covar

Model 1 (wInt) has double the parameters compared to model 2, with similar plots. But in this case each interaction is separate. Although I can guess at how to combine the log odds, I am confused as to their individual parameter interpretations (does b_covarNoProblem have any particular meaning by itself or has to be in combination with others like a_group1 and c_group1:covarNoProblem?).

# Model 2
outcome ~ 0 + a + b * covar,
 a ~ 0 + group,
 b ~ 0 + group

My (possibly wrong) interpretation for the 4 possible combinations in Model 2 (wInt2), at their mean value, is:

logodds_Grp1NoProb = -0.08
logodds_Grp1Prob = -0.08 + 1.22 = 1.14
logodds_Grp2NoProb = 0.25
logodds_Grp2Prob = 0.25 - 0.54 = -0.29

Where each a_* value is a group at the baseline covar and adding b_* is the unique increase for each group at the next covar factor level.

All of this is to say, although I appear to capture the pattern in these models, I am unclear in how the parameterization of them affects how I should interpret the Estimates.

Although it doesn’t answer the interpretation question between Model 1 and Model 2, I did stumble across this thread by @Solomon, which provides input into how this interaction could be specified: How to properly compare interacting levels - #6 by Solomon

For my example this would be outlined as below; it appears to be equivalent to Model 1.

outcome  ~ 0 + a + b + c,
         a ~ 0 + group,
         b ~ 0 + covar,
         c ~ (0 + group) : (0 + covar)
1 Like

Sorry I think I just don’t understand. Maybe someone else can answer :/

fwiw if you have four cells: G1C1, G1C2, G2C1, & G2C2, then the “as few as easily interepretable parameters” could be the four means I posted above. The upshot is that its often easier to put priors on the means than their differences, and you can calculate any contrast after fitting the model.