Specifying formulas for interactions between categorical variables with the index coding approach in brms

Hi,
I want to use the index coding approach with brms, but I wonder if I have applied and understood it correctly. Most examples show the approach for a single categorical factor with multiple levels, but in my case I have two factors with several levels each, and I want to look at the interaction between them. Each participant also has multiple observations at each level of the factors. I have specified three models that I want to compare:

modA <- brm(data = d,
            family = gaussian,
            performance ~ 0 + course + (0 + course | bib),
            prior = c(prior(normal(0, 0.5), class = b),
                      prior(student_t(3, 0, 2.5), class = sigma),
                      prior(student_t(3, 0, 2.5), class = sd),
                      prior(lkj(2), class = cor)),
            control = list(adapt_delta = 0.95),
            file = "modA_test",
            iter = 4000, cores = 4, seed = 1337)

modB <- brm(data = d,
            family = gaussian,
            performance ~ 0 + course + day + (0 + course + day | bib),
            prior = c(prior(normal(0, 0.5), class = b),
                      prior(student_t(3, 0, 2.5), class = sigma),
                      prior(student_t(3, 0, 2.5), class = sd),
                      prior(lkj(2), class = cor)),
            control = list(adapt_delta = 0.95),
            file = "modB_test",
            iter = 4000, cores = 4, seed = 1337)


modC <- brm(data = d,
            family = gaussian,
            performance ~ 0 + course + day + course:day + (0 + course + day + course:day | bib),
            prior = c(prior(normal(0, 0.5), class = b),
                      prior(student_t(3, 0, 2.5), class = sigma),
                      prior(student_t(3, 0, 2.5), class = sd),
                      prior(lkj(2), class = cor)),
            control = list(adapt_delta = 0.95),
            file = "modC_test",
            iter = 4000, cores = 4, seed = 1337)

In modA, I want to examine the difference between the courses, ignoring information about the Day factor.

In modB, I want to explore the estimated change/improvement from day 1 to day 5.

In modC, I want to understand if the changes were different in the three courses.

Given my goal, are the model formulas correctly specified? Or do I have to use the brms non-linear syntax? From chapter 8, "Conditional Manatees," of the book Statistical rethinking with brms, ggplot2, and the tidyverse: Second edition, it seems that the non-linear syntax is only necessary when you have an interaction between a categorical factor and a continuous variable. Please correct me if I am wrong.

For the interaction model, try this instead:

performance ~ 0 + course:day + (0 + course:day | bib)
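
To see why the two formulas differ, it may help to look at the design matrices base R builds for them. This is only a sketch on a made-up toy grid (`d_toy` is hypothetical, assuming three courses and two days); as far as I know brms builds its population-level design matrix the same way, but check the coefficient names in your own fit.

```r
# Hypothetical toy design: 3 courses x 2 days (day 1 and day 5), one row per cell.
d_toy <- expand.grid(
  course = factor(c("a", "b", "c")),
  day    = factor(c("1", "5"))
)

# modC-style formula: only `course` gets a full set of indicators;
# `day` and `course:day` are coded as contrasts against the first levels.
X_full <- model.matrix(~ 0 + course + day + course:day, data = d_toy)
colnames(X_full)
# "coursea" "courseb" "coursec" "day5" "courseb:day5" "coursec:day5"

# Suggested formula: one indicator column per course-by-day cell.
X_cell <- model.matrix(~ 0 + course:day, data = d_toy)
colnames(X_cell)
# "coursea:day1" "courseb:day1" "coursec:day1"
# "coursea:day5" "courseb:day5" "coursec:day5"
```

Both matrices have six columns, so they describe the same set of cell means. The difference is what each coefficient means: in the first, a `normal(0, 0.5)` prior lands on differences and differences of differences; in the second, it lands directly on each cell mean.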

As to the second model, I’m not aware of a way to get brm() to do what you want it to do without the non-linear syntax. Here’s what that could look like for your use case:

bf(performance ~ 0 + c + d,
   c ~ 0 + course + (0 + course |i| bib), 
   d ~ 0 + day + (0 + day |i| bib),
   nl = TRUE)
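
For what it's worth, here is a small base R sketch of why the plain `0 + course + day` formula does not give an index for every level of `day`. The toy grid `d_toy` is made up for illustration (three courses, two days); the names `X_add`, `X_c`, and `X_d` are mine.

```r
# Hypothetical toy design: 3 courses x 2 days, one row per cell.
d_toy <- expand.grid(
  course = factor(c("a", "b", "c")),
  day    = factor(c("1", "5"))
)

# Additive formula without an intercept: only the first factor is index coded,
# the second factor loses its first level to a treatment contrast.
X_add <- model.matrix(~ 0 + course + day, data = d_toy)
colnames(X_add)
# "coursea" "courseb" "coursec" "day5"

# Splitting the factors into two separate sub-formulas (as the non-linear
# syntax does for `c` and `d`) gives each factor its full set of indicators.
X_c <- model.matrix(~ 0 + course, data = d_toy)
X_d <- model.matrix(~ 0 + day, data = d_toy)
colnames(X_d)
# "day1" "day5"

# Five index columns, but only four linearly independent directions.
qr(cbind(X_c, X_d))$rank
# 4
```

The last line is a reminder that the fully indexed additive model is overparameterized, so the course and day indices are only pinned down with the help of their priors.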

Thanks @Solomon. So I don’t need the main effects of course and day included in the interaction model? I was a bit surprised to learn that. The other models seem to work fine. Thanks.

To my mind, McElreath’s index approach to interaction models avoids concepts like “main effects.” Rather, his approach simply returns the mean for each group.
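
A quick way to convince yourself of this is to use ordinary least squares as a stand-in for brms. The sketch below uses simulated data (`d_sim` and its effect sizes are invented for illustration) and plain `lm()`, not the Bayesian model; with the priors and the multilevel structure in brms the estimates will be shrunk somewhat and will not match the raw means exactly.

```r
set.seed(1)

# Hypothetical data: 3 courses x 2 days, 20 observations per cell.
d_sim <- expand.grid(
  course = factor(c("a", "b", "c")),
  day    = factor(c("1", "5")),
  rep    = 1:20
)
d_sim$performance <- rnorm(
  nrow(d_sim),
  mean = as.integer(d_sim$course) + 0.5 * (d_sim$day == "5")
)

# Cell-means formula: one coefficient per course-by-day combination.
fit <- lm(performance ~ 0 + course:day, data = d_sim)

# Raw group means, arranged as a course-by-day table.
cell_means <- with(d_sim, tapply(performance, list(course, day), mean))

# The coefficients are the group means, in the same order.
all.equal(unname(coef(fit)), as.vector(cell_means))

# Change from day 1 to day 5 within each course, computed after the fact.
cell_means[, "5"] - cell_means[, "1"]
```

Quantities like "improvement from day 1 to day 5" or "did the improvement differ between courses" are then computed as differences of these means (in brms, as differences of posterior draws) rather than read off a single coefficient.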

Thank you. It takes some time to consolidate his approach :)

I guess you could also extend the non-linear syntax to more complex interaction models, such as:

bf(performance ~ 0 + a + b * mTime,
   a ~ 0 + course:day + (0 + course:day |i| bib),
   b ~ 0 + course:day + (0 + course:day |i| bib),
   nl = TRUE)

mTime is a continuous variable in this case.
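
One thing I noticed: with the non-linear syntax, priors of class `b` and `sd` seem to need an `nlpar` argument naming the non-linear parameter they belong to. Here is a rough sketch of how the priors from my earlier models might carry over. The prior scales are simply copied from those models and are an assumption, not a recommendation (in particular, `normal(0, 0.5)` may not suit a slope on `mTime`), and the `brm()` call is left commented out because I have not checked it against real data.

```r
library(brms)

# Formula as above; `a` holds the cell-specific intercepts,
# `b` the cell-specific slopes on the continuous predictor `mTime`.
f <- bf(performance ~ 0 + a + b * mTime,
        a ~ 0 + course:day + (0 + course:day |i| bib),
        b ~ 0 + course:day + (0 + course:day |i| bib),
        nl = TRUE)

# Priors tagged by non-linear parameter (scales are placeholders).
priors <- c(prior(normal(0, 0.5), class = b, nlpar = a),
            prior(normal(0, 0.5), class = b, nlpar = b),
            prior(student_t(3, 0, 2.5), class = sd, nlpar = a),
            prior(student_t(3, 0, 2.5), class = sd, nlpar = b),
            prior(student_t(3, 0, 2.5), class = sigma),
            prior(lkj(2), class = cor))

# Untested sketch of the fit; `d` is the data frame from my first post.
# fit <- brm(f, data = d, family = gaussian, prior = priors,
#            control = list(adapt_delta = 0.95),
#            iter = 4000, cores = 4, seed = 1337)
```

Running `get_prior(f, data = d)` first should list which class and `nlpar` combinations the model actually accepts.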

Maybe. I should confess I’ve only gone so far with the non-linear syntax. Tread with care.
