# Specifying formulas for interactions between categorical variables with the index coding approach in brms

Hi,
I want to use the index coding approach with brms, but I wonder whether I have applied and understood it correctly. Most examples show the case of one categorical factor with multiple levels, but in my case, I have two factors with several levels each, and I want to look at the interaction between them. Each participant also has multiple observations at each level of the factors. I have specified three models that I want to compare:

``````
modA <- brm(data = d,
            family = gaussian,
            performance ~ 0 + course + (0 + course | bib),
            prior = c(prior(normal(0, 0.5), class = b),
                      prior(student_t(3, 0, 2.5), class = sigma),
                      prior(student_t(3, 0, 2.5), class = sd),
                      prior(lkj(2), class = cor)),
            file = "modA_test",
            iter = 4000, cores = 4, seed = 1337)

modB <- brm(data = d,
            family = gaussian,
            performance ~ 0 + course + day + (0 + course + day | bib),
            prior = c(prior(normal(0, 0.5), class = b),
                      prior(student_t(3, 0, 2.5), class = sigma),
                      prior(student_t(3, 0, 2.5), class = sd),
                      prior(lkj(2), class = cor)),
            file = "modB_test",
            iter = 4000, cores = 4, seed = 1337)

modC <- brm(data = d,
            family = gaussian,
            performance ~ 0 + course + day + course:day + (0 + course + day + course:day | bib),
            prior = c(prior(normal(0, 0.5), class = b),
                      prior(student_t(3, 0, 2.5), class = sigma),
                      prior(student_t(3, 0, 2.5), class = sd),
                      prior(lkj(2), class = cor)),
            file = "modC_test",
            iter = 4000, cores = 4, seed = 1337)
``````

In modA, I want to examine the differences between the courses, ignoring the day factor.

In modB, I want to explore the estimated change/improvement from day 1 to day 5.

In modC, I want to understand if the changes were different in the three courses.

Given my goal, are the model formulas correctly specified? Or do I have to use the brms non-linear syntax? From this book, 8 Conditional Manatees | Statistical rethinking with brms, ggplot2, and the tidyverse: Second edition, it seems that the non-linear syntax is only necessary when you have an interaction between a categorical factor and a continuous variable. Please correct me if I am wrong.

For the interaction model, try this instead:

``````
performance ~ 0 + course:day + (0 + course:day | bib)
``````
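To see what that formula asks for, a small base-R sketch may help. The toy design below is hypothetical (three courses `a`, `b`, `c` and two days `1` and `5`; your factor levels will differ), and it only inspects the fixed-effects design matrix, not the brms model itself. It suggests that `0 + course:day` gives one indicator column per course-by-day cell, which is the index-coding layout:

```r
# Hypothetical toy design: 3 courses x 2 days, one row per cell
d_toy <- expand.grid(
  course = factor(c("a", "b", "c")),
  day    = factor(c("1", "5"))
)

# Fixed-effects design matrix for the interaction-only formula
X <- model.matrix(~ 0 + course:day, data = d_toy)

colnames(X)  # one column per course-by-day combination
rowSums(X)   # each row falls into exactly one cell
```

Because every row switches on exactly one column, each coefficient is the expected `performance` in that cell rather than an offset from a reference level.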

As to the second model, I'm not aware of a way to get `brm()` to do what you want it to do without the non-linear syntax. Here's what that could look like for your use case:

``````
bf(performance ~ 0 + c + d,
   c ~ 0 + course + (0 + course |i| bib),
   d ~ 0 + day + (0 + day |i| bib),
   nl = TRUE)
``````
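As a rough illustration of why the plain formula falls short, here is a base-R sketch on hypothetical toy levels (courses `a`, `b`, `c`; days `1` and `5`). With `0 + course + day`, only the first factor gets one column per level; the second is still coded as a contrast against its first level. Giving each factor its own `0 + ...` sub-formula, as the non-linear syntax does, yields an index for every level of both factors:

```r
# Hypothetical toy design: 3 courses x 2 days
d_toy <- expand.grid(
  course = factor(c("a", "b", "c")),
  day    = factor(c("1", "5"))
)

# Plain additive formula: day keeps a treatment contrast (no column for day 1)
X_add <- model.matrix(~ 0 + course + day, data = d_toy)
colnames(X_add)

# Separate sub-formulas, mirroring the c and d parameters above
X_c <- model.matrix(~ 0 + course, data = d_toy)
X_d <- model.matrix(~ 0 + day, data = d_toy)
colnames(cbind(X_c, X_d))

# Five columns but only rank 4: the two index sets overlap by a constant
qr(cbind(X_c, X_d))$rank
```

The rank check hints that the course and day indices are not separately identified by the data alone, so in the Bayesian model the priors do part of the work of pinning them down.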

Thanks @Solomon. So I don't need the main effects of course and day included in the interaction model? I was a bit surprised to learn that. The other models seem to work fine. Thanks.

To my mind, McElreath's index approach to interaction models avoids concepts like "main effects." Rather, his approach simply returns the mean for each group.
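A quick way to convince yourself of the "mean for each group" reading is an ordinary least-squares stand-in on simulated data. Everything below is hypothetical (toy levels, random `performance`, `lm()` instead of `brm()`, no participant-level effects); in the Bayesian model the estimates would also be pulled toward the priors, so treat this only as a sketch of the coding scheme:

```r
set.seed(1)

# Hypothetical toy data: 3 courses x 2 days, 20 observations per cell
d_toy <- expand.grid(
  course = factor(c("a", "b", "c")),
  day    = factor(c("1", "5")),
  rep    = 1:20
)
d_toy$performance <- rnorm(nrow(d_toy))

# Interaction-only (cell-means) formula, fitted by least squares
fit <- lm(performance ~ 0 + course:day, data = d_toy)

# Raw cell means: rows are courses, columns are days
cell_means <- tapply(d_toy$performance, list(d_toy$course, d_toy$day), mean)

# Match coefficients to cells by name, then compare with the raw means
coef_names <- outer(rownames(cell_means), colnames(cell_means),
                    function(r, k) paste0("course", r, ":day", k))
all.equal(unname(coef(fit)[as.vector(coef_names)]), as.vector(cell_means))

# Change from day 1 to day 5 within each course, computed from the cell means
cell_means[, "5"] - cell_means[, "1"]
```

The last line is the kind of contrast the question is after: with a fitted brms model, the same subtraction could be done draw by draw on the posterior for the corresponding cell coefficients.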

Thank you. It takes some time to consolidate his approach :)

I guess you could also extend the non-linear syntax to more complex interaction models, such as:

``````
bf(performance ~ 0 + a + b * mTime,
   a ~ 0 + course:day + (0 + course:day |i| bib),
   b ~ 0 + course:day + (0 + course:day |i| bib),
   nl = TRUE)
``````

`mTime` is a continuous variable in this case.

Maybe. I should confess I've only gone so far with the non-linear syntax. Tread with care.
