Building a multilevel GAM model with multiple factors in brms


I have a large local positioning (LPS) dataset from a motor learning experiment with elite ski racers learning to pump in slalom. In the experiment, we instructed the skiers to ski straight down the hill (COURSE SG), and we wanted to compare their performance in three slalom courses relative to the condition when they skied straight down the hill. The details are not important here, but an interested reader may read the paper: Frontiers | Is there a contextual interference effect for sub-elite alpine ski racers learning complex skills?.

For the current analysis, I want to explore the kinematic changes that occurred from the pre-test (baseline) to the post-test (retention). Therefore, the first analysis will be a time analysis of the whole section of the slalom course, which I thought I could analyze with GAM models in brms. I, therefore, took a course on GAMs last week. However, I am unsure how to write the models in brms, so I hope someone here can help me get started. I have included two figures with different DVs.

I want a different smooth for each factor(course) and each factor(day), but it should also be possible to allow factor(course) and factor(day) to interact. Does this make sense, and is this possible? So far, I have the following model:

modA <- brm(data = df,
            family = gaussian,
            performance ~ 1 + course + s(sectionIndex, by = course) + (1 | bib),
            prior = c(prior(normal(0, 10), class = Intercept),
                      prior(normal(0, 10), class = b),
                      prior(student_t(3, 0, 5.9), class = sds),
                      prior(exponential(1), class = sigma)),
            iter = 500, warmup = 300, chains = 2, cores = 2,
            seed = 4)

I have included a random intercept for each skier in the experiment. Each skier has three runs on baseline and retention.

Sorry that your post got flagged. And welcome to the Stan forums. We're not doing it manually; it's the automated system.

Did you get a message that your post was flagged, and if so, can you please share it? I’m trying to figure out how to change the message because we were just informed it sounded like we were filtering these out manually.



Thank you, Bob.

Yes, I think it was because I provided a link to a scientific paper that I have written and that describes the experimental protocol. It is completely redundant, so it could be removed.


I was struggling with a similar question and found this discussion of interacting categorical smooths (r - mgcv GAM: more than one variable in `by` argument (smooth varying by more than 1 factor) - Stack Overflow), which uses the interaction() function, which until now I did not know existed. It seemed to work fine for my model, and I haven't found any apparent issues so far; it did in fact generate four distinct smooths from two binary predictors with interaction.

So perhaps try something like performance ~ 1 + course*day + s(sectionIndex, by = interaction(course,day)) + (1|bib), as the specification for the population smooth. I suspect in this case that you will also want subject-specific smooths, if you have observations across time for each subject. The subject-level intercepts will not handle variation in the smooths by subject.
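For anyone unfamiliar with it, interaction() simply crosses its factor arguments into a single combined factor, which is what `by =` needs. A quick illustration with toy data (variable names chosen to match this thread, not taken from the experiment):

```r
# interaction() crosses two factors into one combined factor;
# mgcv/brms then fits a separate smooth per combined level.
course <- factor(c("A", "A", "B", "B"))
day    <- factor(c("pre", "post", "pre", "post"), levels = c("pre", "post"))

interaction(course, day)
# a factor with one level per course-by-day combination
nlevels(interaction(course, day))  # 4
```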


Thank you, @AWoodward. I didn't know about the interaction() function. I need to invest some time to understand it, but it seems to be what I am looking for. To get a subject-specific smooth in brms, you would suggest something like:

performance ~ 1 + course*day + s(sectionIndex, by = interaction(course, day)) + (1 + s(sectionIndex) | bib)

or:

performance ~ 1 + course*day + s(sectionIndex, by = interaction(course, day)) + (1 + s(sectionIndex) * course * day | bib)

I guess the formulas are not correct, but your answer pointed me in the right direction :)

Try this paper for various specifications of hierarchical smooth terms (Hierarchical generalized additive models in ecology: an introduction with mgcv [PeerJ]); it's focussed on mgcv, but the brms implementation is practically equivalent.
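For example, that paper's "GS" specification (a global smooth plus group-level smooth deviations with shared wiggliness) translates to something like the following in brms. This is only a sketch: the formula and the data frame `df` (with columns performance, sectionIndex, bib) are assumed here, not taken from your data.

```r
library(brms)

# Sketch of a "GS"-type hierarchical GAM (Pedersen et al.):
# a global smooth of sectionIndex plus per-skier smooth deviations
# with shared wiggliness via the factor-smooth ("fs") basis.
fit <- brm(
  performance ~ s(sectionIndex) +              # global smooth
    s(sectionIndex, bib, bs = "fs", m = 1),    # per-skier deviations
  data = df, family = gaussian(),
  chains = 4, cores = 4
)
```

Note that s() inside the model formula, rather than inside the (… | bib) random-effect term, is the usual way to write subject-specific smooths in brms.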


@AWoodward Thanks for your help. After some time thinking about my data, here is what I will try:

brm(bf(performance ~ day + s(sectionIndex, by = day, k = 50) + s(sectionIndex, bib, bs = "fs", m = 1) + (1 | bib)), data = df)

Notice I have ignored the course factor this time. The courses in my experiment are really different, and I am not comparing them. I am only exploring the change that occurred from baseline to retention in each of the courses, which is information that is included in the day factor. So I fit a separate model for each course.

I have tried to fit a GI model from the paper you suggested.

Does my formula look reasonable?

The participants (bib) came from different ski academies and performed the experiment in different groups. Thus, they also experienced different snow conditions. Therefore, I want to add another random intercept for the ski academies, like this: (1 | skiacademy).

Hard to say without seeing goodness-of-fit information etc. or understanding the variables well, but it definitely looks reasonable. I'd interpret this as a global smooth function of sectionIndex whose shape varies across day, plus a smooth function of sectionIndex varying by bib but with common wiggliness. I presume that you do not have the same subjects being observed on different days.

It should be feasible to add more grouping levels. So you could have sectionIndex varying by skiacademy using the bs = "fs" basis, and then another term for bib within skiacademy, also with bs = "fs". In principle this behaves analogously to multiple grouping levels for parameters in hierarchical linear models; there is partial pooling at each level, but for functions rather than parameters.
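Concretely, the two-level version of your model might look something like this. This is a sketch only, under the assumption that skiacademy is a factor column in your data frame; I haven't fit it:

```r
# Sketch: day-specific smooths plus factor-smooth deviations at two
# grouping levels (academy and skier), with random intercepts for both.
performance ~ day +
  s(sectionIndex, by = day, k = 50) +              # day-specific smooths
  s(sectionIndex, skiacademy, bs = "fs", m = 1) +  # academy-level deviations
  s(sectionIndex, bib, bs = "fs", m = 1) +         # skier-level deviations
  (1 | skiacademy) + (1 | bib)
```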

Thanks for a detailed and clear answer. That’s something that I want.

In fact, I have the same subjects observed on different days.