How to properly compare interacting levels

I am new to brms, trying to figure out its behaviour in details. I was trying to run essentially an anova-like model with random effects (anova-like in the sense that all IVs are factors). For example,

brm(DV ~
    A * C +
    B * C +
    (1|item) +
    (1|participant),
    ...

conditional_effects() and hypothesis() are truly amazing analytic tools. However, when you deal with factors and, in particular, their interactions, that’s a bugger… It’s really hard to help students wrap their head around it. In addition to that, I am a bit stubborn and I really despise when people, for practical purposes, re-level their factors to figure out what contrasts are significant. There is no need for that. In frequentist framework, one could make use of Wald’s test etc. Here, I sense, the two functions can be of much use, but still one need to be careful around the intercept. All other levels are relative change to that reference.

If one have a simple dummy variable, 0 + Intercept + Dummy + ... is of help (another nice one!). But when you have factors with more levels and their interactions, things are far from straightforward.

What would be the brms way of making this less painful? Running another model without intercept and then use hypothesis()? Something else?

  • Operating System: macOS
  • brms Version: 2.14.4

Thanks!

Hey @striatum, it’s difficult to follow your question. What exactly are you trying to make less painful? It might help if you provide a minimal reproducible example.

Apologies for the lack of clarity. Let me try again, this time with an example. (I am simplifying here, ignoring the possibility to have random effects, some covariates etc.)

# uniformly generated means
m = runif(27, 3, 12)

# random values, random means, same std. dev.
y = list()
for (i in 1:27) {
    y[[i]] = rnorm(30, m[i], 0.8)
}
y = unlist(y)

# factorial design
F1 = c(rep('A', 30), rep('B', 30), rep('C', 30))
F2 = c('I', 'J', 'K')
F3 = c('M', 'N', 'O')
design_matrix = expand.grid(F1=F1, F2=F2, F3=F3)

# dataset
dat = cbind(design_matrix, y)

require(brms)

summary(brm_dat_model1 <- brm(y ~
    (F1 + F2) * F3,
    data = dat,
    chains=4, iter=4000, cores=4))

This gives the summary table (it could be different in another run, given random number generators):

# Population-Level Effects: 
#           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
# Intercept     4.56      0.24     4.08     5.02 1.00     3436     5065
# F1B           0.84      0.26     0.33     1.34 1.00     4201     4812
# F1C           3.81      0.26     3.32     4.33 1.00     4398     5180
# F2J           0.71      0.26     0.19     1.23 1.00     3781     5689
# F2K           1.44      0.26     0.92     1.96 1.00     4085     5495
# F3N           4.44      0.33     3.80     5.09 1.00     3615     4724
# F3O           0.89      0.34     0.23     1.55 1.00     3707     5204
# F1B:F3N       0.54      0.37    -0.17     1.26 1.00     4262     4913
# F1C:F3N      -4.21      0.36    -4.93    -3.51 1.00     4586     5286
# F1B:F3O       0.98      0.37     0.27     1.68 1.00     4321     5194
# F1C:F3O      -1.32      0.37    -2.03    -0.59 1.00     4629     5702
# F2J:F3N      -2.87      0.37    -3.60    -2.15 1.00     4271     5376
# F2K:F3N      -4.81      0.37    -5.54    -4.09 1.00     4514     5679
# F2J:F3O      -1.24      0.37    -1.97    -0.51 1.00     4478     5708
# F2K:F3O       0.72      0.37    -0.00     1.43 1.00     4540     5332
# 
# Family Specific Parameters: 
#       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
# sigma     1.73      0.04     1.65     1.82 1.00     8879     5554

Now, what causes pains to grasp and work through are various comparisons when levels and combination in the table are relative to the reference: F1==A; F2==I; F3==M.

I’ve found it almost funny (and creative too) when some students approached to this by running comparable models with Intercept removed. Given the example above, something like:

brm_dat_model2 <- brm(y ~
    0 + 
    F1 : F3 + 
    F2 : F3,
    ...)

brm_dat_model3 <- brm(y ~
    0 + 
    F2 : F3 + 
    F3 : F3,
    ...)

And then they apply hypothesis(). For example:

hy = c(hy1='F1A:F3M < F1A:F3N',
    hy2='F1B:F3M == F1C:F3O')
hypothesis(brm_dat_model2, hy)

etc. etc.

This is rather clumsy and inelegant, if not altogether wrong. Yet, I see the struggle and I’m running out of ideas on how to help… In addition to this, it appears that what hypothesis() tells you “contradicts” what you observe in the plot from conditional_effects(). In fact, is there a way to get the BF directly from the conditional_effects() table?

This is clearer. Thank you.

A few things:

It is possible to have multiple factor variables where each level of each factor is expressed in its own terms. One way is with the non-linear syntax. I have an example, here, model b5.11. For more on the non-linear syntax, see Paul’s vignette.

Another way is with the multilevel ANOVA approach. I have examples here and possibly more to your interest here with model fit19.1 and here with fit24.1.

As to contrasts, I typically do these by computing the relevant difference scores by working directly with the output from posterior_samples() or possibly with fitted(). However, if you use the multilevel ANOVA approach, you can use the compare_levels() function from the tidybayes package (see here).

As far as the hypothesis() approach and Bayes factors, I’m personally not a fan of those methods. Someone else will have to offer guidance with those.

2 Likes

Thanks so much for this! I am trying to make the model work, but I am getting an error message:

Error: Factors with more than two levels are not allowed as covariates.
So, following my example above, this is my actual model:

summary.brm.model_zero <- summary(brm.brm.model_zero <- brm(bf(DV ~
    0 + (T + L + S) * C +
    (1|Item) +
    (1|Participant),
        T ~ 0 + type,
        L ~ 0 + GroupsL,
        S ~ 0 + GroupsS,
        C ~ 0 + condition,
        nl=TRUE),
    prior=c(prior(normal(0, 0.5), nlpar=T),
            prior(normal(0, 0.5), nlpar=L),
            prior(normal(0, 0.5), nlpar=S),
            prior(normal(0, 0.5), nlpar=C),
            prior(exponential(1), class=sigma)),
    family=gaussian,
    data = dat,
    chains=4, iter=4000, cores=4,
    control=list(adapt_delta=.95)))

So, it interaction the problem here? Or?

Thanks!

Consider a simplified version of your model from above. Here we’re modeling the criterion y with two factor variables, F1 and F2, and their interaction.

fit <-
  brm(bf(y ~ 0 + a + b + c,
         a ~ 0 + F1,
         b ~ 0 + F2,
         # this is the interaction
         c ~ (0 + F1) : (0 + F2),
         nl = TRUE),
    data = dat)

I concede there might be more elegant ways to express this. But this works.

Thanks so much! May I ask, please, to give me a template that would also contain some random effects?

I haven’t fit a model with that specification, before.

@Solomon , thanks so much! You’ve been super-patient and helpful. To show my gratitude, and give back a bit from what I’ve learned.

This is one of the more (if not the most) elaborate models we’ve been working on recently. And, now, it works under the above specification/approach:

brm_model_zero <- brm(bf(
    Y ~ 0 + Ord + f1 + f2 + f3 + f4 + i1 + i2 + i3 + re1 + re2 + re3,
        Ord ~ 0 + TrialOrder,                 # order of the exp. trial,
                                              # conveniently z-transformed (scaled)
        f1 ~ 0 + Factor1,
        f2 ~ 0 + Factor2,
        f3 ~ 0 + Factor3,
        f4 ~ 0 + Factor4,
        m ~ (0 + Factor1) : (0 + Factor4),
        n ~ (0 + Factor2) : (0 + Factor4),
        o ~ (0 + Factor3) : (0 + Factor4),
        re1 ~ 1 + (1|Item),
        re2 ~ 1 + (1|Participant),
        re3 ~ 1 + (0+TrialOrder|Participant), # by-participant slope adjustments for trials;
                                              # correlation between intercept and slope removed
        nl=TRUE),
    prior=c(prior(normal(0, 0.5), nlpar=f1),
            prior(normal(0, 0.5), nlpar=f2),
            prior(normal(0, 0.5), nlpar=f3),
            prior(normal(0, 0.5), nlpar=f4),
            prior(normal(0, 0.5), nlpar=m),
            prior(normal(0, 0.5), nlpar=n),
            prior(normal(0, 0.5), nlpar=o),
            prior(normal(0, 1.0), nlpar=Ord),
            prior(normal(0, 1.0), nlpar=re1),
            prior(normal(0, 1.0), nlpar=re2),
            prior(normal(0, 1.0), nlpar=re3),
            prior(exponential(1), class=sigma)),
    family=gaussian,
    data = dat,
    chains=4, iter=4000, cores=4,
    control=list(adapt_delta=.95)))

Apart from four factors (fixed effects), there is one covariate that is the order of trial in the experiment. Since it is scaled (z-transformed) then Gaussian prior with s=1.0 is appropriate.

Thanks again!

2 Likes