How to properly compare interacting levels

I am new to brms and trying to figure out its behaviour in detail. I have been trying to run what is essentially an ANOVA-like model with random effects (ANOVA-like in the sense that all IVs are factors). For example:

brm(DV ~
    A * C +
    B * C +
    (1|item) +
    (1|participant),
    ...)

conditional_effects() and hypothesis() are truly amazing analytic tools. However, when you deal with factors and, in particular, their interactions, things get confusing: it’s really hard to help students wrap their heads around it. In addition, I am a bit stubborn, and I really dislike it when people re-level their factors just to figure out which contrasts are significant; there should be no need for that. In a frequentist framework one could make use of a Wald test etc. Here, I sense, the two functions can be of much use, but one still needs to be careful about the intercept: all other levels are expressed as changes relative to that reference level.

If one has a simple dummy variable, 0 + Intercept + Dummy + ... is of help (another nice trick!). But when you have factors with more levels, plus their interactions, things are far from straightforward.
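
For example, a sketch of that dummy-variable case (the `0 + Intercept` syntax makes brms treat the intercept as an ordinary coefficient, so priors and hypotheses apply to it directly rather than to the internally centered intercept):

```r
# `Intercept` here is a reserved name in brms formulas, not a variable
# in the data; the prior then targets the actual (uncentered) intercept
fit <- brm(DV ~ 0 + Intercept + Dummy + (1|participant),
           prior = prior(normal(0, 1), class = b, coef = Intercept),
           data = dat, ...)
```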

What would be the brms way of making this less painful? Running another model without intercept and then use hypothesis()? Something else?

  • Operating System: macOS
  • brms Version: 2.14.4

Thanks!

Hey @striatum, it’s difficult to follow your question. What exactly are you trying to make less painful? It might help if you provide a minimal reproducible example.

Apologies for the lack of clarity. Let me try again, this time with an example. (I am simplifying here, ignoring random effects, covariates, etc.)

# uniformly generated means
m = runif(27, 3, 12)

# random values, random means, same std. dev.
y = list()
for (i in 1:27) {
    y[[i]] = rnorm(30, m[i], 0.8)
}
y = unlist(y)

# factorial design: F1 is already expanded to 90 values (30 per level),
# so crossing it with F2 and F3 gives 27 cells x 30 observations
F1 = c(rep('A', 30), rep('B', 30), rep('C', 30))
F2 = c('I', 'J', 'K')
F3 = c('M', 'N', 'O')
design_matrix = expand.grid(F1=F1, F2=F2, F3=F3)

# dataset
dat = cbind(design_matrix, y)
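
A quick sanity check of the design (each of the 27 cells should hold 30 observations):

```r
# expand.grid() crosses the already-expanded F1 vector (90 values)
# with the 3 levels each of F2 and F3, giving 810 rows: 30 per cell
xtabs(~ F1 + F2 + F3, dat)
nrow(dat)  # 810
```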

require(brms)

summary(brm_dat_model1 <- brm(y ~
    (F1 + F2) * F3,
    data = dat,
    chains=4, iter=4000, cores=4))

This gives the following summary table (your numbers will differ across runs, since the data are randomly generated):

# Population-Level Effects: 
#           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
# Intercept     4.56      0.24     4.08     5.02 1.00     3436     5065
# F1B           0.84      0.26     0.33     1.34 1.00     4201     4812
# F1C           3.81      0.26     3.32     4.33 1.00     4398     5180
# F2J           0.71      0.26     0.19     1.23 1.00     3781     5689
# F2K           1.44      0.26     0.92     1.96 1.00     4085     5495
# F3N           4.44      0.33     3.80     5.09 1.00     3615     4724
# F3O           0.89      0.34     0.23     1.55 1.00     3707     5204
# F1B:F3N       0.54      0.37    -0.17     1.26 1.00     4262     4913
# F1C:F3N      -4.21      0.36    -4.93    -3.51 1.00     4586     5286
# F1B:F3O       0.98      0.37     0.27     1.68 1.00     4321     5194
# F1C:F3O      -1.32      0.37    -2.03    -0.59 1.00     4629     5702
# F2J:F3N      -2.87      0.37    -3.60    -2.15 1.00     4271     5376
# F2K:F3N      -4.81      0.37    -5.54    -4.09 1.00     4514     5679
# F2J:F3O      -1.24      0.37    -1.97    -0.51 1.00     4478     5708
# F2K:F3O       0.72      0.37    -0.00     1.43 1.00     4540     5332
# 
# Family Specific Parameters: 
#       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
# sigma     1.73      0.04     1.65     1.82 1.00     8879     5554

Now, what is painful to grasp and work through are the various comparisons, because every level and combination in the table is expressed relative to the reference cell: F1==A, F2==I, F3==M.
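
For example, reconstructing even a single cell mean, say (F1==B, F2==I, F3==N), already requires summing four coefficients by hand:

```r
# posterior means of the population-level effects
fe <- fixef(brm_dat_model1)[, "Estimate"]

# E[y | F1=B, F2=I, F3=N]: reference cell plus the relevant shifts
fe["Intercept"] + fe["F1B"] + fe["F3N"] + fe["F1B:F3N"]
```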

I’ve found it almost funny (and creative, too) when some students approached this by running comparable models with the Intercept removed. Given the example above, something like:

brm_dat_model2 <- brm(y ~
    0 + 
    F1 : F3 + 
    F2 : F3,
    ...)

brm_dat_model3 <- brm(y ~
    0 + 
    F2 : F3 + 
    F3 : F3,
    ...)

And then they apply hypothesis(). For example:

hy = c(hy1='F1A:F3M < F1A:F3N',
    hy2='F1B:F3M = F1C:F3O')
hypothesis(brm_dat_model2, hy)

etc. etc.

This is rather clumsy and inelegant, if not altogether wrong. Yet I see the struggle, and I’m running out of ideas on how to help… On top of this, it appears that what hypothesis() tells you “contradicts” what you observe in the plot from conditional_effects(). In fact, is there a way to get the BF directly from the conditional_effects() table?

This is clearer. Thank you.

A few things:

It is possible to have multiple factor variables where each level of each factor is expressed in its own terms. One way is with the non-linear syntax. I have an example here, model b5.11. For more on the non-linear syntax, see Paul’s vignette.

Another way is with the multilevel ANOVA approach. I have examples here and possibly more to your interest here with model fit19.1 and here with fit24.1.

As to contrasts, I typically compute the relevant difference scores by working directly with the output from posterior_samples(), or possibly with fitted(). However, if you use the multilevel ANOVA approach, you can use the compare_levels() function from the tidybayes package (see here).
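
To sketch the difference-score idea with the model from your example above (coefficient names carry a b_ prefix in the draws, and names containing : need backticks):

```r
# posterior draws of all parameters
post <- posterior_samples(brm_dat_model1)

# contrast cell (F1=B, F2=I, F3=N) vs cell (F1=C, F2=I, F3=N):
# the shared Intercept and F3N terms cancel in the difference
d <- (post$b_F1B + post$`b_F1B:F3N`) -
     (post$b_F1C + post$`b_F1C:F3N`)
quantile(d, probs = c(.025, .5, .975))
```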

As far as the hypothesis() approach and Bayes factors, I’m personally not a fan of those methods. Someone else will have to offer guidance with those.


Thanks so much for this! I am trying to make the model work, but I am getting an error message:

Error: Factors with more than two levels are not allowed as covariates.

Following my example above, this is my actual model:

summary.brm.model_zero <- summary(brm.brm.model_zero <- brm(bf(DV ~
    0 + (T + L + S) * C +
    (1|Item) +
    (1|Participant),
        T ~ 0 + type,
        L ~ 0 + GroupsL,
        S ~ 0 + GroupsS,
        C ~ 0 + condition,
        nl=TRUE),
    prior=c(prior(normal(0, 0.5), nlpar=T),
            prior(normal(0, 0.5), nlpar=L),
            prior(normal(0, 0.5), nlpar=S),
            prior(normal(0, 0.5), nlpar=C),
            prior(exponential(1), class=sigma)),
    family=gaussian,
    data = dat,
    chains=4, iter=4000, cores=4,
    control=list(adapt_delta=.95)))

So, is the interaction the problem here? Or something else?

Thanks!

Consider a simplified version of your model from above. Here we’re modeling the criterion y with two factor variables, F1 and F2, and their interaction.

fit <-
  brm(bf(y ~ 0 + a + b + c,
         a ~ 0 + F1,
         b ~ 0 + F2,
         # this is the interaction
         c ~ (0 + F1) : (0 + F2),
         nl = TRUE),
    data = dat)

I concede there might be more elegant ways to express this. But this works.
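
If it helps, here is one way to get contrasts out of a fit like this without touching the coefficient names at all: pass the cells you care about to fitted() and take differences of the draws.

```r
# the two cells we want to contrast
nd <- data.frame(F1 = c("B", "C"), F2 = c("I", "I"))

# posterior draws of the expected value for each cell (draws x cells)
f <- fitted(fit, newdata = nd, summary = FALSE)

# posterior of the cell-mean difference
quantile(f[, 1] - f[, 2], probs = c(.025, .5, .975))
```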

Thanks so much! May I ask for a template that also contains some random effects?

I haven’t fit a model with that specification before.

@Solomon, thanks so much! You’ve been super patient and helpful. To show my gratitude, let me give back a bit of what I’ve learned.

This is one of the more (if not the most) elaborate models we’ve been working on recently, and it now works under the above specification/approach:

brm_model_zero <- brm(bf(
    Y ~ 0 + Ord + f1 + f2 + f3 + f4 + m + n + o + re1 + re2 + re3,
        Ord ~ 0 + TrialOrder,                 # order of the exp. trial,
                                              # conveniently z-transformed (scaled)
        f1 ~ 0 + Factor1,
        f2 ~ 0 + Factor2,
        f3 ~ 0 + Factor3,
        f4 ~ 0 + Factor4,
        m ~ (0 + Factor1) : (0 + Factor4),
        n ~ (0 + Factor2) : (0 + Factor4),
        o ~ (0 + Factor3) : (0 + Factor4),
        re1 ~ 1 + (1|Item),
        re2 ~ 1 + (1|Participant),
        re3 ~ 1 + (0+TrialOrder|Participant), # by-participant slope adjustments for trials;
                                              # correlation between intercept and slope removed
        nl=TRUE),
    prior=c(prior(normal(0, 0.5), nlpar=f1),
            prior(normal(0, 0.5), nlpar=f2),
            prior(normal(0, 0.5), nlpar=f3),
            prior(normal(0, 0.5), nlpar=f4),
            prior(normal(0, 0.5), nlpar=m),
            prior(normal(0, 0.5), nlpar=n),
            prior(normal(0, 0.5), nlpar=o),
            prior(normal(0, 1.0), nlpar=Ord),
            prior(normal(0, 1.0), nlpar=re1),
            prior(normal(0, 1.0), nlpar=re2),
            prior(normal(0, 1.0), nlpar=re3),
            prior(exponential(1), class=sigma)),
    family=gaussian,
    data = dat,
    chains=4, iter=4000, cores=4,
    control=list(adapt_delta=.95)))

Apart from the four factors (fixed effects), there is one covariate: the order of the trial in the experiment. Since it is scaled (z-transformed), a Gaussian prior with SD = 1.0 is appropriate.

Thanks again!
