I get a much higher BF with brms than expected on the basis of lmer

Hi,

I’m trying to understand brms to see whether I can use it for my research and teach it to my students. For this, I developed a toy dataset, in which 10 participants name 5 low-frequency words and 5 high-frequency words (see also the file with the long notation below).

Participant Low1 Low2 Low3 Low4 Low5 High1 High2 High3 High4 High5
p1 655 847 588 687 603 652 639 706 633 593
p2 724 954 653 624 613 649 642 505 659 725
p3 589 763 614 688 589 639 589 638 596 631
p4 647 712 769 594 622 714 566 684 652 545
p5 842 741 698 711 657 598 639 652 681 684
p6 630 863 647 659 688 655 685 701 706 576
p7 711 712 589 624 637 689 625 720 599 703
p8 652 914 723 599 725 675 750 692 618 690
p9 483 752 642 602 568 497 504 615 587 605
p10 756 811 699 705 718 637 549 587 675 636

The dataset is such that the F1 analysis (ANOVA across participants) is highly significant, but the F2 analysis (ANOVA across stimuli) is not at all, because almost all the difference is due to one stimulus (Low2).

In lmer I find that the frequency is not significant (as it shouldn’t). In BayesFactor I also find that the models with and without frequency have the same likelihood.

However, when I do the analysis with brms, I’m told that the model with Frequency is 200 times more likely than the model without. Obviously I am doing something wrong, but I can’t find what.

These are the commands I used:

library(lmerTest)

mix_model <- lmer(Score ~ Frequency + (Frequency|Participant) + (1|Stimulus), data=data)
anova(mix_model)
summary(mix_model)

mix_model2 <- lmer(Score ~ Frequency + (1|Participant) + (1|Stimulus), data=data)
anova(mix_model2)
summary(mix_model2)

anova(mix_model,mix_model2)

library(‘BayesFactor’)
dat <- data
dat$Frequency=as.factor(dat$Frequency)
dat$Stimulus=as.factor(dat$Stimulus)
dat$Participant=as.factor(dat$Participant)
bf <- anovaBF(Score ~ Frequency + Stimulus + Participant + Frequency:Participant, whichRandom = c(“Stimulus”,“Participant”), data = dat)
*bf

library(brms)
mix_modelnull <- brm(Score ~ 1 + (1|Participant) + (1|Stimulus), data=data, save_all_pars = TRUE)
mix_modelintercept <- brm(Score ~ Frequency + (1|Participant) + (1|Stimulus), data=data, save_all_pars = TRUE)
mix_modelfull <- brm(Score ~ Frequency + (Frequency|Participant) + (1|Stimulus), data=data, save_all_pars = TRUE)
mix_modelnull
mix_modelintercept
mix_modelfull
bayes_factor(mix_modelfull,mix_modelintercept)
bayes_factor(mix_modelfull,mix_modelnull)
bayes_factor(mix_modelintercept,mix_modelnull)

Would you be able to give me feedback about where I am going astray?

Dataset word naming long notation.csv (1.9 KB)

As you know, Bayes factors are very sensitive to the choice of prior distributions and the default priors of BayesFactor and of brms surely vary by a lot. Even more problematic, the default prior in brms for regression coefficients (excluding the intercept) is an improper flat prior over the real line. This is not a prior usable (in any reasonable manner) in combination with Bayes factors (they are just not the main playground of brms).

You may try to specifiy a proper prior on the regresson coefficients for instance via prior = prior(normal(0, 5), class = "b").

In any case, I would expect different results just because priors will vary heavily between implementations.

You may also try out other ways to compare models for instance via loo().