Potential Bug in brms bayes_factor() function

My collaborator and I have been having issues with the bayes_factor() function. Specifically, we’ve been getting inconsistent and outlandish (either 0 or Inf) Bayes factors (BFs) when we use datasets that include more than a couple of hundred people.

To troubleshoot, we checked the models themselves by fitting them with stats (glm), rstanarm (stan_glm), and brms (brm). The fitted results were consistent across these functions, so the bayes_factor() function seems to be the source of the problem.

Has anyone had a similar issue? I'm attaching reproducible code and the datasets in case that is helpful.

Smaller df (seems to be working properly):

library(rstanarm)
library(bridgesampling)
# Null model: intercept only
stanNull_E1 <- stan_glm(respNum ~ 1, family = "binomial",
  data = test_df1[test_df1$seen_before == "No", ],
  seed = 1839, diagnostic_file = file.path(tempdir(), "df.csv"))

# Gender model, restricted to Male/Female respondents
stanGender_E1 <- stan_glm(respNum ~ gender, family = "binomial",
  data = test_df1[test_df1$seen_before == "No" & test_df1$gender %in% c("Male", "Female"), ],
  seed = 1839, diagnostic_file = file.path(tempdir(), "df.csv"))

bayes_gender_E1 <- bayes_factor(bridge_sampler(stanNull_E1), bridge_sampler(stanGender_E1)) ## BF(01) = 3.56

Larger df (seems to be working incorrectly):

# Null model: intercept only, larger dataset (test_df2)
stanNull_E4 <- stan_glm(respNum ~ 1, family = "binomial",
  data = test_df2[test_df2$seen_before == 3, ],
  seed = 1839, diagnostic_file = file.path(tempdir(), "df.csv"))

# Gender model; rows with GENDER1 == 98 are excluded
stanGender_E4 <- stan_glm(respNum ~ GENDER1, family = "binomial",
  data = test_df2[test_df2$seen_before == 3 & test_df2$GENDER1 != 98, ],
  seed = 1839, diagnostic_file = file.path(tempdir(), "df.csv"))

bayes_gender_E4 <- bayes_factor(bridge_sampler(stanNull_E4), bridge_sampler(stanGender_E4))

Thanks in advance for your help.

test_df1.csv (23.5 KB) test_df2.csv (100.1 KB)

Bayes factors require sensible priors to work reasonably well. The default priors, which you seem to have used, are too wide for this purpose in most cases, so you need to specify reasonable and more informative priors (which is non-trivial in most cases).
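To illustrate the point about prior width, here is a minimal base-R sketch. It is not the model from the question: it assumes made-up data (55 successes in 100 trials), a point null with log-odds fixed at 0, and an alternative with a normal(0, prior_sd) prior on the log-odds. The names y, n, eta, and bf01_for_sd are hypothetical and exist only for this example.

```r
# Hypothetical data: 55 successes in 100 Bernoulli trials.
y <- 55
n <- 100

# Grid over the log-odds eta; the likelihood is negligible outside [-10, 10].
eta <- seq(-10, 10, length.out = 20001)
step <- eta[2] - eta[1]
lik <- dbinom(y, n, plogis(eta))

# Marginal likelihood under H0: eta fixed at 0 (success probability 0.5).
ml_null <- dbinom(y, n, 0.5)

# BF01 when H1 places a normal(0, prior_sd) prior on eta.
# The marginal likelihood under H1 is approximated by a Riemann sum on the grid.
bf01_for_sd <- function(prior_sd) {
  ml_alt <- sum(lik * dnorm(eta, 0, prior_sd)) * step
  ml_null / ml_alt
}

prior_sds <- c(0.5, 1, 10, 100)
bf01 <- sapply(prior_sds, bf01_for_sd)
print(data.frame(prior_sd = prior_sds, BF01 = bf01))
```

In this toy setup BF01 keeps growing as prior_sd increases even though the data never change, which is the mechanism by which very wide priors can push Bayes factors toward extreme values.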

Hi Paul,

Thanks so much for your reply. If possible, can you provide a little more information about the diagnostic tool you used to discern that the default priors are too wide in this case? Also, is this true for both test_df1 (N = 600) and test_df2 (N = 3,000)? I only ask because the BFs provided for test_df1 seemed fairly reasonable. Thanks in advance.

brms specifies default priors that might be useful(ish) for estimation, but for Bayes factors you need more informative and specifically chosen priors. For example, brms puts improper flat priors on regression coefficients. Under an improper prior the marginal likelihood is not well defined, so Bayes factors computed with such priors are completely inappropriate. So basically, never use the default priors in brms when trying to estimate Bayes factors.
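As a rough sketch of what explicit priors could look like in brms, the code below fits a null and a gender model with proper priors on every parameter and then compares them. The data are simulated stand-ins (sim_df is a hypothetical name, not one of the attached files), and the prior scales normal(0, 1.5) and normal(0, 1) are illustrative assumptions rather than recommendations; they would need to be justified for the actual study.

```r
library(brms)

# Hypothetical data standing in for the attached CSV files.
set.seed(1839)
sim_df <- data.frame(
  gender = factor(sample(c("Female", "Male"), 300, replace = TRUE))
)
sim_df$respNum <- rbinom(300, 1, 0.55)

# Null model: proper prior on the intercept (scale is illustrative only).
fit_null <- brm(
  respNum ~ 1, data = sim_df, family = bernoulli(),
  prior = set_prior("normal(0, 1.5)", class = "Intercept"),
  save_pars = save_pars(all = TRUE),  # required for bridge sampling
  iter = 6000, seed = 1839, refresh = 0
)

# Gender model: proper priors on the intercept and on the coefficient.
fit_gender <- brm(
  respNum ~ gender, data = sim_df, family = bernoulli(),
  prior = c(
    set_prior("normal(0, 1.5)", class = "Intercept"),
    set_prior("normal(0, 1)", class = "b")
  ),
  save_pars = save_pars(all = TRUE),
  iter = 6000, seed = 1839, refresh = 0
)

# Bayes factor of the null model over the gender model.
bf_gender <- bayes_factor(fit_null, fit_gender)
print(bf_gender)
```

Rerunning the comparison with a few different, defensible prior scales is a simple way to see how much the resulting Bayes factor depends on that choice.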