Strange Results from BridgeSampler

Regarding your second point, how many posterior samples did you draw for each model? To get stable Bayes factor estimates, you need many more posterior samples than you would typically need for parameter estimation. See, for example, this thread about calculating Bayes factors for brms models, where increasing the sampler settings from something like iter = 2000, warmup = 1000, chains = 4 (4,000 post-warmup draws) to iter = 10000, warmup = 1000, chains = 4 (36,000 post-warmup draws) apparently yielded more stable results.
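In case it helps, here is a rough sketch of what that could look like in brms. The data set (mtcars), the formulas, and the normal(0, 5) prior are placeholders I picked for illustration, not your models, so treat this as a starting point rather than a recipe. The idea is to refit with the larger iter setting, keep all parameters, and then compute the Bayes factor a few times to see whether the value settles down.

```r
library(brms)

# Placeholder data and formulas (mtcars ships with R); substitute your own models.
# Bayes factors are sensitive to priors, so the slope gets an explicit proper prior
# (normal(0, 5) is an arbitrary choice for this sketch).
fit_full <- brm(
  mpg ~ wt, data = mtcars,
  prior = prior(normal(0, 5), class = "b"),
  iter = 10000, warmup = 1000, chains = 4,  # 36,000 post-warmup draws in total
  save_pars = save_pars(all = TRUE)         # keep all parameters for bridge sampling
)
fit_null <- brm(
  mpg ~ 1, data = mtcars,
  iter = 10000, warmup = 1000, chains = 4,
  save_pars = save_pars(all = TRUE)
)

# Bridge sampling is stochastic, so repeat the comparison a few times.
# If the values differ noticeably between runs, draw more posterior samples.
bf_runs <- replicate(3, bayes_factor(fit_full, fit_null, silent = TRUE)$bf)
print(bf_runs)
```

If the repeated Bayes factors still vary a lot between runs, that is a sign the number of posterior samples is too small for your models.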
