Bayes factor computation in brms `hypothesis` vs manual calculation

I want to compute Bayes factors for a series of models (please note: I am aware ot the caveat and limitations of BF - this is mainly a minor addition to a paper to appease a reviewer) and I have noticed some incongruencies between the BF provided by the functions hypothesis in brms and a manual calculation from the posterior samples. I am not sure what is going on: is my calculation incorrect, or I am using hypothesis wrongly?

Here is a toy example that reproduce the discrepancy:

library(brms)

# simulate some data
set.seed(1)
d <- data.frame(x = runif(100, min=0, max=1))
d$y <- rnorm(100,0, sd=0.1)

priors <- prior("normal(0, 1)", class = "Intercept") +
  prior("normal(0, 0.3)", class = "b") +
  prior("normal(0, 0.5)", class = "sigma")

# fit model
m <- brm(
  y ~ x, 
  data = d, 
  prior = priors, 
  family = gaussian(link = "identity"),
  sample_prior = 'yes')

Here is my function to compute BF using Savage-Dickey density ratio for a point-null hypothesis (note this assume Gaussian prior for the parameter)

# my own function to compute BF (note this gives evidence for point null)
savage.dickey.bf <- function (x, x_0 = 0, prior.mean = 0, prior.sd = 1, plot = F, breaks=30) {
  require(polspline)
  fit.posterior <- logspline(x)
  posterior_w <- dlogspline(x_0, fit.posterior)
  if (plot) {
    R <- (fit.posterior$range[2] - fit.posterior$range[1])/3
    hist(x, xlab = "parameter value", ylab = "density", breaks=breaks,
         col="grey", border="white", freq=FALSE,
         xlim = c(fit.posterior$range[1] - R, fit.posterior$range[2] + 
                    R), main="")
    plot(fit.posterior, xlab = "parameter value", ylab = "density", 
         lwd = 2, xlim = c(fit.posterior$range[1] - R, fit.posterior$range[2] + 
                             R), add=T)
    
    x <- seq(fit.posterior$range[1] - R, fit.posterior$range[2] + 
               R, length.out = 500)
    lines(x, dnorm(x, mean = prior.mean, sd = prior.sd), 
          col = "red", lwd = 2)
    abline(v = x_0, lty = 2)
    points(x_0, posterior_w, pch = 19, col = "black")
    points(x_0, dnorm(x_0, prior.mean, prior.sd), pch = 19, 
           col = "red")
    legend("topright", c("posterior", "prior"), lwd = 2, 
           col = c("black", "red"), pch = 19, bty = "n", inset = 0.02)
  }
  cat(paste0("Approximate BF (Savage-Dickey) in favor of null x=", 
             x_0, " : ", round(posterior_w/dnorm(x_0, prior.mean, 
                                                 prior.sd), digits = 2), "\n"))
  invisible(posterior_w/dnorm(x_0, prior.mean, prior.sd))
}

My understanding is that this should be equivalent to what hypothesis does for a two-sided test. From the hypothesis help page:

For an two-sided (point) hypothesis, the evidence ratio is a Bayes factor between the hypothesis and its alternative computed via the Savage-Dickey density ratio method. That is the posterior density at the point of interest divided by the prior density at that point. Values greater than one indicate that evidence in favor of the point hypothesis has increased after seeing the data.

However the results are quite different

brms

> test_brms <- hypothesis(m, "x = 0")
> print(test_brms)
Hypothesis Tests for class b:
  Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob Star
1    (x) = 0     0.03      0.04    -0.04      0.1       5.82      0.85     
---
'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
'*': For one-sided hypotheses, the posterior probability exceeds 95%;
for two-sided hypotheses, the value tested against lies outside the 95%-CI.
Posterior probabilities of point hypotheses assume equal prior probabilities.
> 1/test_brms$hypothesis$Estimate
[1] 32.54378

my calculation

> post_samples <- posterior_samples(m)
> savage.dickey.bf(post_samples$b_x,  x_0 = 0, prior.mean = 0, prior.sd=0.3, plot=T)
Approximate BF (Savage-Dickey) in favor of null x=0 : 5.7

Note that my function also provide a plot of the posterior, which (I think) shows that the number I get is correct one (5.7) whereas hypothesis gives a value that seems quite off (32.54).

image

  • Operating System: Ubuntu 20.04
  • brms Version: 2.20.4

It looks like brms gives the Evidence Ratio as 5.82… This seems pretty close to your estimate?

Right… I was reading the output wrong… thanks!

What is the ‘Estimate’ value outputted by hypothesis then?

1 Like

I don’t ever use Bayes Factors, so I could be missing something, but I think it is the estimated difference between your parameter estimate and your hypothesis. Instead of saying (x) = 0 in the output under the Hypothesis column, I think it could have been written as x - 0 = 0. For example, if you had an x1 and x2 parameters in your model and you specified hypothesis(m, "x1 = x2"), then the Estimate would be the difference in the two parameters, x1 - x2. See Matti’s homepage - Bayes Factors with brms for example.