Bayesian difference in means test with BEST and brms packages

The R package “BEST” calculates a simple Bayesian difference in means test. I’m trying to replicate the BEST package results using the brms package because the brms package has more flexibility and I want to learn how it works. My questions are: (1) what do the family link functions do in the brms package? Do they preprocess the data somehow?, (2) In light of the brms family link functions, how do I specify the parameters of the brms prior distributions? Do I specify the parameters as if the data were preprocessed or do I assume no preprocessing?, (3) The built-in priors for the BEST package for “sigma” and “nu” follow the Gamma distribution, which is what I’m trying to replicate. How do I correctly express those priors in the brms code? Thank you. Below is my R code.


y1 <- c(5.77, 5.33, 4.59, 4.33, 3.66, 4.48)

y2 <- c(3.88, 3.55, 3.29, 2.59, 2.33, 3.59)

best_priors <- list(muM = c(6, 4), muSD = 2)

BESTout <- BESTmcmc(y1, y2, priors=best_priors, parallel=FALSE, rnd.seed = 123)



y <- data.frame(y = c(y1, y2), group = c(rep("y1", 6), rep("y2", 6)))

brmout <- brm(
    formula = brmsformula(y ~ 0 + group, sigma ~ 0 + group),
    data    = y,
    family  = student(link = "identity", link_sigma = "log", link_nu = "logm1"),
    prior   = c(
        set_prior("normal(6, 2)",                     class = "b", coef = "groupy1"),
        set_prior("normal(4, 2)",                     class = "b", coef = "groupy2"),
        set_prior("gamma(sd(y1), sd(y1)*5)", class = "b", coef = "groupy1", dpar = "sigma"),
        set_prior("gamma(sd(y2), sd(y2)*5)", class = "b", coef = "groupy2", dpar = "sigma"),
        set_prior("gamma(30, 30)",                class = "nu")
    seed    =   123


brmout <-


(mu1 <- mean(brmout$b_groupy1))

(mu2 <- mean(brmout$b_groupy2))

(muDiff <- mean(brmout$b_groupy1 - brmout$b_groupy2))

(sigma1 <- mean(exp(brmout$b_sigma_groupy1)))

(sigma2 <- mean(exp(brmout$b_sigma_groupy2)))

(sigmaDiff <- sigma1 - sigma2)

(nu <- mean(brmout$nu))

The family determines what sort of distribution the data will be modeled as (e.g., student_t, gaussian, bernoulli etc.).

The link functions can be used to indicate exactly how the model should run under the hood and also inform how you should set your priors. In the thing you have above, the link “identity” means that the MCMC for the mean will just be working with the actual numbers of the data (so if in the raw data you expect a mean of 10, you can set your priors like normal(10, 2.5).

But let’s imagine that you expect a standard deviation around the mean in the raw data to be somewhere around 3 points on the raw scale. You might think that then you should set your prior on sigma to be 3. However, the link = “log” means that the MCMC chain is actually going to be making estimations based on the log of the sample standard deviation, so when thinking about how to set your prior you should instead think about what your plausible standard deviations are and then log transform them. You could set that link function as “identity” and then you would be able to set priors on the raw scale of the data again, but there is probably a reason why it is log by default e.g., maybe it is computationally more efficient.

It doesn’t have anything to do with how you preprocess the data though. brms is just going to take whatever data you give it and try to estimate the parameters how to you tell it to, and it will drop NAs, but that’s it. You need to do any preprocessing steps yourself

Thank you. So if my prior is distributed as Gamma with a location of 20 and a scale of 100, and my link function is “log”, I would code the prior as gamma(log(20), log(1/100)), or if my prior is distributed as Exponential with a location of 30, and my link function is “logm1”, I would code that prior as exponential(1/29)?

Well, probably not exactly like that because the distribution gamma(log(20), log(1/100)) is probably not the same as the distribution of log(gamma(20, 1/100)). Just as an example I show this with a normal distribution below. Basically if you wanted to log version of the prior to match the identity version of the prior you would want to try and approximate the best way of describing log(gamma(20, 1/100)) not gamma(log(20), log(1/100)) - I’m not sure but maybe you can literally set a prior of log(gamma(20, 1/100)) [note that I’m just copy-pasting your gamma dists here, I don’t know exactly how to specify gamma myself]

test_example <- 
  tibble(identity = rnorm(5000, 10, 2.5), # normal dist prior with mean of 10 and sd of 2.5
         log_direct = rnorm(5000, log(10), log(2.5)), # normal dist with mean of log(10) and sd of log(2.5)
         log_identity = log(identity), # log transformed original normal dist
         exp_log_direct = exp(log_direct)) %>%  
  pivot_longer(cols = everything(),
               names_to = "dist")

ggplot(test_example) +
  scale_x_continuous(limits = c(0, 25)) +
  geom_density(aes(x = value, fill = dist), alpha = .33)

You can see that if you exponentiate the distribution where we set it like log(mean) and log(sd), it is very different from our target prior distribution on the identity scale