Default student_t priors in brms

Hello @paul.buerkner, I ran into a question about the default priors. Here are my questions:

  • How is the scale \sigma of the Student-t prior calculated in brms? In the paper, you said the degrees of freedom \nu is 3 by default, and the second argument is presumably the median of the data. But I didn’t find an explanation of how \sigma is calculated.
  • What default priors do the population-level effects b use in brms? If I use get_prior, the prior of b is empty, and I didn’t find the prior information by using make_stancode either.

Thank you in advance for your time and help.

sigma is connected to the SD via SD = sqrt(nu / (nu - 2)) * sigma (for nu > 2), so it is almost the SD for large nu.

An empty prior means an improper flat prior in Stan, and hence in brms as well.
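For reference, the standard result for a location-scale Student-t is Var = sigma^2 * nu / (nu - 2) for nu > 2, so the SD is the scale multiplied by sqrt(nu / (nu - 2)). A minimal base-R sketch of that relation follows; the helper name student_t_sd is hypothetical and not part of brms or Stan.

```r
# Hypothetical helper (not a brms function): SD of a Student-t distribution
# with nu degrees of freedom and scale sigma. The variance is finite only
# for nu > 2.
student_t_sd <- function(nu, sigma) {
  stopifnot(nu > 2, sigma > 0)
  sqrt(nu / (nu - 2)) * sigma
}

student_t_sd(3, 2.5)    # about 4.33: with nu = 3 the SD is sqrt(3) times the scale
student_t_sd(100, 2.5)  # about 2.53: close to the scale itself for large nu
```

With nu = 3 the SD is noticeably larger than the scale, which is why a scale cannot be read off directly as an SD.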


Hi Paul, thank you for your reply.
I still have a problem: I checked my data against the prior, and the formula doesn’t seem to hold. It is a simple linear model y\sim N(1+x,\tau^2). The prior for the intercept that I got from brms is student_t(3, 84.7, 31.3), where \nu=3, \mu=84.7, and the scale \sigma=31.3. However, the SD of y is 25.6, hence \sigma should be 8.54 according to sd=\frac{\nu}{\nu-2}\sigma.
Did I misunderstand the formula? And may I ask why you chose this formula to calculate \sigma? Thank you so much for your time.

Hi Paul, I still couldn’t get the formula to reproduce the default prior.

library(brms)
x <- rnorm(100)
y <- 1 + x
dat <- data.frame(x = x, y = y)
fm <- brmsformula(y ~ 1 + x)
get_prior(fm, data = dat)  # list the default priors

Then I get the following priors:

                 prior     class coef group resp dpar nlpar bound
                               b    x
student_t(3, 1.1, 2.5) Intercept
  student_t(3, 0, 2.5)     sigma

Here sd(y) = 0.91 and median(y) = 1.06, so by the formula the scale should be smaller than the SD, yet brms reports a scale of 2.5.
I couldn’t figure it out. Thank you for your help.

While I think it is worthwhile to be clear on how brms computes its default priors, I am not sure it is worthwhile to spend too much time on them, unless automatic prior choice is your area of research. brms just makes some guesses to mildly regularize the posterior while making it unlikely that the prior biases the inferences. It is always better to provide your own priors motivated by domain knowledge and/or prior predictive checks, so if you have the slightest doubts about the default priors, you should probably override them.

Does that make sense? Or is your inquiry primarily investigation of those “guessed” priors themselves?
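As a rough illustration of overriding the defaults and running a prior predictive check, here is a minimal sketch. The specific priors and the toy data are placeholders chosen for illustration only, not recommendations, and the sketch assumes brms and a working Stan toolchain are installed.

```r
library(brms)

# Placeholder priors for illustration; choose values from domain knowledge.
my_priors <- c(
  prior(normal(0, 5), class = "b"),
  prior(student_t(3, 0, 10), class = "Intercept"),
  prior(exponential(1), class = "sigma")
)

# Toy data for the model y ~ 1 + x (assumed here, not from the thread).
dat <- data.frame(x = rnorm(100))
dat$y <- 1 + dat$x + rnorm(100, sd = 0.5)

# Sample from the prior only, ignoring the likelihood, then compare
# responses simulated under the prior with the observed data.
fit_prior <- brm(y ~ 1 + x, data = dat, prior = my_priors,
                 sample_prior = "only")
pp_check(fit_prior)
```

If the simulated responses cover wildly implausible ranges, the priors are probably too wide for the problem at hand.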

Also, I moved the discussion to a new topic as it is really separate from the original question.

Hello @martinmodrak, thanks for moving my question to a separate thread. I ran into this while wondering how brms chooses its default priors. I agree with you that it is always a good choice for users to select their own priors and run a prior predictive check. However, how brms chooses the scale parameter \sigma is still unclear to me. Thank you.

The prior is computed in:

(For the intercept this is called with center = FALSE, and the most relevant part is location_y <- round(median(y_link), 1).)

Does that answer your question?


Thank you, @martinmodrak. I found the answer in the link you provided.

scale_y <- round(mad(y_link), 1)
scale <- max(scale, scale_y)

The default scale parameter \sigma in brms is 2.5. If mad(y), the median absolute deviation of y rounded to one decimal place, is greater than 2.5, brms uses that value instead. Otherwise, it keeps the default of 2.5.

The MAD is computed with R’s mad() function using its default argument constant = 1.4826.
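The rule described above can be sketched in base R as follows. This is only an approximation of the logic quoted in this thread, not the brms source: the function name default_intercept_prior is hypothetical, and it assumes an identity link so that y_link is simply y.

```r
# Hypothetical re-implementation of the rule quoted in this thread;
# assumes an identity link, so the response is used as-is.
default_intercept_prior <- function(y, nu = 3, min_scale = 2.5) {
  location <- round(median(y), 1)            # location: rounded median of y
  scale <- max(min_scale, round(mad(y), 1))  # scale: MAD, but never below 2.5
  sprintf("student_t(%s, %s, %s)", nu, location, scale)
}

default_intercept_prior(c(0.5, 1, 1.5))  # "student_t(3, 1, 2.5)": MAD 0.7 < 2.5
default_intercept_prior(1:100)           # "student_t(3, 50.5, 37.1)": MAD 37.1 > 2.5
```

Because the scale comes from the MAD with a floor of 2.5, it has no fixed relationship to sd(y), which explains the mismatches reported earlier in the thread.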

Here are two examples:

x <- 1:100
y <- rnorm(100, sd = 1)
dat <- data.frame(x = x, y = y)
fm <- brmsformula(y ~ 1 + x)
get_prior(fm, data = dat)

The intercept prior is student_t(3, 0.1, 2.5), with \sigma = 2.5 because mad(y) = 0.8897214 < 2.5.

x <- 1:100
y <- x + rnorm(100, sd = 1)
dat <- data.frame(x = x, y = y)
fm <- brmsformula(y ~ 1 + x)
get_prior(fm, data = dat)

The intercept prior is student_t(3, 50.6, 38.8), with \sigma = 38.8 because mad(y) = 38.7882 > 2.5.

Thank you so much! It is clear to me now.