Hello @paul.buerkner, I have run into a question about the default priors, and here are my questions:
How is the scale \sigma of the t-distribution calculated in brms? In the paper, you said the degrees of freedom \nu is 3 by default, and the second argument, presumably, is the median of the data. But I didn’t find an explanation of how \sigma is calculated.
What default priors do population-level effects b use in brms? If I use get_prior, the prior of b is empty, and I didn’t find the prior information by using make_stancode either.
Hi Paul, thank you for your reply.
I still have a problem. I checked my data against the prior, and the formula doesn’t seem to hold. It is a simple linear model y\sim N(1+x,\tau^2). The prior for the intercept that I got from brms is student_t(3, 84.7, 31.3), where \nu=3, \mu=84.7, and the scale \sigma=31.3. However, the sd of y is 25.6, hence \sigma should be 8.54 according to sd=\frac{\nu}{\nu-2}\sigma.
Did I misunderstand the formula? And may I ask why you chose this formula to calculate \sigma? Thank you so much for your time.
While I think it is worthwhile to be clear on how brms computes its default priors, I am not sure it is worthwhile to spend too much time on them, unless automatic prior choice is your area of research. brms just makes some guesses to mildly regularize the posterior while making it unlikely that the prior biases the inferences. It is always better to provide your own priors motivated by domain knowledge and/or prior predictive checks, so if you have the slightest doubts about the default priors, you should probably override them.
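For illustration, overriding the defaults could look roughly like the sketch below. The toy data and the prior values are made-up placeholders, not recommendations; `set_prior()` and `validate_prior()` are brms functions, and everything else is named only for this example.

```r
library(brms)

# Toy data, made up for illustration only
set.seed(123)
dat <- data.frame(x = 1:100)
dat$y <- dat$x + rnorm(100, sd = 1)

# Placeholder priors; the numbers are arbitrary assumptions
my_priors <- c(
  set_prior("normal(0, 5)", class = "b"),
  set_prior("student_t(3, 50, 30)", class = "Intercept")
)

# Show how these priors map onto the model's parameters without fitting it
validate_prior(my_priors, formula = y ~ 1 + x, data = dat)
```

Passing the same object to `brm()` via `prior = my_priors`, together with `sample_prior = "only"`, would draw from the prior alone, which is one way to run a prior predictive check.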
Does that make sense? Or is your inquiry primarily investigation of those “guessed” priors themselves?
Also, I moved the discussion to a new topic as it is really separate from the original question.
Hello @martinmodrak, thanks for moving my question to a separate thread. I encountered this problem and was wondering how the default prior is chosen by brms. I agree with you that it is always a good choice for users to select their own priors and run a prior predictive check. However, the mechanism by which brms chooses the scale parameter \sigma is still unclear to me. Thank you.
The default scale parameter \sigma that brms uses is 2.5, unless mad(y), the median absolute deviation of y, is greater than 2.5, in which case brms uses mad(y) instead.
The mad is computed with R's mad() function using its default argument constant = 1.4826.
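The rule described above can be sketched in base R as follows. This is only my reading of the behaviour, not the actual brms source; `default_scale` and `min_scale` are hypothetical names introduced for this sketch.

```r
# Hypothetical helper (not a brms function): the scale rule as described above
default_scale <- function(y, min_scale = 2.5) {
  # stats::mad() uses constant = 1.4826 by default
  max(min_scale, mad(y))
}

set.seed(123)
x <- 1:100
y_flat <- rnorm(100, sd = 1)
default_scale(y_flat)    # mad(y) is below 2.5, so the result is 2.5

set.seed(123)
y_trend <- x + rnorm(100, sd = 1)
default_scale(y_trend)   # mad(y) is above 2.5, so the result is mad(y)
```

The value shown by get_prior may additionally be rounded for display, so it need not match mad(y) to every digit.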
Here are two examples:
library(brms)
set.seed(123)
x <- 1:100
y <- rnorm(100, sd = 1)
dat <- data.frame(x = x, y = y)
fm <- brmsformula(y ~ 1 + x)
get_prior(fm, data = dat)
The prior is student_t(3, 0.1, 2.5) where \sigma=2.5 because mad(y) is 0.8897214<2.5.
set.seed(123)
x <- 1:100
y <- x + rnorm(100, sd = 1)
dat <- data.frame(x = x, y = y)
fm <- brmsformula(y ~ 1 + x)
get_prior(fm, data = dat)
The prior is student_t(3, 50.6, 38.8) and \sigma=38.8 because mad(y) is 38.7882>2.5.
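As a side note on the formula quoted earlier in the thread: for a Student-t distribution with scale \sigma and \nu>2 degrees of freedom, it is the variance, not the standard deviation, that carries the factor \frac{\nu}{\nu-2}. Worked out for \nu=3:

$$
\mathrm{Var}(\theta)=\frac{\nu}{\nu-2}\,\sigma^2
\quad\Longrightarrow\quad
\mathrm{sd}(\theta)=\sigma\sqrt{\frac{\nu}{\nu-2}}=\sqrt{3}\,\sigma\approx 1.73\,\sigma .
$$

Under this relation, sd=25.6 would correspond to \sigma\approx 14.8 rather than 8.54. In any case, as far as I can tell the default scale comes from mad(y) and not from sd(y), so the prior's scale should not be expected to reproduce the sd of y.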