I have noticed that the parameters of the default priors change with the data. I am curious what the logic is.
library(brms)
set.seed(42)
data <- rskew_normal(1e4, mu = 0, sigma = 1, alpha = -5)
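# Pass the outcome as a data frame column so the formula can find it by name
make_stancode(data ~ 1, data.frame(data = data), family = skew_normal())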
# // priors including all constants
# target += student_t_lpdf(Intercept | 3, 0.2, 2.5);
# target += student_t_lpdf(sigma | 3, 0, 2.5)
# - 1 * student_t_lccdf(0 | 3, 0, 2.5);
# target += normal_lpdf(alpha | 0, 4);
# Same model, with the outcome rescaled by a factor of 100
make_stancode(data ~ 1, data.frame(data = 100 * data), family = skew_normal())
# // priors including all constants
# target += student_t_lpdf(Intercept | 3, 17.1, 97);
# target += student_t_lpdf(sigma | 3, 0, 97)
# - 1 * student_t_lccdf(0 | 3, 0, 97);
# target += normal_lpdf(alpha | 0, 4);
torkar
August 10, 2020, 8:10am
Hi,
I think this might answer your question:
Operating System: Ubuntu 18.04
brms Version: 2.10.0
The default prior that brms generates for the intercept seems to use the mean of the outcome variable for the mean of the prior distribution. I always thought that informing the prior by the data is “double-dipping”, i.e., using the data twice, and leads to overconfident inference.
Here some R code to replicate the priors:
library(brms)
get_intercept_prior <- function(stancode) {
code_by_line <- strsplit(stancode, "\n")[[1]]
code_by_li…
Thanks for the reference! So the location parameter is taken to be the median of the outcome. It does not exactly match the 0.2 in the generated code, since the median of my data is 0.17, but I suppose it is rounded to one decimal place. Then the only mystery left is the scale parameter, which is 2.5 for the raw data and 97 for the rescaled data in my example.
torkar
August 10, 2020, 6:56pm
The scale comes from somewhat ad hoc choices made in the def_scale_prior function (see https://github.com/paul-buerkner/brms/blob/master/R/priors.R ).
That solves it. The logic is roughly as follows:
location <- round(median(data), 1)
scale <- max(round(mad(data), 1), 2.5)
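For anyone who wants to check this without running brms, here is a minimal base R sketch of the rule above. The helper name `default_prior_params` and its arguments are my own and hypothetical; this mirrors the rule of thumb from this thread and is not the actual brms source, which may differ between versions.

```r
# Hypothetical helper: reproduces the rule of thumb stated in this thread.
# It is NOT copied from brms; see def_scale_prior in R/priors.R for the real logic.
default_prior_params <- function(y, min_scale = 2.5, digits = 1) {
  # Location: median of the outcome, rounded to one decimal place
  location <- round(median(y), digits)
  # Scale: rounded MAD of the outcome, but never smaller than min_scale
  scale <- max(round(mad(y), digits), min_scale)
  c(location = location, scale = scale)
}

# Small made-up outcome vector (not the simulated data from the first post)
y <- c(-2, -1, 0.17, 1, 2)

# Raw data: the MAD is below 2.5, so the scale falls back to the floor
default_prior_params(y)

# Rescaled data: the MAD now exceeds 2.5, so the scale tracks the data
default_prior_params(100 * y)
```

With the raw vector the floor of 2.5 wins, and after multiplying by 100 the scale follows the MAD, which is the same pattern as the 2.5 versus 97 in my example.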
Thanks, everyone!