- Operating System: Ubuntu 18.04
- brms Version: 2.10.0
The default prior that brms generates for the intercept seems to use the mean of the outcome variable for the mean of the prior distribution. I always thought that informing the prior by the data is “double-dipping”, i.e., using the data twice, and leads to overconfident inference.
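To make the concern concrete, here is a minimal sketch in base R. It is a toy conjugate-normal model with known `sigma`, not what brms actually fits. All names and settings (`coverage_data_centred_prior`, `n_obs`, `prior_sd`) are my own assumptions for illustration. It centres a normal prior on `mean(y)` and checks how often the resulting 95% posterior interval contains the true mean.

```r
# Toy conjugate-normal check (illustration only; not how brms fits models).
set.seed(123)
n_sims  <- 10000
n_obs   <- 5
true_mu <- 0
sigma   <- 1  # assumed known

# Coverage of the 95% posterior interval when the prior is centred on mean(y)
coverage_data_centred_prior <- function(prior_sd) {
  covered <- replicate(n_sims, {
    y <- rnorm(n_obs, true_mu, sigma)
    # posterior precision = data precision + prior precision
    post_prec <- n_obs / sigma^2 + 1 / prior_sd^2
    # prior mean equals mean(y), so the posterior mean is also mean(y)
    post_mean <- mean(y)
    post_sd   <- sqrt(1 / post_prec)
    abs(post_mean - true_mu) < qnorm(0.975) * post_sd
  })
  mean(covered)
}

coverage_tight <- coverage_data_centred_prior(prior_sd = 0.5)
coverage_wide  <- coverage_data_centred_prior(prior_sd = 10)
```

In this toy setup the tight data-centred prior should undercover, while the wide one should stay close to the nominal 95%. That suggests the practical impact depends heavily on how wide the prior is relative to the data.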
Here is some R code to replicate the priors:
library(brms)
get_intercept_prior <- function(stancode) {
  code_by_line <- strsplit(stancode, "\n")[[1]]
  code_by_line[grepl("student_t_lpdf\\(Intercept ", code_by_line)]
}
# a model
formula <- y ~ 1 + (1 | task) + (1 || school / student)
# some data for brms::make_stancode
nTasks <- 3
nSchools <- 10
nStudents <- 200
idxTasks <- sample(nTasks, nStudents, replace = TRUE)
idxSchools <- sample(nSchools, nStudents, replace = TRUE)
idxStudent <- seq_len(nStudents)
df <- data.frame(
  task = idxTasks,
  school = idxSchools,
  student = idxStudent
)
# generate stan code for multiple values of y
df$y <- 0
stancode <- make_stancode(formula, df) # mean of 0
get_intercept_prior(stancode)
# target += student_t_lpdf(Intercept | 3, 0, 10);
df$y <- 5
stancode <- make_stancode(formula, df) # mean of 5
get_intercept_prior(stancode)
# target += student_t_lpdf(Intercept | 3, 5, 10);
df$y <- 10
stancode <- make_stancode(formula, df) # mean of 10
get_intercept_prior(stancode)
# target += student_t_lpdf(Intercept | 3, 10, 10);
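For comparison, here is a minimal sketch of how the data-dependent default could be inspected and replaced with a fixed prior, using `get_prior()`, `set_prior()` and the `prior` argument of `make_stancode()`. The data frame `df_fixed`, the other `*_fixed` names and the choice of `student_t(3, 0, 10)` are placeholders I made up for this example, not a recommendation.

```r
library(brms)

# Small stand-in data set (hypothetical; any data with these columns works)
set.seed(1)
n <- 50
df_fixed <- data.frame(
  y       = 10,
  task    = sample(3, n, replace = TRUE),
  school  = sample(10, n, replace = TRUE),
  student = seq_len(n)
)
formula_fixed <- y ~ 1 + (1 | task) + (1 || school / student)

# Inspect the default priors brms would choose for this data
default_priors <- get_prior(formula_fixed, data = df_fixed)
default_priors[default_priors$class == "Intercept", ]

# Replace the default intercept prior with one that does not depend on y
fixed_prior <- set_prior("student_t(3, 0, 10)", class = "Intercept")
stancode_fixed <- make_stancode(formula_fixed, data = df_fixed, prior = fixed_prior)

# Pull out the intercept prior line from the generated Stan code
lines_fixed <- strsplit(stancode_fixed, "\n")[[1]]
lines_fixed[grepl("student_t_lpdf\\(Intercept ", lines_fixed)]
```

With an explicit prior, the location in the generated Stan code should no longer follow the value of `y`.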
Could someone explain why this is a good idea? Does it improve the sampling in Stan? Can anybody recommend a paper that advocates for or explains this choice of prior?