brms already does something similar at least for the intercept and sigma:
brms::default_prior(speed ~ dist, cars)
#> prior class coef group resp dpar nlpar lb ub source
#> (flat) b default
#> (flat) b dist (vectorized)
#> student_t(3, 15, 5.9) Intercept default
#> student_t(3, 0, 5.9) sigma 0 default
Created on 2024-04-04 with reprex v2.1.0
You can take a look at the .default_prior
function code to see how it does it. The final function to which all is passed eventually is def_scale_prior.brmsterms
So in principle it should be possible to replicate that behavior, but the caveat is that you wouldn’t just be able to pass a def_priors()
function to the prior
argument. You will need to pass the formula and the data to def_priors(formula, data)
…
Actually, by exploiting some magic with environments, you could make it work. Here’s a functioning example:
suppressPackageStartupMessages(library(brms))
def_prior <- function() {
# get the environment of the brm parent call and relevant variables
env <- sys.frame(1)
bterms <- env$bterms
data <- env$data
# get the response variable
mframe <- model.frame(bterms, data)
y <- unname(model.response(mframe))
preds <- model.matrix(bterms$formula, mframe) # remove intercept
# intercept prior
sd_y <- sd(y)
mean <- mean(y)
prior <- paste0("normal(", round(mean, 2), ", ", round(2.5 * sd_y, 2), ")")
prior <- set_prior(prior, class = "Intercept")
# effects prior (first column is intercept so skip it)
for (i in 2:ncol(preds)) {
sd <- sd(preds[,i])
prior_pred <- paste0("normal(", 0, ", ", round(2.5 * sd_y/sd, 2), ")")
prior <- prior + set_prior(prior_pred, class = 'b', coef = colnames(preds)[i])
}
# sigma prior
prior_sigma <- paste0("exponential(", round(1/sd_y, 2), ")")
prior <- prior + set_prior(prior_sigma, class = "sigma")
prior
}
a <- brm(speed ~ dist, cars,
prior = def_prior(),
empty = TRUE)
prior_summary(a)
prior class coef group resp dpar nlpar lb ub source
(flat) b default
normal(0, 0.51) b dist user
normal(15.4, 13.22) Intercept user
exponential(0.19) sigma 0 user
I was quite surprised this works, but it was fun to figure out! Of course, you will need to be careful about figuring out when centering is used, the presence of random effects, whether sigma is predicted (it changes the link function to log), but you could use this as a starting point.
Not sure how robust or reliable this approach would be. Perhaps @paul.buerkner could chime in