Hello!
I am trying to run a fairly simple multiple regression model for educational purposes. Before fitting the model, I defined priors and wanted to do a prior predictive check by running the model with the option sample_prior = "only".
bf.1 <- bf(bernote ~ wital + witsch + witbo + mtp)
get_prior(bf.1,
          family = gaussian(),
          data = dat_z)
prior.1 <- c(brms::prior(normal(0, 0.5), class = b, coef = witsch),
             brms::prior(normal(0, 0.5), class = b, coef = wital),
             brms::prior(normal(0, 0.5), class = b, coef = witbo),
             brms::prior(normal(0, 0.5), class = b, coef = mtp),
             brms::prior(normal(3.5, 2.5), class = Intercept),
             brms::prior(normal(0, 2.5), class = sigma, lb = 0))
validate_prior(prior.1, bf.1,
               family = gaussian(),
               data = dat_z)
As can be seen, I wanted a normal(3.5, 2.5) prior for the Intercept. The validate_prior() command confirmed that it was set up correctly. Therefore I ran the prior predictive model:
mod.1_ppc <- brm(data = dat_z,
                 family = gaussian(),
                 formula = bf.1,
                 prior = prior.1,
                 sample_prior = "only")
But when I looked at the pp_check plot and the model summary, the sampled prior for the Intercept was far wider than specified: the Intercept had a mean of 5.19 and an SD of 104.58.
> print(mod.1_ppc)
 Family: gaussian
  Links: mu = identity; sigma = identity
Formula: bernote ~ wital + witsch + witbo + mtp
   Data: dat_z (Number of observations: 31)
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000
Population-Level Effects:
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept     5.19    104.58  -202.51   217.19 1.00     4061     2700
wital        -0.01      0.50    -0.98     0.96 1.00     4368     2857
witsch        0.00      0.52    -1.00     1.03 1.00     4149     2566
witbo        -0.02      0.50    -0.99     0.97 1.00     4326     2897
mtp           0.01      0.50    -0.98     0.99 1.00     4242     2784
Family Specific Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     1.99      1.52     0.07     5.55 1.00     2488     1573
Strangely enough, if I fit the actual model and sample the priors with sample_prior = TRUE, the prior draws look fine:
mod.1 <- brm(data = dat_z,
             family = gaussian(),
             formula = bf.1,
             prior = prior.1,
             sample_prior = TRUE,
             warmup = 2000, iter = 5000)
prior_draws(mod.1) |> describe()
          vars     n  mean   sd median trimmed  mad   min   max range  skew kurtosis   se
Intercept    1 12000  3.48 2.49   3.49    3.49 2.49 -5.09 13.19 18.27  0.00    -0.03 0.02
b_wital      2 12000 -0.01 0.50   0.00   -0.01 0.50 -1.69  1.84  3.54  0.01    -0.07 0.00
b_witsch     3 12000  0.01 0.50   0.01    0.01 0.50 -1.83  1.84  3.67 -0.04    -0.02 0.00
b_witbo      4 12000  0.00 0.50   0.00    0.00 0.50 -2.12  2.09  4.21 -0.01    -0.02 0.00
b_mtp        5 12000  0.00 0.50  -0.01   -0.01 0.51 -1.86  2.00  3.86  0.02    -0.02 0.00
sigma        6 12000  1.99 1.50   1.70    1.83 1.49  0.00 10.46 10.46  0.97     0.76 0.01
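One possible lead that I have not verified: as far as I understand, brms centers the predictors by default and applies the class = Intercept prior to the intercept of that centered model, while the summary reports the intercept transformed back to the original predictor scale. If a predictor in dat_z has a mean far from zero, that back-transformation would mix the slope priors into the reported Intercept. The sketch below only simulates this arithmetic in base R and does not call brms. The values in x_means are made up (the 200 is purely an assumption for illustration), and intercept_centered and b_intercept are names I introduced.

```r
# Sketch only: simulates the suspected arithmetic, does not call brms.
set.seed(1)
n_draws <- 4000

# HYPOTHETICAL predictor means; replace with colMeans() of the real predictors.
x_means <- c(wital = 0, witsch = 0, witbo = 0, mtp = 200)

# Draws from the priors as specified in prior.1
intercept_centered <- rnorm(n_draws, mean = 3.5, sd = 2.5)
b <- matrix(rnorm(n_draws * length(x_means), mean = 0, sd = 0.5),
            nrow = n_draws,
            dimnames = list(NULL, names(x_means)))

# Intercept on the original predictor scale:
# centered intercept minus sum over predictors of (mean_k * b_k)
b_intercept <- intercept_centered - as.vector(b %*% x_means)

sd(intercept_centered)  # close to the specified prior SD of 2.5
sd(b_intercept)         # much larger when any predictor mean is far from 0
```

If this is indeed the cause, the wide Intercept in the sample_prior = "only" summary would follow from uncentered predictors rather than from a wrongly specified prior, and prior_draws() would look fine because it reports the centered parameter. Checking colMeans() of the predictors in dat_z should show whether this applies.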
Can someone explain why this happens and how to fix it? Did I specify something incorrectly?
Many thanks for your time and help!
Rainer