I am trying to understand how to set priors correctly in a shifted log-normal mixed model.
Let’s consider the following model with one fixed predictor, Species
as random intercept for the outcome as well as for the auxiliary parameters (+ parameters-correlations estimation):
library(ggplot2)
library(brms)
formula <- brms::brmsformula(
Sepal.Length ~ Sepal.Width + (1|G|Species),
ndt ~ 1 + (1|G|Species),
sigma ~ 1 + (1|G|Species),
family = shifted_lognormal()
)
m <- brms::brm(formula, data = iris, refresh = 0, iter = 2000, seed = 33)
pp_check(m, nsamples = 500)
Here are the parameters of that model:
# Print Parameters
p <- as.data.frame(parameters::parameters(m, effects = "all", dispersion = TRUE))
p[c("Parameter", "Median", "MAD", "Rhat")]
Parameter Median MAD Rhat
1 b_Intercept 1.3e+00 1.8e-01 1.0
2 b_sigma_Intercept -2.6e+00 2.9e-01 1.0
3 b_ndt_Intercept -2.8e+06 2.5e+06 1.3
4 b_Sepal.Width 1.4e-01 1.5e-02 1.0
23 sd_Species__Intercept 3.0e-01 1.9e-01 1.0
24 sd_Species__ndt_Intercept 2.0e+00 1.8e+00 1.0
25 sd_Species__sigma_Intercept 5.1e-01 3.5e-01 1.0
26 cor_Species__Intercept__ndt_Intercept -7.0e-03 5.7e-01 1.0
27 cor_Species__Intercept__sigma_Intercept 4.0e-01 5.4e-01 1.0
28 cor_Species__ndt_Intercept__sigma_Intercept 4.7e-03 6.0e-01 1.0
(Note that one of them has an Rhat > 1.1 and a very high dispersion, but the model seems to make sensical predictions nonetheless).
Problem: I want to understand how the priors can impact the model, and investigate more informative priors, which is tricky given the relatively abstract nature of that model.
However, I cannot sample from the model’s priors above (to see how the model performs with only priors), as it has flat priors. So I have to define informative priors manually.
So my approach would be to start with a model which priors are the same as the “working” posteriors found above. From there, I’d investigate how modulating them impact the model’s predictions. My assumption was that sampling from a model’s priors which are very informative priors will behave similarly as sampling from a model with the same priors - as posteriors.
So I went on to set priors following the results found above:
# To see default priors: get_prior(formula, data = iris)
priors <- brms::validate_prior(c(
# Intercept -----------------
set_prior('normal(1.3, 0.18)', class = 'Intercept'), # b_Intercept
set_prior('normal(-2.6, 0.29)', class = 'Intercept', dpar = 'sigma'), # b_sigma_Intercept
set_prior('normal(0, 250000)', class = 'Intercept', dpar = 'ndt'), # b_ndt_Intercept
# SDs -----------------
set_prior('normal(0.3, 0.19)', class = 'sd', group = c("", "Species")), # sd_Species__Intercept
set_prior('normal(2, 1.8)', class = 'sd', dpar = 'ndt', group = c("", "Species")), # sd_Species__ndt_Intercept
set_prior('normal(0.51, 0.35)', class = 'sd', dpar = 'sigma', group = c("", "Species")), # sd_Species__sigma_Intercept
# Fixed -----------------
set_prior('normal(0.14, 0.015)', class = 'b', coef = "Sepal.Width") # b_Sepal.Width
), formula, data = iris)
priors
prior class coef group resp dpar nlpar bound source
(flat) b default
normal(0.14, 0.015) b Sepal.Width user
normal(1.3, 0.18) Intercept user
normal(0, 250000) Intercept ndt user
normal(-2.6, 0.29) Intercept sigma user
lkj_corr_cholesky(1) L default
lkj_corr_cholesky(1) L Species (vectorized)
normal(0.3, 0.19) sd user
normal(2, 1.8) sd ndt user
normal(0.51, 0.35) sd sigma user
normal(0.3, 0.19) sd Species user
normal(0.3, 0.19) sd Intercept Species (vectorized)
normal(2, 1.8) sd Species ndt user
normal(2, 1.8) sd Intercept Species ndt (vectorized)
normal(0.51, 0.35) sd Species sigma user
normal(0.51, 0.35) sd Intercept Species sigma (vectorized)
The problem is that when sampling from the priors of such model, it gives either Inf or very very high values… I tried tinkering with the parameters but I cannot seem to set them properly to make predictions within a sensical range…
# Model fitting
m_priors <- brms::brm(formula, data = iris,
prior = priors, sample_prior = "only",
refresh = 0, seed = 33)
pp_check(m_priors, nsamples = 500)
Estimate Est.Error Q2.5 Q97.5
[1,] Inf NaN 1.9 Inf
[2,] Inf NaN 1.8 Inf
[3,] Inf NaN 1.8 Inf
[4,] Inf NaN 1.8 Inf
[5,] Inf NaN 2.0 Inf
[6,] Inf NaN 2.1 Inf
[7,] Inf NaN 1.9 Inf
[8,] Inf NaN 1.9 Inf
[9,] Inf NaN 1.8 Inf
[10,] Inf NaN 1.8 Inf
Any help or suggestions on how to set informative priors for such model is more than welcome!