Example of setting weakly informative priors in a hurdle brms model

As far as my limited understanding goes, it would be appropriate to define priors for a model whose results I am going to publish (the proper Bayesian way). But how do I encode my prior knowledge?

I have a hurdle model that would need weakly informative priors for all predictors at once, if possible. N = 5000 patients.

fit = brm(
  bf(received_treatment_hours ~ p1 + p2 + p3_fct + p4 + p5_fct + p6 + p7 + (1 | region),
     hu ~ p1 + p2 + p3_fct + p4 + p5_fct + p6 + p7 + (1 | region)),
  data = df, family = hurdle_lognormal(),
  cores = 3, chains = 3, prior = prior
)

Histogram of outcome variable - zero inflated and there are also some extreme values

My prior knowledge about received treatment hours, for the lognormal part of the model:

Differences of more than 30 hours in received treatment hours are unlikely between different predictor levels.

Differences of more than 30 hours in received treatment hours are unlikely between the regions.

Thus, my prior for the lognormal part should be:

prior = c(prior(student_t(3, 0, 15), class = b), # allows extreme values and 2xSD = 2x15 = 30 hours
          prior(student_t(3, 0, 15), class = sd, group = region)) # prior for the hierarchical part of the model, allows extreme values and 2xSD = 2x15 = 30 hours

But how do I add priors for the hurdle part of the model?

I know that the proportion of zero values varies quite a bit between the different levels of the predictors: from 5% up to 95%.

I know that the proportion of zero values varies quite a bit between the regions: from 10% up to 80%.

Finally, does my model have other parts that would need priors?

I am also adding the prior summary. Can anyone give some advice on how to define these priors?

prior_summary(fit)
prior                 class      coef       group   resp  dpar  nlpar  bound
                      b
                      b          p1
                      b          p2
                      b          p3female
                      b          p4
                      b          p5a
                      b          p5b
                      b          p5c
                      b          p6c
                      b          p6d
                      b          p7
                      b                                   hu
                      b          p1                       hu
                      b          p2                       hu
                      b          p3female                 hu
                      b          p4                       hu
                      b          p5a                      hu
                      b          p5b                      hu
                      b          p5c                      hu
                      b          p6c                      hu
                      b          p6d                      hu
                      b          p7                       hu
student_t(3, -2, 10)  Intercept
logistic(0, 1)        Intercept                           hu
student_t(3, 0, 10)   sd
student_t(3, 0, 10)   sd                                  hu
                      sd                    region
                      sd         Intercept  region
                      sd                    region        hu
                      sd         Intercept  region        hu
student_t(3, 0, 10)   sigma

Hey there! Sorry it took us some time to reply. I’m not a brms user myself, but let me see if I can help.

Note, however, that the sd of a Student-t is not its scale (15 in your case). With 3 df and a scale of 15, the sd is more like ~26. More generally, for the lognormal part the coefficients are on the log scale, not on the scale of your outcome. You can think of them approximately as percentage changes (or elasticities), which means informative priors would be waaaaay smaller. For this kind of model I usually go with normal(0, 1) weakly informative priors (and these are still pretty wide).
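
To make the two points above concrete, here is a small base-R sketch. It checks the Student-t standard deviation with the standard formula scale * sqrt(df / (df - 2)), and shows what a log-scale coefficient prior implies as a multiplicative effect on the non-zero outcome. The variable names are only for illustration.

```r
# sd of a Student-t with nu > 2 degrees of freedom and a given scale
nu <- 3
t_scale <- 15
t_sd <- t_scale * sqrt(nu / (nu - 2))
t_sd  # about 26, not 15

# A coefficient b on the log scale multiplies the outcome by exp(b)
# per unit of the predictor. Central 95% range implied by normal(0, 1):
mult_range <- exp(qnorm(c(0.025, 0.975), mean = 0, sd = 1))
mult_range  # roughly 0.14 to 7.1, i.e. still a wide prior

# The same 97.5% quantile under student_t(3, 0, 15) on the log scale:
# an astronomically large multiplier, far beyond a 30-hour difference
exp(t_scale * qt(0.975, df = nu))
```

So a scale of 15 on the log scale says something very different from "differences of more than 30 hours are unlikely".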

I think you need to set the dpar option in the prior function to "hu" and then set the respective class ("b" for regression coefficients and so on) just like for your “main” regression.
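
As a rough sketch of what that could look like, assuming the brms `prior()` helper with its `dpar` argument and the default logit link for `hu`: the normal(0, 1) scale is the one mentioned above, while the 1.5 scale for the hurdle part and the choice of normal priors for the `sd` terms are only illustrative assumptions, not a recommendation.

```r
library(brms)

# hu (the probability of a zero) is modelled on the logit scale by default,
# so proportions of 5% and 95% sit at about -2.9 and +2.9
logit_range <- qlogis(c(0.05, 0.95))
logit_range

priors <- c(
  # lognormal part: coefficients are on the log scale of the outcome
  prior(normal(0, 1), class = b),
  prior(normal(0, 1), class = sd, group = region),
  # hurdle part: same classes, with dpar = hu (scales are an assumption)
  prior(normal(0, 1.5), class = b, dpar = hu),
  prior(normal(0, 1.5), class = sd, group = region, dpar = hu)
)
priors
```

This object would then be passed to `brm()` through its `prior` argument. Running `prior_summary()` on the fitted model afterwards is a quick way to check that every row picked up the prior you intended.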

You can put your own priors on all model parameters. Usually brms comes up with reasonable defaults (although they tend to be on the less informative side, which is probably a very good thing for default priors). These defaults are what you see in the prior summary. The prior summary also should give you an idea of how to set the priors yourself (class, dpar, etc.).

Hope this helps. :)
Cheers!
Max