I am fitting a hurdle gamma model to some data. I understand that the default prior for the hurdle is suspicious of values close to 0 and 1 and expects values around 0.5. However, I believe that around 0.5-10% of values are positive and I am open to this being arbitrarily close to 0.
Whilst I am comfortable with ‘greek’ equation-based formulations of Bayesian models, I am not yet used to converting them to brms/Stan code. That means I’m struggling to interpret the output of both ‘prior_summary’ and ‘stancode’.
I suppose there are two questions:
- My current guess is to change the logistic regression distribution for “b_hu_Intercept” by putting in
prior = prior(logistic(3,1), class = "Intercept", dpar = "hu")
Does this seem appropriate? Can I improve on this prior?
- More generally, are there any tips for people used to equation formulations of Bayesian models to better understand brms and/or Stan models? I have seen that the package greta offers a diagrammatic representation of Bayesian models (reference) that I find easier to interpret than ‘prior_summary’. I am expecting the answer here to be ‘learn Stan and read the code’!
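One way to sanity-check the candidate prior is to push it through the inverse logit and see how much mass lands where I expect it. The sketch below uses only base R and assumes, as I understand brms, that `hu` is the probability of a zero, so 0.5-10% positive values corresponds to `hu` between 0.90 and 0.995. The variable names are my own and purely illustrative.

```r
# Range of hu (probability of a zero) implied by 0.5-10% positive values.
# Assumption: hu is P(y = 0), so hu = 1 - P(y > 0).
hu_lower <- 1 - 0.10   # 10% positive  -> 90% zeros
hu_upper <- 1 - 0.005  # 0.5% positive -> 99.5% zeros

# Candidate prior on the logit scale: logistic(location = 3, scale = 1)
prior_location <- 3
prior_scale <- 1

# Prior mass that falls inside the believed range of hu:
# convert the range to the logit scale, then use the logistic CDF.
mass_in_range <- plogis(qlogis(hu_upper), prior_location, prior_scale) -
  plogis(qlogis(hu_lower), prior_location, prior_scale)

# Prior quantiles of hu: logistic quantiles mapped back through inverse logit.
probs <- c(0.025, 0.5, 0.975)
hu_quantiles <- plogis(qlogis(probs, prior_location, prior_scale))
names(hu_quantiles) <- paste0(100 * probs, "%")

print(mass_in_range)
print(hu_quantiles)
```

Changing `prior_location` and `prior_scale` and re-running gives a quick feel for how a different logistic prior would spread its mass over `hu`.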
I have attached an example simulated_data.csv (743.7 KB) dataset and example code to help focus the discussion.
brms_fit <- brms::brm(brms::bf(y ~ 1 + (1 | x_1) + (1 | x_2) + (1 | group),
shape ~ 1 + (1 | x_1) + (1 | x_2) + (1 | group),
hu ~ 1 + (1 | x_1) + (1 | x_2) + (1 | group)),
family = hurdle_gamma(),
iter = 100,
cores = 8, chains = 8,
data = simulated_data)
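To connect this formula to the equation form I am used to, here is a minimal base R sketch of what I believe the intercept-only version of the model says: a Bernoulli draw decides whether y is zero, and otherwise y is gamma distributed with mean mu and a separate shape. It assumes the usual brms links as I understand them (logit for `hu`, log for `mu` and `shape`), drops the varying intercepts for `x_1`, `x_2` and `group`, and uses made-up intercept values.

```r
set.seed(1)
n <- 10000

# Made-up intercepts on the link scale (illustrative assumptions only)
hu <- plogis(3)     # logit(hu) = 3, where hu = P(y = 0)
mu <- exp(1)        # log(mu) = 1, mean of the positive part
shape <- exp(0.5)   # log(shape) = 0.5

# Hurdle step: z_i ~ Bernoulli(hu); y_i = 0 when z_i = 1
is_zero <- rbinom(n, size = 1, prob = hu) == 1

# Positive step: y_i ~ Gamma(shape, rate = shape / mu), so that E[y_i | y_i > 0] = mu
y <- numeric(n)
y[!is_zero] <- rgamma(sum(!is_zero), shape = shape, rate = shape / mu)

# Compare the simulated summaries with the values that generated them
print(mean(y == 0))
print(mean(y[y > 0]))
```

Each `(1 | x_1)`, `(1 | x_2)` and `(1 | group)` term in the formula would add a normally distributed offset to the corresponding intercept on its link scale.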