Choosing priors for a brms hurdle model

I am fitting a hurdle gamma model to some data. I understand that the default prior for the hurdle is suspicious of values close to 0 and 1 and expects values around 0.5. However, I believe that around 0.5-10% of values are positive and I am open to this being arbitrarily close to 0.

Whilst I am comfortable with ‘greek’ equation-based formulations of Bayesian models, I am not yet used to translating them into brms/Stan code. That means I’m struggling to interpret the output of both ‘prior_summary’ and ‘stancode’.

I suppose there are two questions:

  1. My current guess is to change the logistic regression distribution for “b_hu_Intercept” by putting in prior = prior(logistic(3,1), class = "Intercept", dpar = "hu"). Does this seem appropriate? Can I improve on this prior?
  2. More generally, are there any tips for people used to equation formulations of Bayesian models to better understand brms and/or Stan models? I have seen that the package greta offers a diagrammatic representation of Bayesian models (reference) that I find easier to interpret than ‘prior_summary’. I am expecting the answer here to be ‘learn Stan and read the code’!
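
For question 1, a minimal base-R sketch of what a logistic prior on the logit-scale intercept implies on the probability scale. It assumes that hu is the probability of a zero and that the default reported by prior_summary for the hu intercept is logistic(0, 1); `implied_prob` is a hypothetical helper, not a brms function.

```r
# Hypothetical helper: probability-scale quantiles implied by a
# logistic(location, scale) prior on a logit-scale intercept.
implied_prob <- function(location, scale, probs = c(0.025, 0.5, 0.975)) {
  plogis(qlogis(probs, location = location, scale = scale))
}

default_prior  <- implied_prob(0, 1)  # assumed brms default, logistic(0, 1)
proposed_prior <- implied_prob(3, 1)  # the prior proposed in question 1

# Belief from the post: 0.5-10% positive values, i.e. 90-99.5% zeros,
# expressed on the logit scale where the prior is specified.
belief_logit <- qlogis(c(0.90, 0.995))

print(round(rbind(default_prior, proposed_prior), 3))
print(round(belief_logit, 2))
```

Comparing the rows of the printed table with the 90-99.5% range for zeros gives a quick feel for whether the location and scale should be adjusted before fitting anything.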

I have attached an example dataset, simulated_data.csv (743.7 KB), and example code to help focus the discussion.

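library(brms)  # bf() and hurdle_gamma() are called without the brms:: prefix

# Hurdle gamma model with varying intercepts on the mean, shape and hurdle parts
brms_fit <- brm(bf(y ~ 1 + (1 | x_1) + (1 | x_2) + (1 | group),
                   shape ~ 1 + (1 | x_1) + (1 | x_2) + (1 | group),
                   hu ~ 1 + (1 | x_1) + (1 | x_2) + (1 | group)),
                family = hurdle_gamma(),
                iter = 100,
                cores = 8, chains = 8,
                data = simulated_data)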

Sorry, it looks like your question fell through the cracks. Maybe @Guido_Biele is not busy and can answer?

I only have access to my phone for the next few days, so here are a few short hints:

  • it’s useful to look at the Stan code (the ‘model’ element of the brmsfit object, also returned by stancode) together with the output of prior_summary. If you start with a simple model (no random effects), the brms-generated Stan models can be self-explanatory.
  • you can use the functions make_stancode and get_prior to output Stan models and priors before compiling the model.
  • to check the effects of priors (a prior predictive check) you can run a model with the option sample_prior = "only". In my experience this is important when you are dealing with models that use link functions other than the identity. It’s worth having a look at how brms models the shape and rate of the gamma (it is parameterized by mean and shape, so the rate is shape / mean).
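
The hints above can be sketched on a stripped-down version of the model. This is only an illustration under assumptions: `toy_data`, `toy_formula`, and `hu_prior` are hypothetical names, the data are simulated here rather than taken from simulated_data.csv, and the group-level terms are dropped to keep the generated Stan code short.

```r
library(brms)

# Hypothetical toy data: roughly 95% zeros, gamma-distributed positives.
set.seed(1)
n <- 200
toy_data <- data.frame(
  y = ifelse(runif(n) < 0.95, 0, rgamma(n, shape = 2, rate = 1))
)

# Simplest version of the model: intercepts only, no group-level effects.
toy_formula <- bf(y ~ 1, hu ~ 1)

# Which priors can be set for this model, and what the defaults are.
default_priors <- get_prior(toy_formula, data = toy_data,
                            family = hurdle_gamma())
print(default_priors)

# The prior from question 1, on the logit scale of the hurdle intercept.
hu_prior <- prior(logistic(3, 1), class = "Intercept", dpar = "hu")

# Generated Stan code, available without compiling or sampling.
stan_code <- make_stancode(toy_formula, data = toy_data,
                           family = hurdle_gamma(), prior = hu_prior)
print(stan_code)
```

Passing the same formula, data, family and prior to brm() with sample_prior = "only" should then draw from the prior alone, and pp_check on that fit gives a prior predictive check.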