My question concerns a specific functionality of brms in the context of defining linear mixed models. The model I am trying to define aims at predicting a bimodal metric variable. The intercept of this model can be assumed to closely follow a gamma distribution, which I therefore used as a prior with the following syntax:
prior = set_prior("gamma(25, 0.02)", class = "Intercept")
This works well. However, I also expect about 5% of the observations of the dependent variable to be 0, which is not captured by this gamma distribution. I therefore wanted to include this information in the prior. Is this possible with the current version of brms? My (naive?) approach so far was to try to define a gamma hurdle distribution instead of a gamma distribution, but I am not sure how to define such a prior; something like
prior = set_prior("gamma_hurdle(25, 0.02, 0.05)", class = "Intercept")
obviously does not work.
- Operating System: Windows 10
- brms Version: 2.9.0
I think you are looking for
family = hurdle_gamma(), or not?
Thank you for the hint, but I am not sure if this solves my problem. I would like to set the distribution parameters of the gamma hurdle prior (e.g., the shape and rate parameters), and I found no way to do so using family = hurdle_gamma(). Have I overlooked something?
Setting a bimodal prior for the intercept of a (regression?) model seems kinda unusual to me. The term gamma hurdle is usually referring to a model which assigns a Gamma likelihood to the positive real part of the outcome variable and a point mass to zero outcomes. That’s why I thought you confused prior (for intercept) and likelihood (for outcome). Can you give a bit more detail about the model and data that you are trying to model?
I cannot provide the full details of the study, but I can provide some context for these data. The dependent variable is basically an entropy-based measure of the information contained in data transmissions under various settings. If no data transmission takes place, the information, and therefore also the dependent variable, is 0; otherwise it takes on positive values. The regression model aims at describing this variable under various settings (which are included as independent variables). Under all settings, about 5% of observations have no information transmitted; this is what I would like to account for in the model. If information is transmitted, the distribution of the entropy-based measure can be approximated well by a Gamma distribution.
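For reference, the data-generating process described here can be sketched in base R. This is only an illustration: the 5% zero rate comes from the description above, while the gamma shape and rate (and the sample size) are made-up values, not estimates from the actual study.

```r
set.seed(1)

n      <- 10000
p_zero <- 0.05   # share of observations with no transmission (from the description)
shape  <- 2      # hypothetical gamma shape for the positive part
rate   <- 0.5    # hypothetical gamma rate for the positive part

# Step 1: decide for each observation whether it is an exact zero
is_zero <- rbinom(n, size = 1, prob = p_zero) == 1

# Step 2: positive observations are drawn from a gamma distribution
y <- ifelse(is_zero, 0, rgamma(n, shape = shape, rate = rate))

# Observed share of zeros; should be close to p_zero
prop_zero <- mean(y == 0)
print(prop_zero)
```

A mixture of a point mass at zero and a gamma distribution for the positive values is exactly what a hurdle gamma likelihood describes.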
I agree with @Max_Mantei that what you are describing is basically that your response variable (this entropy measure) has a hurdle_gamma distribution, which is the likelihood, not the prior, and can be specified via
family = hurdle_gamma().
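A minimal sketch of what that could look like, assuming a data frame `d` with a response `y` and a single predictor `x` (both names and the toy data are hypothetical). `get_prior()` only lists the parameters that can receive priors and does not compile or fit anything; the actual fit would be the commented `brm()` call.

```r
library(brms)

set.seed(2)
# Toy data, purely illustrative: roughly 5% zeros, gamma-distributed otherwise
d <- data.frame(x = rnorm(200))
d$y <- ifelse(runif(200) < 0.05, 0, rgamma(200, shape = 2, rate = 0.5))

# List the parameters of the hurdle gamma model that can take priors
priors <- get_prior(y ~ x, data = d, family = hurdle_gamma())
print(priors)

# Fitting needs a working Stan toolchain, so it is left commented out here:
# fit <- brm(y ~ x, data = d, family = hurdle_gamma())
```

As far as I know, the default link for the mean of `hurdle_gamma()` is the log, so a prior on the Intercept refers to the log of the mean of the positive part rather than to the response scale.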
Thank you for your help, I think this solved my problem. I thought about your proposal and agree that it makes sense to integrate this information via the likelihood instead of defining a Gamma hurdle prior for the intercept. I think I was trying to incorporate my expectation about the rate of zeros (i.e., 5%) directly into the model, but doing this via a gamma hurdle prior for the intercept would be incorrect. I still think that this expectation should somehow be reflected in the prior for the model parameters, but the results I obtained with a noninformative prior still seem valid and plausible.
If I’m not mistaken, you can model the hurdle part specifically using a formula for the hurdle probability,
hu ~ ..., and set a prior on it; that is where you can incorporate your prior information.
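A rough sketch of how the 5% expectation could be encoded, under two assumptions that are mine rather than from the thread: the predictor name `x` is hypothetical, and the specific prior widths (`beta(5, 95)`, standard deviation 0.5) are just example choices centred on 5%.

```r
library(brms)

# Variant 1: hu is a single constant (no hu formula).
# The prior is on the probability scale; beta(5, 95) has mean 5 / (5 + 95) = 0.05.
prior_const <- set_prior("beta(5, 95)", class = "hu")

# Variant 2: hu gets its own formula and is modelled on the logit scale.
f <- bf(y ~ x, hu ~ x)

# 5% on the probability scale corresponds to qlogis(0.05), about -2.94, on the logit scale
logit_5pct <- qlogis(0.05)
prior_pred <- set_prior("normal(-2.94, 0.5)", class = "Intercept", dpar = "hu")

print(prior_const)
print(prior_pred)

# Either prior object would then be passed to brm(), e.g.
# brm(f, data = d, family = hurdle_gamma(), prior = prior_pred)
```

Which variant applies depends on whether the zero rate is expected to be constant across settings (variant 1) or to vary with the predictors (variant 2).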