How to do empirical Bayes on factor variables?
In particular, take a variable such as gender. Is the fit supposed to be done so that it becomes
intercept + genderMale
with genderFemale baked into the intercept?
Or should I do something else?
Further, how are priors defined on such variables?
For fully Bayesian inference you can treat this in a completely standard way. For example:
bf(y ~ intercept + genderMale)
Then define whatever priors you want on the intercept and the coefficient for genderMale.
One little trick that can be useful in this context is to code gender as -1/1 (instead of 0/1). This ensures that the prior pushforward distributions for males and females will be identical (assuming that you use a zero-centered prior on the coefficient for gender). If you bake genderFemale into the intercept, then you’ll have a wider prior pushforward for genderMale than genderFemale, because the male mean is the intercept plus the coefficient and so carries the prior uncertainty of both.
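To make the coding trick concrete, here is a small prior simulation in plain Python. It is only a sketch: the Normal(0, 1) priors on the intercept and the gender coefficient are assumptions chosen for illustration, and the function name `pushforward_sd` is made up for this example. It draws from the priors and compares the spread of the implied group means under 0/1 and -1/1 coding.

```python
import random
import statistics

random.seed(1)

N_DRAWS = 20000
SD_INTERCEPT = 1.0  # assumed prior scale for the intercept
SD_BETA = 1.0       # assumed prior scale for the gender coefficient


def pushforward_sd(code_female, code_male):
    """Draw from the priors and return the standard deviation of the
    implied group means (female, male) for a given covariate coding."""
    female, male = [], []
    for _ in range(N_DRAWS):
        intercept = random.gauss(0.0, SD_INTERCEPT)
        beta = random.gauss(0.0, SD_BETA)
        female.append(intercept + beta * code_female)
        male.append(intercept + beta * code_male)
    return statistics.stdev(female), statistics.stdev(male)


# 0/1 coding: the female mean is the intercept alone, the male mean adds beta.
sd_female_01, sd_male_01 = pushforward_sd(0, 1)

# -1/1 coding: both group means are intercept plus or minus beta.
sd_female_pm, sd_male_pm = pushforward_sd(-1, 1)

print("0/1 coding :", round(sd_female_01, 2), round(sd_male_01, 2))
print("-1/1 coding:", round(sd_female_pm, 2), round(sd_male_pm, 2))
```

With these assumed scales, the 0/1 coding should give a noticeably wider spread for males than for females, while the -1/1 coding should give roughly the same spread for both groups.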
If you really want to do empirical Bayes, then do the above, but use your empirical Bayes priors.
I just want to double check (and I hope I’m not being presumptuous here):
You have recently asked a lot of questions about setting discrete priors (e.g. negative binomial, Poisson, discrete factor variables in general). I just want to make sure that you aren’t confusing discrete parameters (which do not work in Stan and need to be marginalized out) with discrete covariates (which work just fine in Stan). In the intercept + gender_male model, for example, gender_male is a discrete covariate. In a regression context, gender_male then gets multiplied by a coefficient that gets estimated from the data. This parameter (i.e. the coefficient) is not discrete; it is continuous. It can take any value. You would encode prior information about this coefficient using some appropriate continuous distribution like a Gaussian or a Student-t.
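As a rough illustration of the covariate/parameter distinction, the sketch below builds the 0/1 gender_male column from a factor and multiplies it by a continuous coefficient. The toy data, the helper `indicator_column`, and the numeric values of the intercept and coefficient are all made up for the example; in a real fit the coefficient would be estimated, not fixed.

```python
# Hypothetical toy data: a factor with two levels.
gender = ["Female", "Male", "Male", "Female"]


def indicator_column(values, level):
    """Return a 0/1 column marking which observations equal `level`."""
    return [1 if v == level else 0 for v in values]


# Discrete covariate: only ever takes the values 0 or 1.
gender_male = indicator_column(gender, "Male")

# Continuous parameters: any real value is allowed (example values only).
intercept = 1.5
beta_gender_male = -0.4

# Linear predictor: the coefficient shifts the mean only where gender_male == 1.
mu = [intercept + beta_gender_male * g for g in gender_male]

print(gender_male)
print(mu)
```

The only discrete object here is the data column; the prior goes on `beta_gender_male`, which lives on the real line.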