I am a PhD student using logistic regression to investigate mental health epidemiology. Participants in my cohort study either have a diagnosis or not (coded 1 or 0), so I’m using logistic regression to estimate the association of mental health disorders with some categorical exposures.
Reading Gelman et al. (2008), I understand that one approach to Bayesian logistic regression (not hierarchical) is to standardize the input variables. They recommend scaling the variables as follows before setting priors:
- “Binary inputs shifted to have a mean of 0 and to differ by 1 in their lower and upper conditions. (For example, if a population is 10% African-American and 90% other, we would define the centered “African-American” variable to take on the values 0.9 and −0.1.)”
- “Other inputs are shifted to have a mean of 0 and scaled to have a standard deviation of 0.5. This scaling puts continuous variables on the same scale as symmetric binary inputs (which, taking on the values ±0.5, have standard deviation 0.5).”
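Written out as formulas, the two rules look roughly like this. This is only my own restatement of the quoted text; the symbols p, \bar{x} and s_x (sample proportion, sample mean and sample standard deviation) are notation I am introducing here, not taken from the paper.

```latex
% Binary input x \in \{0, 1\} with sample proportion p = \bar{x}:
\[
  x^{c} = x - p \in \{\,1 - p,\; -p\,\},
  \qquad
  \operatorname{sd}(x^{c}) = \sqrt{p(1-p)} \le 0.5 .
\]
% Example from the quote: p = 0.1 gives the values 0.9 and -0.1.
% The bound 0.5 is reached only for a symmetric binary input (p = 0.5).

% Any other input x with sample mean \bar{x} and sample sd s_x:
\[
  z = \frac{x - \bar{x}}{2\, s_x},
  \qquad
  \operatorname{mean}(z) = 0,
  \quad
  \operatorname{sd}(z) = 0.5 .
\]
```

So a centered binary input always has a standard deviation of at most 0.5, and rare categories have a noticeably smaller one (about 0.3 for p = 0.1).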
Once the data are scaled in this way, Gelman et al. (2008), and Gelman again in the Stan Prior Choice Guidance, recommend using a (scaled) Student’s t distribution with 3 < \nu < 7 as a weakly informative prior for the coefficients in the linear predictor.
BUT what if you have a multi-category variable: say ethnicity? Let me illustrate with an example:
What if I have the following ethnicities (UK-context, and by no means a full list of ethnicities!):
- White
- Black-Caribbean
- Black-African
- South-Asian
- East-Asian
- Other
Now say I wanted to look at ethnicity as a predictor of the presence of a mental health disorder (a dichotomous variable: 1 or 0). I would normally fit a model using a dummy variable for each non-reference category:
\log\left(\frac{p_i}{1-p_i}\right) = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{3i} + \beta_4 x_{4i} + \beta_5 x_{5i}
where x_1 indicates whether a participant is Black-Caribbean (1) or not (0), and so on up to x_4 (East-Asian or not) and x_5 (Other or not), with White as the reference category. This makes sense in terms of the research question, as we are comparing the mental health of minority ethnicities with that of the dominant ethnicity.
This is where I get stuck. Suppose I shift the dummy variable for each ethnicity as described above (a dummy with 10% ones and 90% zeros would be shifted to take on the values 0.9 and −0.1). Doing this changes the value of the intercept. Previously the intercept was the logit of the probability p that White participants had a mental health disorder. Since the coefficient estimates can only be interpreted as differences in log-odds (that is, log odds ratios) compared with White participants, I am confused about what the benefit of scaling is.
Below is the Stan code for the simplest possible version of this model. There are K predictors, which represent P−1 categories.
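To make the change in the intercept concrete, here is the algebra as I understand it. The notation \bar{x}_k (the sample proportion in category k) and \beta_0^{c} (the intercept after centering) is my own and is only meant as a sketch.

```latex
% Uncentered model with five dummies (White as reference):
\[
  \operatorname{logit}(p_i) = \beta_0 + \sum_{k=1}^{5} \beta_k x_{ki}
\]
% Add and subtract \beta_k \bar{x}_k inside the sum:
\[
  \operatorname{logit}(p_i)
  = \underbrace{\Bigl(\beta_0 + \sum_{k=1}^{5} \beta_k \bar{x}_k\Bigr)}_{\beta_0^{c}}
  + \sum_{k=1}^{5} \beta_k \,\bigl(x_{ki} - \bar{x}_k\bigr)
\]
% The slopes \beta_k are unchanged; only the intercept moves:
\[
  \beta_0^{c} = \beta_0 + \sum_{k=1}^{5} \beta_k \bar{x}_k ,
  \qquad
  \beta_0 = \beta_0^{c} - \sum_{k=1}^{5} \beta_k \bar{x}_k .
\]
```

Read this way, the centered intercept is the proportion-weighted average of the category-specific log-odds (the White group has log-odds \beta_0, category k has \beta_0 + \beta_k). Because the logit is nonlinear, that is generally not the same as the logit of the overall prevalence. The \beta_k keep their interpretation as log odds ratios relative to White participants.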
To summarize:
- Scaling the variables by shifting them as Gelman et al. (2008) describe changes the intercept. How is this intercept now interpreted?
- Should this method for scaling variables be used for multi-category variables (so that a weakly informative prior can be set on them all)?
- If this is not suitable, are there any other guides/references on how to set weakly informative priors for categorical variables with more than two levels?
- Any ideas about how weakly informative priors are extended to hierarchical logistic regression models?
data {
  int<lower=0> N;
  int<lower=0> K; //number of fixed effect predictors (inc intercept)
  matrix[N, K] X_mat;
  array[N] int<lower=0, upper=1> y; // array syntax for Stan 2.26+; older versions use int<lower=0,upper=1> y[N];
}
parameters {
  //FE coefficients in mean function for y_repeat
  vector[K] beta;
}
model {
  //priors
  beta ~ student_t(5, 0, 1);
  //Bernoulli likelihood (binomial with a single trial per participant)
  y ~ bernoulli_logit(X_mat * beta);
}
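
For comparison, here is a minimal sketch of what the centered version of the model above might look like, with the reference-category intercept recovered afterwards. It assumes that X holds only the 0/1 dummy columns (no intercept column), which differs from X_mat above; the names X, Xc, x_bar, alpha_c and alpha are made up for this illustration. The prior on alpha_c simply reuses the prior from the model above as a placeholder and is not a recommendation.

```stan
data {
  int<lower=0> N;
  int<lower=1> K;                      // number of dummy columns (assumption: no intercept column)
  matrix[N, K] X;                      // 0/1 dummies, reference category omitted
  array[N] int<lower=0, upper=1> y;
}
transformed data {
  row_vector[K] x_bar;                 // column means = sample proportion in each category
  matrix[N, K] Xc;                     // centered dummies
  for (k in 1:K) {
    x_bar[k] = mean(X[, k]);
    Xc[, k] = X[, k] - x_bar[k];
  }
}
parameters {
  real alpha_c;                        // intercept on the centered scale
  vector[K] beta;                      // log odds ratios vs. the reference category
}
model {
  // placeholder priors, copied from the model above for illustration only
  alpha_c ~ student_t(5, 0, 1);
  beta ~ student_t(5, 0, 1);
  y ~ bernoulli_logit(alpha_c + Xc * beta);
}
generated quantities {
  // intercept on the original scale: log-odds in the reference category
  real alpha = alpha_c - x_bar * beta;
}
```

In this sketch the intercept is kept outside the design matrix so that it can get its own prior, and alpha in generated quantities gives back the "logit of the probability for White participants" interpretation from the uncentered model.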