In what way are Gaussian priors "not robust" for logistic regression coefficients?

Ye Official Prior Choice Recommendations dictate that when specifying weakly informative priors for logistic regression coefficients, “[n]ormal distribution is not recommended as a weakly informative prior, because it is not robust”. Student-t distributions with df from 3 to 7 are recommended instead.

The wiki then goes on to say that the Gaussian distribution is fine if the prior is intended to be informative rather than merely weakly informative, which I find equally mystifying.

I do understand how the Gaussian distribution is “less robust” than a Student-t one when y is continuous. For example, suppose you’re using a Gaussian distribution to model male heights in centimeters (with true values of, say, \mu = 180 and \sigma = 6) and your sample includes an outlier with severe gigantism at 300 cm. A Gaussian model will shift the estimated location and scale much farther up in response to the outlier than a Student-t model, whose heavier tails absorb the outlier while leaving the estimated location and scale nearly unaffected.
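This intuition is easy to check numerically. Below is a minimal sketch of my own (the df of 4, the sample size of 50, and the seed are arbitrary choices, not from the wiki), fitting both models by maximum likelihood to simulated heights plus one 300 cm outlier:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Simulated male heights (cm): true mu = 180, sigma = 6, plus one extreme outlier.
heights = rng.normal(loc=180, scale=6, size=50)
heights = np.append(heights, 300.0)  # severe gigantism

# Maximum-likelihood fits of a Gaussian model and a Student-t model (df fixed at 4).
mu_norm, sigma_norm = stats.norm.fit(heights)
df_t, loc_t, scale_t = stats.t.fit(heights, f0=4, loc=180, scale=6)

# The Gaussian fit gets dragged toward the outlier (location up, scale way up);
# the Student-t fit stays close to the bulk of the data.
print(f"Gaussian:  mu  = {mu_norm:.1f}, sigma = {sigma_norm:.1f}")
print(f"Student-t: loc = {loc_t:.1f}, scale = {scale_t:.1f}")
```

The t fit’s location stays near 180 and its scale near 6, while the Gaussian scale in particular inflates badly.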

But I don’t understand how this applies to prior specifications for logistic regression coefficients. When I fit binomial models to mock data with a wide range of logits for the \beta's, a Gaussian prior with a given scale always imposes more shrinkage on large logits than a Student-t prior with the same scale. So I don’t see in what sense the Student-t prior can be regarded as “more robust” for these parameters. The mind boggles.
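A minimal sketch of the kind of comparison described above (my own mock example, not the questioner's actual code: an intercept-only model with 99 successes in 100 trials, a prior scale of 1, and a t prior with 4 df are arbitrary illustrative choices), computing the MAP estimate of the logit under each prior:

```python
import numpy as np
from scipy import stats, optimize

# Mock binomial data with a large empirical logit: 99 successes in 100 trials
# gives logit(0.99) ~ 4.6.  Compare the MAP estimate of the intercept under a
# Normal(0, 1) prior vs. a Student-t(df=4, loc=0, scale=1) prior.
n, k = 100, 99

def neg_log_post(beta, log_prior):
    # Negative log posterior (up to a constant): binomial log likelihood + log prior.
    p = 1.0 / (1.0 + np.exp(-beta))
    loglik = k * np.log(p) + (n - k) * np.log1p(-p)
    return -(loglik + log_prior(beta))

map_norm = optimize.minimize_scalar(
    neg_log_post, args=(lambda b: stats.norm.logpdf(b, scale=1.0),),
    bounds=(-10, 10), method="bounded").x
map_t = optimize.minimize_scalar(
    neg_log_post, args=(lambda b: stats.t.logpdf(b, df=4, scale=1.0),),
    bounds=(-10, 10), method="bounded").x
mle = np.log(k / (n - k))  # unpenalized estimate, ~ 4.6

print(f"MLE = {mle:.2f}, MAP (normal) = {map_norm:.2f}, MAP (t) = {map_t:.2f}")
```

The normal prior pulls the large logit noticeably farther toward zero than the t prior with the same scale, which is exactly the behavior described above.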


Ultimately, this is a semantic question about what “robust” means. Presumably, calling a normal prior “not robust” as a choice of weakly informative prior reflects the idea that normal priors tend to be more informative than intended: they strongly suppress large coefficient values, perhaps more so than the modeler realizes. They are not robust in the event that the true coefficient is quite large. You’re right that normal priors shrink large values more aggressively; that is exactly why they may not be a robust choice for a weakly informative prior, which is generally intended not to shrink values too aggressively.

On the other hand, if the prior is intended to be informative, then it’s easy to imagine using the word “robust” to mean the exact opposite, as in *normal priors are robust to atypical datasets and provide sufficient regularization of the coefficients, whereas t priors do not*. Whether or not that italicized statement is true depends entirely on which distribution (normal or t-with-finite-df) accurately captures the prior information that the modeler wishes to encode.

The recommendations are for the weakly informative case, and the point is that when we say “weakly informative”, we usually mean something with more mass in the tails than Gaussian.
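That tail-mass difference is easy to quantify (a quick illustration of my own; the standard scale and the cutoff of 5 are arbitrary choices):

```python
from scipy import stats

# Prior mass on coefficients more than 5 scale units from zero,
# under a standard normal vs. a Student-t with 4 df.
tail_norm = 2 * stats.norm.sf(5)
tail_t4 = 2 * stats.t.sf(5, df=4)

print(f"Normal tail mass beyond |5|: {tail_norm:.2e}")
print(f"t(4)   tail mass beyond |5|: {tail_t4:.2e}")
```

The t(4) prior puts orders of magnitude more mass on very large coefficients, so when the data insist on a large value, the t prior gets out of the way far more readily than the normal.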