Dear all, I would like to run a Bayesian logistic regression model using Jeffreys priors for the regression coefficients using the R/brms package. How do I do that?
Kr, Shiny
Hi, @shinyrhino. Stan doesn’t come with built-in Jeffreys priors as far as I know, because we recommend using at least weakly informative priors in most cases. If you want to use a Jeffreys prior, you’ll have to work out what it is for your model and specify it yourself as a prior in brms.
With multiple variables, things get a lot trickier. See the Wikipedia article on the Jeffreys prior: https://en.wikipedia.org/wiki/Jeffreys_prior
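To give a sense of the mechanics, here’s a minimal sketch of per-coefficient priors in brms; the data, outcome, and covariate names are placeholders, not anything from your model:

```r
library(brms)

# Placeholder logistic regression; swap in your own data and formula.
fit <- brm(
  y ~ x1 + x2,
  data = d,
  family = bernoulli(link = "logit"),
  prior = c(
    set_prior("normal(0, 5)", class = "b"),               # default for all slopes
    set_prior("normal(0, 2)", class = "b", coef = "x1"),  # override a single slope
    set_prior("student_t(3, 0, 2.5)", class = "Intercept")
  )
)
```

Any distribution Stan knows about can go in the `set_prior()` string, so if the Jeffreys prior for your model reduces to a standard density, you can plug it in the same way.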
Dear @Bob_Carpenter, thanks for the advice! Assuming there are multiple covariates at different measurement levels (categorical, continuous), with different units/coding and distributions, I thought a scale-invariant, non-informative prior like Jeffreys’ would be an elegant solution. Following your recommendation of weakly informative priors (one per regression coefficient), is there a recommended approach so that the prior on each covariate’s coefficient is equally weakly informative?
It’s definitely elegant, but the resulting priors can be really difficult in cases where there’s not much data. Also, if you have multiple parameters, you need to work out the full Fisher information matrix to define the joint (multivariate) Jeffreys prior.
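Concretely, the Jeffreys prior is

$$
p(\beta) \propto \sqrt{\det I(\beta)},
$$

and for logistic regression the Fisher information is $I(\beta) = X^\top W X$ with $W = \operatorname{diag}\big(\pi_i(1-\pi_i)\big)$ and $\pi_i = \operatorname{logit}^{-1}(x_i^\top \beta)$. The determinant couples all the coefficients through the design matrix $X$, so the prior doesn’t factor into independent per-coefficient pieces.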
What our more sophisticated regression packages like brms and rstanarm do, I believe, is apply a QR decomposition to the covariates, work in the reduced Q space, then reconstruct back to the original scale. Then a simple prior works well, but we tend to use ones that are weakly informative rather than purely scale-invariant.
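If I remember correctly, brms exposes this through the `decomp` argument of its formula interface, and rstanarm’s `stan_glm` has a similar `QR = TRUE` option. A sketch with placeholder names:

```r
library(brms)

# Fit on the QR-decomposed design matrix; brms handles the decomposition
# and transforms the coefficients back to the original scale for reporting.
fit_qr <- brm(
  bf(y ~ x1 + x2, decomp = "QR"),
  data = d,
  family = bernoulli(),
  prior = set_prior("normal(0, 5)", class = "b")
)
```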
There’s a lot more discussion in the Bayesian Workflow paper and the associated book draft.
Generally, I wouldn’t worry so much about the prior independently, but think about the combination of (a) data, (b) likelihood, and (c) prior—they can only really be understood together as all three determine the shape of the posterior.
Dear @Bob_Carpenter, thanks for your kind input and the reading tip! Gelman et al. (2008) state that most regression coefficients in logistic regression fall between -5 and +5. A simple normal prior N(m=0, sd=100) would allocate only about 4% (pnorm(5,0,100) - pnorm(-5,0,100)) of its probability mass to that typical range, so it would be essentially flat over the plausible values and therefore only very weakly informative for a broad range of covariates. Too simplistic?
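For the record, the calculation:

```r
# Prior mass that normal(0, 100) puts on the "typical" range [-5, 5]:
pnorm(5, mean = 0, sd = 100) - pnorm(-5, mean = 0, sd = 100)
#> [1] 0.0398776
```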
Kr, Shiny
Ref:
Gelman, A., Jakulin, A., Pittau, M. G., & Su, Y.-S. (2008). A weakly informative default prior distribution for logistic and other regression models. The Annals of Applied Statistics, 2(4), 1360–1383.
You have to be careful about that. Aleks and Andrew wound up recommending a Cauchy prior, which Andrew has since largely disavowed. And that recommendation really depended on standardizing the regression predictors. Otherwise, the scale of the regression coefficients depends entirely on the problem. For instance, if I give you distances in angstroms or in light years, the regression coefficients change scale, so you have to standardize everything (center each covariate and scale it to unit standard deviation). Even better, apply a QR decomposition and work with the orthogonal, unit-scaled Q matrix.
Also, you’re not dealing with “most regressions”; you have a specific regression about which you probably know at least a little bit, or you won’t be able to assess whether your model is working for whatever purpose it’s intended.
I’d recommend something like normal(0, 5) if you expect the regression coefficients to be on the order of 5 or smaller.
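Putting standardization and the prior together, a sketch (covariate names are placeholders):

```r
library(brms)

# Center and scale each continuous covariate so that normal(0, 5)
# means the same thing for every coefficient.
d$x1 <- as.numeric(scale(d$x1))
d$x2 <- as.numeric(scale(d$x2))

fit <- brm(
  y ~ x1 + x2,
  data = d,
  family = bernoulli(),
  prior = set_prior("normal(0, 5)", class = "b")
)
```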
@Bob_Carpenter and for all interested: The SAS manuals state that Jeffreys priors are available for PROC GENMOD and can be constructed for PROC MCMC:
"While Jeffreys’ prior provides a general recipe for obtaining noninformative priors, it has some
shortcomings: the prior is improper for many models, and it can lead to improper posterior in some cases; and the prior can be cumbersome to use in high dimensions. PROC GENMOD calculates Jeffreys’ prior automatically for any generalized linear model. You can set it as your prior density for the coefficient parameters, and it does not lead to improper posteriors. You can construct Jeffreys’ prior for a variety of statistical models in the MCMC procedure. See the section “Example 80.4: Logistic Regression Model with Jeffreys’ Prior” on page 6256 in Chapter 80, “The MCMC Procedure,” for an example. PROC MCMC does not guarantee that the corresponding posterior distribution is proper, and you need to exercise extra caution
in this case. " [1]
The example itself (“Example 80.4: Logistic Regression Model with Jeffreys’ Prior”) is in Chapter 80, “The MCMC Procedure,” of the SAS/STAT User’s Guide [2].
Refs:
[1] https://documentation.sas.com/api/collections/pgmsascdc/9.4_3.5/docsets/statug/content/introbayes.pdf
[2] https://documentation.sas.com/api/collections/pgmsascdc/9.4_3.5/docsets/statug/content/mcmc.pdf
@Bob_Carpenter One more thought: Would it be possible to use a Jeffreys prior in a multivariable regression model and get around the problem of specifying a multivariate prior by constructing a hierarchical prior, where the parameters of the per-coefficient Jeffreys priors are drawn from a shared, overall Jeffreys prior?
Kr, Shiny
If you have so little data that you have to resort to priors like this, you probably have bigger problems in your modeling. But to answer the question: yes, you can set up whatever kinds of priors you want. Results are not usually very sensitive to the hyperpriors.
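Just to show the plumbing (this uses a normal prior with a shared scale hyperparameter rather than the Jeffreys construction you describe, and the variable names are placeholders), a hierarchical prior over the coefficients can be wired up in brms with `stanvars`:

```r
library(brms)

# Declare a shared scale hyperparameter tau and give it a hyperprior;
# every slope then gets a normal(0, tau) prior. Note this is a
# hierarchical normal prior, not a Jeffreys prior.
sv <- stanvar(scode = "real<lower=0> tau;", block = "parameters") +
  stanvar(scode = "tau ~ student_t(3, 0, 2.5);", block = "model")

fit <- brm(
  y ~ x1 + x2,
  data = d,
  family = bernoulli(),
  prior = set_prior("normal(0, tau)", class = "b"),
  stanvars = sv
)
```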