I am very new to Bayesian statistics and am trying to make sense of it.
I have run arm::bayesglm for Bayesian logistic regression using the default priors, but I am unsure how to interpret the output.
What are coef.est and coef.se?
Do you exponentiate them just as in classical logistic regression?
How do you present them in papers?
I am using this method because the maximum likelihood approach gives inflated point estimates due to complete separation. I have referred to this thread on how to deal with separation in logistic regression.
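For reference, here is a minimal sketch of the kind of call I ran (the data frame name `dat` and the outcome name `accept` are placeholders, not my actual columns):

```r
library(arm)

## Minimal sketch of my model call; `dat` and `accept` are
## placeholders for my real data.
fit <- bayesglm(accept ~ vintage,
                family = binomial(link = "logit"),
                data = dat)  # default priors: Cauchy(0, 2.5) on the coefficients
display(fit)                 # prints coef.est and coef.se
```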
The image below is an example of my output. I have a variable called vintage, which is the transplant vintage of a patient (with 4 levels: <12 months, 1-5 years, 5-10 years, and >10 years), and we wish to see whether this variable is associated with the outcome, i.e. vaccine acceptance. Can I interpret it as follows:
Patients with a transplant vintage of 1-5 years, 5-10 years, and >10 years are 3.22, 2.36, and 1.49 times more likely, respectively, to be vaccine-accepting than patients with <12 months since transplantation.
How do I incorporate the SE? Should I calculate a credible interval? And I should avoid using the terms odds or odds ratio, right?
It would be great if you could recommend papers that use arm::bayesglm, so that I can get a sense of how people write up and present their results.
Hi,
unfortunately, I don’t think we can provide good support for arm here. While some of the authors of the package (notably @andrewgelman ) are active members of the Stan community, the package is not really part of the Stan ecosystem (and does not appear to be very actively developed). I would also guess that the approximate methods from arm have been superseded by the more flexible and reliable MCMC methods available via rstanarm (which we happily support here :-) )
In any case, I don’t think a Bayesian approach lets you ignore complete separation - it is still an issue, and the tails of your coefficient posteriors will be influenced by your prior choices. Additionally, if I see correctly, you have just 80 observations, which is almost certainly not enough to learn anything useful about such a large number of predictors. You might also be better served by treating the vintage value as continuous and fitting a smooth term to it (that would require brms with something like Take_4lvl ~ s(vintage) + ..., as in the sketch below), as you are discarding a lot of information by binning the value.
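To make that concrete, a rough sketch of what such a model could look like (Take_4lvl and vintage are from your post; the data frame name and the Bernoulli family are my assumptions):

```r
library(brms)

## Rough sketch of the smooth-term model suggested above.
## Assumes `dat` contains the outcome Take_4lvl and a continuous
## vintage (e.g. months since transplant), not the 4-level factor.
fit <- brm(Take_4lvl ~ s(vintage),
           family = bernoulli(link = "logit"),
           data = dat)
summary(fit)
```

With only ~80 observations the smooth will be only weakly identified, so I would keep the model as simple as possible.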
Hope that helps at least a bit!
P.S. Note that you can provide output and code listings as formatted text (surrounded by triple backticks ```), which helps readability and usability.
Hi, I agree with the others that we now recommend stan_glm() in the rstanarm package rather than bayesglm. We have lots of examples of stan_glm() in our recent book, Regression and Other Stories.
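As a minimal sketch of what that could look like for your setting (the data frame and variable names are placeholders), including the credible intervals and odds-ratio summaries you asked about:

```r
library(rstanarm)

## Sketch of the recommended replacement for bayesglm();
## `dat`, `accept`, and `vintage` are placeholder names.
fit <- stan_glm(accept ~ vintage,
                family = binomial(link = "logit"),
                data = dat)

print(fit)                            # posterior medians and MAD SDs
posterior_interval(fit, prob = 0.95)  # 95% credible intervals (log-odds scale)

## Exponentiate the posterior draws to summarize the coefficients as
## odds ratios (the intercept column gives baseline odds, not a ratio).
or_draws <- exp(as.matrix(fit))
t(apply(or_draws, 2, quantile, probs = c(0.025, 0.5, 0.975)))
```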