Thank you very much for your input, @Bob_Carpenter !
I truly appreciate you taking the time to educate someone like me who is trying to learn and use Bayesian modeling but is having difficulty discussing and asking for help.
I sincerely apologize for my delayed response. I’ve been reading the two books you mentioned, ‘BDA3’ and ‘Statistical Rethinking,’ trying to better understand your response. (Honestly, BDA3 was too advanced for my current knowledge level, although many sections seemed very helpful. I plan to revisit the book in the future to deepen my understanding. I am still working through Statistical Rethinking.)
This question made me think more deeply about what puzzles me. My paper was once rejected by a reviewer who commented that the manuscript lacked p-values (among other comments). The Bayesian approach isn’t common in my field.
Also, studies in my field typically include multiple predictors, such as ‘intervention status,’ ‘test timing,’ ‘testing formats,’ ‘age,’ ‘gender,’ ‘background,’ ‘context,’ and generally the interactions between these predictors. I’ve been unconsciously thinking, “It would be easier and more straightforward to say, ‘The intervention significantly enhanced posttest scores, and there was a significant interaction between intervention and gender. Meanwhile, no effects were observed for test timing and formats.’”
Given this multiple-predictor practice, I find it challenging to thoroughly discuss the posterior distribution within space constraints. I would appreciate learning any better strategies for handling such situations.
Thank you for your clear suggestion!
Following your suggestion, would my report on the estimated coefficient of ‘intervention’ look something like this:
The estimated Odds Ratio for ‘intervention’ was 1.20, 95% CrI [0.90, 1.50], 90% CrI [1.01, 1.41], 50% CrI [1.10, 1.30].
The estimated OR for ‘gender’ (reference = Male) was 0.90, 95% CrI [0.40, 2.50], 90% CrI [0.90, 1.80], 50% CrI [0.85, 0.95].
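To make sure I understand where such a summary comes from, here is a minimal sketch of how I imagine the nested intervals are computed. It assumes simulated draws standing in for real posterior draws of a log-odds coefficient; the names `log_or_draws` and `central_interval` are hypothetical and not from any fitted model.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Hypothetical posterior draws of a log-odds coefficient (simulated here;
# in practice these would be extracted from the fitted model).
log_or_draws = rng.normal(loc=np.log(1.2), scale=0.1, size=4000)
or_draws = np.exp(log_or_draws)  # transform each draw to the odds-ratio scale


def central_interval(draws, level):
    """Central (equal-tailed) credible interval at the given level, e.g. 0.90."""
    tail = (1.0 - level) / 2.0
    lower, upper = np.quantile(draws, [tail, 1.0 - tail])
    return float(lower), float(upper)


# All three intervals summarize the same set of draws.
summary = {level: central_interval(or_draws, level) for level in (0.95, 0.90, 0.50)}

print(f"posterior median OR = {np.median(or_draws):.2f}")
for level, (lower, upper) in summary.items():
    print(f"{level:.0%} CrI [{lower:.2f}, {upper:.2f}]")
```

Because every interval is a pair of quantiles of the same draws, the 50% interval always sits inside the 90% interval, which sits inside the 95% interval.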
In this case, can I interpret these results as follows: the intervention increased the odds by a factor of 1.2 (compared to no intervention), and we are 90% confident that the effect is meaningfully positive because we are 90% confident that the OR exceeds 1?
Regarding gender, females scored 10% less accurately than males, and we are 50% confident that females’ mean test scores were lower than males’. However, the 90% CrI extends above 1, indicating that there is a good chance that males are in fact less accurate than females.
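A statement like “we are X% confident that the OR exceeds 1” could presumably also be checked directly against the posterior draws rather than read off an interval. Here is a minimal sketch, again assuming simulated draws in place of a real posterior (the variable names are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(seed=2)

# Hypothetical posterior draws of an odds ratio (simulated for illustration).
or_draws = np.exp(rng.normal(loc=np.log(1.2), scale=0.1, size=4000))

# Posterior probability that the odds ratio exceeds 1,
# estimated as the fraction of draws above 1.
prob_or_above_1 = float(np.mean(or_draws > 1.0))

# The lower end of the central 90% interval is the 5% quantile,
# so 95% (not 90%) of the posterior mass lies above it.
lower_90 = float(np.quantile(or_draws, 0.05))

print(f"P(OR > 1 | data) = {prob_or_above_1:.3f}")
print(f"5% quantile (lower end of 90% CrI) = {lower_90:.2f}")
```

If I read this correctly, a central 90% interval that lies entirely above 1 corresponds to a posterior probability of at least 95% that the OR exceeds 1, since only 5% of the mass sits below the lower bound.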
I’m not entirely confident about my wording or clarity, but am I on the right track?
How would you interpret the above results and describe your interpretation?
When studying frequentist statistics, I learned that the alpha level should be set before the study and shouldn’t be changed after obtaining results (hence the conventional p < 0.05). When evaluating predictors’ usefulness based on various credible intervals (e.g., 50%, 80%, 90%, 99%), I worry that I’m changing the judgment criteria, which feels incorrect. Given the interpretation of Bayesian credible intervals, would this approach be acceptable?
Additionally, in BDA3, posterior distributions often seemed to be summarized with 50% and 95% credible intervals. For example, the caption of Figure 16.6 (“Anova display for two logistic regression models of the probability that a survey respondent prefers the Republican candidate for the 1988 U.S. presidential election, based on data from seven CBS News polls”) mentions “50% intervals, and 95% intervals of the finite-population standard deviations s_m.” However, the authors interpreted the figure rather casually instead of going through each variable individually. Is this something you would recommend? Or is this just a casual explanation for the sake of illustration in the book, and would you recommend something more detailed?