I’m confused about different approaches people use for Bayesian model selection.
As I understand it, the frequentist approach is generally to fit the most complex model first, with all predictors and their hypothesized interactions, then fit subsequent models that each remove one term, using something like a likelihood ratio test to compare models and select the simplest one that does not significantly reduce model fit.
I’ve seen numerous papers using Bayesian models that do this the opposite way: each predictor is first fitted on its own, and only the significant predictors (those whose credible intervals don’t include 0) are carried into the next round.
I’m struggling to find information about what the benefit of starting simple with individual predictors is, other than it being easier to get models to converge. Wouldn’t you risk throwing away important predictors that might be non-significant on their own, but be significant when included as an interaction with another variable? Am I missing something, or if I’m able to get my most complex model to converge, would it be preferable to use the method of starting complex and dropping terms?
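To make the concern about interactions concrete, here is a minimal simulated sketch. It is not taken from any of the papers I mentioned. Everything in it is an assumption for illustration: the data are made up, the predictors `x1` and `x2` are centered and independent, the outcome depends on them only through their product, and ordinary least squares (via a hypothetical helper `ols`) stands in for a full Bayesian fit to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Made-up data: x1 and x2 are centered, independent predictors, and the
# outcome depends on them only through their product (a pure interaction).
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 * x1 * x2 + rng.normal(scale=0.5, size=n)


def ols(columns, y):
    """Least-squares coefficients and standard errors (intercept first)."""
    X = np.column_stack([np.ones(len(y))] + list(columns))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, se


# Step 1 of the "start simple" strategy: x1 on its own.
beta_x1, se_x1 = ols([x1], y)

# The full model: both main effects plus their interaction.
beta_full, se_full = ols([x1, x2, x1 * x2], y)

print("x1 alone:   slope       = %.3f (SE %.3f)" % (beta_x1[1], se_x1[1]))
print("full model: interaction = %.3f (SE %.3f)" % (beta_full[3], se_full[3]))
```

In this setup the slope for `x1` on its own should sit close to zero, so a screening step based on single-predictor models would likely drop it, while the full model should recover an interaction coefficient close to the true value of 1. This is a deliberately extreme case, but it is the situation I am worried about.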