New paper: To select or not to select: predictively consistent priors instead of model selection

New preprint: "To select or not to select: predictively consistent priors instead of model selection", with @annariha, Leevi Lindgren, @davkoh, @paul.buerkner, and me. arXiv.2606.22850

tl;dr: Model selection is not a substitute for building good models in the first place.

Abstract: Bayesian modelling workflows often consider multiple candidate models of varying complexity. Model selection is commonly used to navigate potential trade-offs between model complexity and generalisability to new data. We study when model selection is unnecessary or can even be harmful for predictive performance in finite data regimes and find that the need for selecting simpler models can depend on prior choice. We formalise predictively consistent priors, which keep prior predictive implications stable as model complexity increases. Across examples and numerical experiments, including adding covariates in linear and logistic regression, forward variable selection, and nonlinear modelling, flexible models with predictively consistent priors typically match or outperform selected simpler models in out-of-sample predictive performance. When selection helps, it can indicate poor joint prior implications, such as excessive prior mass on implausible predictive values. Based on our findings, we propose replacing the notion of sparsity or parsimony at the level of model components with specifying priors that remain sensible in predictive space as models become more complex.
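To give a rough feel for what "prior predictive implications stable as model complexity increases" can mean, here is a minimal sketch. It is not code from the paper, and it makes simplifying assumptions of its own: standardised, independent Gaussian covariates, independent zero-mean Gaussian priors on the coefficients, and a hypothetical helper `prior_predictor_sd` written just for this post. With a fixed prior sd per coefficient, the prior sd of the linear predictor grows like sqrt(p); scaling the coefficient sd by 1/sqrt(p) is one simple way to keep it roughly constant as covariates are added.

```python
import random
import statistics


def prior_predictor_sd(p, coef_sd, n_draws=2000, seed=1):
    """Monte Carlo estimate of the prior sd of the linear predictor x^T beta.

    Assumptions (illustrative only): x_j ~ N(0, 1) independently
    (standardised covariates) and beta_j ~ N(0, coef_sd^2) independently.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        # One covariate vector and one prior draw of the coefficients.
        f = sum(rng.gauss(0.0, 1.0) * rng.gauss(0.0, coef_sd) for _ in range(p))
        draws.append(f)
    return statistics.pstdev(draws)


def compare(ps=(1, 10, 100), total_sd=1.0):
    """Compare a fixed per-coefficient prior with one scaled by 1/sqrt(p)."""
    rows = []
    for p in ps:
        fixed = prior_predictor_sd(p, coef_sd=1.0)
        scaled = prior_predictor_sd(p, coef_sd=total_sd / p ** 0.5)
        rows.append((p, fixed, scaled))
    return rows


if __name__ == "__main__":
    for p, fixed, scaled in compare():
        print(f"p={p:4d}  fixed prior: sd={fixed:6.2f}  scaled prior: sd={scaled:6.2f}")
```

Under these assumptions the variance of the linear predictor is p times the coefficient variance, so the "fixed" column should grow with p while the "scaled" column stays near `total_sd`. The paper's formal definition and the priors it studies are more general than this toy scaling.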

These ideas have been around for a while, but there was no single accessible paper to refer to that explains and illustrates some important aspects of model selection. Sure, model selection can reduce overfitting, but it is even better to use big models with predictively consistent priors.

This is a long (76 pages), slow-science paper. I was already showing variants of some of the plots in my talks years ago, but polishing the explanations and adding more theory took a long time. Anna, Leevi, David, and Paul all did great work on this.
