Projpred interpretation

ViktorVdV · May 15, 2026, 9:20am

Hi all,

I am using projpred to reduce my predictor set. One of my colleagues raised a concern that I would like to get some expert opinion on.

The concern is about correlated predictors and the risk of over-interpreting selected variables mechanistically or causally. In a previous PCA-based workflow, shared information among correlated predictors was distributed across components, which naturally encouraged cautious interpretation. In contrast, projection predictive selection returns a sparse model with individual predictors, which can create the impression that the selected variable is uniquely important, even when several correlated predictors may carry largely interchangeable predictive information.

More specifically, if two predictors are strongly correlated, the fact that projpred selects one over the other could partly reflect the forward search procedure and predictive redundancy, rather than evidence for a uniquely causal or mechanistic role. A reviewer could argue that replacing the selected predictor with a correlated alternative might yield very similar predictive performance while suggesting a different interpretation.
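The substitution argument can be checked on simulated data. The following is a minimal sketch in plain Python, not projpred: it uses ordinary least squares with a single predictor, and the variable names, sample size, and noise levels are all assumptions chosen for illustration. The response is generated from `x1` only, yet the nearly collinear `x2` predicts it almost equally well.

```python
import random


def r_squared(x, y):
    """R^2 of a simple least-squares regression of y on x (with intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


rng = random.Random(1)  # fixed seed so the run is reproducible
n = 2000

x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [a + rng.gauss(0, 0.1) for a in x1]  # x2 is almost a copy of x1
y = [a + rng.gauss(0, 1) for a in x1]     # y depends on x1 only

r2_x1 = r_squared(x1, y)
r2_x2 = r_squared(x2, y)

print(f"R^2 using x1 only: {r2_x1:.3f}")
print(f"R^2 using x2 only: {r2_x2:.3f}")
```

The two values should be close, which is the point of the reviewer's objection: predictive performance alone cannot separate the predictor that generated the data from its correlated stand-in.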

Therefore, I am interested in how users of projpred justify the interpretation of selected predictors in the presence of substantial collinearity, and whether there are recommended ways to frame or supplement the analysis to avoid overstatement. I do start from a broader predictor set in which every predictor may have a mechanistic link to the response.

Thanks in advance for any pointers!

fweber144 · May 15, 2026, 8:32pm

Hi @ViktorVdV,

Projection predictive feature selection was developed for prediction tasks (see, e.g., Vehtari and Ojanen, 2012, and Piironen et al., 2020). It therefore addresses the minimal subset feature selection problem, not the complete feature selection problem (this is nicely explained in Piironen et al., 2020, but also in Pavone et al., 2022, for example). In other words, projpred does not try to find all predictors related to the outcome, but only the smallest subset of predictors whose predictive performance is still as good as possible.
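To illustrate what a minimal subset search does with redundant predictors, here is a rough sketch in plain Python. It is not projpred's algorithm: it swaps in a greedy forward search scored by in-sample R^2 from ordinary least squares, with no projection and no cross-validation. The predictors `x1`, `x2`, `x3`, the coefficients, and the helper `fit_r2` are hypothetical and exist only for this example.

```python
import random


def fit_r2(cols, y):
    """In-sample R^2 of a least-squares fit of y on the given columns plus an intercept."""
    n = len(y)
    X = [[1.0] + [c[i] for c in cols] for i in range(n)]
    p = len(X[0])
    # Augmented normal equations [X'X | X'y]
    A = [[sum(X[i][j] * X[i][k] for i in range(n)) for k in range(p)]
         + [sum(X[i][j] * y[i] for i in range(n))] for j in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[j][p] / A[j][j] for j in range(p)]
    mean_y = sum(y) / n
    ss_res = sum((y[i] - sum(b * v for b, v in zip(beta, X[i]))) ** 2 for i in range(n))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - ss_res / ss_tot


rng = random.Random(7)  # fixed seed so the run is reproducible
n = 1000
x1 = [rng.gauss(0, 1) for _ in range(n)]
preds = {
    "x1": x1,
    "x2": [a + rng.gauss(0, 0.1) for a in x1],  # nearly collinear with x1
    "x3": [rng.gauss(0, 1) for _ in range(n)],  # independent of x1 and x2
}
# y is generated from x1 and x3 only
y = [a + 0.5 * b + rng.gauss(0, 1) for a, b in zip(preds["x1"], preds["x3"])]

selected, path = [], []
remaining = list(preds)
while remaining:
    # Add the predictor that improves the fit the most at this step
    best = max(remaining,
               key=lambda name: fit_r2([preds[s] for s in selected + [name]], y))
    selected.append(best)
    remaining.remove(best)
    path.append((best, fit_r2([preds[s] for s in selected], y)))

for name, r2 in path:
    print(f"added {name}: R^2 = {r2:.3f}")
```

In this setup the search should pick one of `x1` and `x2` first, then `x3`, and gain almost nothing from the remaining collinear predictor. Which of the two correlated predictors comes first can depend on the noise in the sample, so a search path of this kind identifies a small, well-predicting subset rather than the predictors that "truly" matter.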

Does that help?

ViktorVdV · May 16, 2026, 11:42am

Thank you, this is indeed very helpful.
