Very basic projection predictive variable selection question


I’m a little confused about what, exactly, “projection” is. Is it (a) that the coefficients from the reference model are reused in the smaller model, and then the Kullback-Leibler (KL) divergence between the predictions of the reference and smaller model is calculated, or (b) that the coefficients of the smaller model are solved for so as to minimize the KL divergence between the predictions of the reference and smaller model?

Furthermore, if I understand correctly, for a given model size the chosen variable set is the one that minimizes the KL divergence. But what happens if two variable sets yield very similar KL values (e.g., with a three-variable reference model, y ~ A + B + C, it is possible that y ~ A + B and y ~ A + C have similar KL divergences)? Is there any way to see how the different variable sets rank against each other?

Thank you for answering, and thank you to everyone who has been developing the packages and methods.


Welcome to The Stan Forums, Richard!

It’s (b).
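To make (b) concrete: for a Gaussian model, minimizing the KL divergence between the reference model’s predictive distribution and the submodel’s reduces to an ordinary least-squares regression of the reference model’s fitted values on the submodel’s predictors. Here is a minimal toy sketch in plain Python (all numbers invented for illustration; this is the idea, not projpred’s actual implementation):

```python
# Toy illustration: projecting a 3-predictor "reference fit" onto a
# 2-predictor submodel (y ~ A + B). For Gaussian models, the
# KL-minimizing projection (option (b)) is ordinary least squares of
# the reference model's fitted values on the submodel's design matrix.
# All numbers here are made up for illustration.

A = [1.0, 2.0, 3.0, 4.0, 5.0]
B = [2.0, 1.0, 4.0, 3.0, 5.0]
C = [0.5, 1.5, 1.0, 2.0, 2.5]

# Hypothetical reference-model coefficients (no intercept, for brevity).
beta_ref = {"A": 1.0, "B": 0.5, "C": -2.0}

# Reference model's fitted values: mu = 1.0*A + 0.5*B - 2.0*C.
mu = [beta_ref["A"] * a + beta_ref["B"] * b + beta_ref["C"] * c
      for a, b, c in zip(A, B, C)]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

# Solve the 2x2 normal equations X'X beta = X'mu for the submodel y ~ A + B.
saa, sab, sbb = dot(A, A), dot(A, B), dot(B, B)
sam, sbm = dot(A, mu), dot(B, mu)
det = saa * sbb - sab * sab
beta_A = (sbb * sam - sab * sbm) / det
beta_B = (saa * sbm - sab * sam) / det

# Option (a) would simply reuse beta_ref["A"] = 1.0 and beta_ref["B"] = 0.5.
# The projected coefficients differ because they absorb C's contribution:
print(beta_A, beta_B)  # beta_A ≈ -0.481, beta_B ≈ 1.019
```

The point of the sketch: the projected submodel coefficients are not the reference coefficients with C dropped; they are whatever values make the submodel’s predictions closest (in KL) to the reference model’s predictions.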

The one with the smaller KL divergence is chosen. More specifically, which.min() is used, so in the (unlikely) case that both candidate models have identical KL divergence, the first one (according to the order from this object) is used. Note that all of this applies only to the forward search, not to the L1 search.
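The forward-search bookkeeping described above can be sketched as follows (plain Python, with invented KL values standing in for the real projections; picking the first index of the minimum mirrors which.min()’s tie-breaking in R):

```python
# Sketch of forward search with first-on-tie selection, mirroring
# R's which.min(). The KL values below are invented for illustration.

fake_kl = {
    frozenset(): 10.0,
    frozenset({"A"}): 4.0,
    frozenset({"B"}): 6.0,
    frozenset({"C"}): 4.0,          # ties with {"A"} at size 1
    frozenset({"A", "B"}): 1.0,
    frozenset({"A", "C"}): 1.5,
    frozenset({"A", "B", "C"}): 0.0,
}

def forward_search(terms):
    selected = []
    while len(selected) < len(terms):
        candidates = [t for t in terms if t not in selected]
        kls = [fake_kl[frozenset(selected + [t])] for t in candidates]
        # Index of the FIRST minimum, like which.min() in R.
        best = kls.index(min(kls))
        selected.append(candidates[best])
    return selected

path = forward_search(["A", "B", "C"])
print(path)  # ['A', 'B', 'C'] — "A" wins the size-1 tie because it comes first
```

Even though {"A"} and {"C"} have identical (fake) KL values at size 1, "A" is selected because it appears first in the candidate order.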

Unfortunately not, at least not currently. But I’ll note this down as a possible future extension. If you have time, a corresponding feature request on projpred’s issue tracker would be welcome.


This is a very helpful reply. Thank you.

When doing cross-validation for the search paths, we can get different search paths depending on the cross-validation fold. These different search paths are available in the projpred object, and we have used them to visualise similar sets of variables; see, e.g., Markus Paasiniemi’s MSc thesis, Methods and Tools for Interpretable Bayesian Variable Selection. Currently there is no good helper function, and since the object structure has changed, the old code examples may not work, but you can still get that search-path variability information if you want.


You’re right, I didn’t think about cross-validated search in this context. Even though it won’t give you, @RichardF, the KL divergences of all candidate models at a given model size, it can nevertheless be used as a means of judging their importance. For this, you need to use the element pct_solution_terms_cv of an object returned by cv_varsel(). As @avehtari mentioned, this element definitely needs some improvements in terms of the user interface (see projpred issue #289). Currently, it is kind of a “hidden” feature.
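As a rough illustration of what that element conveys, here is a plain-Python sketch (with invented per-fold search paths, not projpred’s actual data structures) computing, for each model size, the percentage of CV folds whose solution path includes each term at or before that size:

```python
# Illustration of the idea behind pct_solution_terms_cv: across CV
# folds, how often does each term appear in the solution path at or
# before a given model size? The fold paths below are invented.

fold_paths = [
    ["A", "B", "C"],
    ["A", "C", "B"],
    ["A", "B", "C"],
    ["C", "A", "B"],
]

terms = ["A", "B", "C"]
n_folds = len(fold_paths)

pct_by_size = {}
for size in range(1, len(terms) + 1):
    pct_by_size[size] = {
        t: 100.0 * sum(t in path[:size] for path in fold_paths) / n_folds
        for t in terms
    }

print(pct_by_size[1])  # {'A': 75.0, 'B': 0.0, 'C': 25.0}
print(pct_by_size[2])  # {'A': 100.0, 'B': 50.0, 'C': 50.0}
```

A term that shows up early in most folds (like "A" here) is consistently important, while terms that trade places across folds (like "B" and "C" at size 2) are roughly interchangeable — exactly the situation from the original question.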