Feature selection via Stan

I’m not sure whether my topic fits into the “Modeling” category.
Coming from a frequentist background, I want to shift to Bayesian data analysis.

I have about 20 predictors (dichotomous, categorical, numeric) and one binary outcome variable (problem present / problem absent).

In this matter, I am looking for an adequate Bayesian framework to do feature selection before fitting a final model with the most relevant predictors. Am I even on the right track thinking this way?

Is such an approach implemented in rstan? Can you share some references one should read on Bayesian feature selection?

Best wishes and thanks in advance
Jens

This is one of those topics that has many answers, including “don’t do that”. The procedure you are suggesting, where you drop the less relevant predictors and refit the model on the same data set, is likely to exaggerate the effects of the predictors you keep. If you’re just doing prediction and don’t care about why it works, that might be fine, and comparing predictive error would be the way to go.
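For concreteness, here is a minimal sketch of comparing predictive error via approximate leave-one-out cross-validation with the loo package; the data frame `d`, outcome `y`, and predictor names are placeholders, not from the original post:

```r
library(rstanarm)
library(loo)

# Two candidate logistic regressions (hypothetical data frame `d`,
# binary outcome `y`, placeholder predictors x1 and x2)
fit_full  <- stan_glm(y ~ ., data = d, family = binomial())
fit_small <- stan_glm(y ~ x1 + x2, data = d, family = binomial())

# Approximate leave-one-out cross-validation for each fit
loo_full  <- loo(fit_full)
loo_small <- loo(fit_small)

# Differences in expected log predictive density (elpd) with standard
# errors; the model in the first row has the best estimated elpd
loo_compare(loo_full, loo_small)
```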

Or you could try some of the LASSO/horseshoe priors for your effects; Aki has a nice paper (http://ceur-ws.org/Vol-1218/bmaw2014_paper_8.pdf) about that. I’m not sure if rstanarm has that kind of model yet (?), but there are a few implementations floating around the users list.
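As a rough sketch of what that looks like in brms (assuming the same placeholder data frame `d` and binary outcome `y` as above):

```r
library(brms)

# Logistic regression with a horseshoe prior on all coefficients;
# set_prior("horseshoe(1)") puts the horseshoe on the "b" class
fit_hs <- brm(y ~ ., data = d, family = bernoulli(),
              prior = set_prior("horseshoe(1)", class = "b"))
summary(fit_hs)
```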


Thanks for the reply and for the paper.

A pretty general question, but: should I be worried about overfitting in the context of Bayesian analysis when adding too many predictors to the model?

My general advice is that you should only add predictors you have reason to believe will matter. If you’re adding a lot of predictors you’re not sure about, you’ll need some outside data to confirm what you find anyway, so just enjoy the fishing expedition for what it is. Most studies I can think of in that context are things like GWAS, where after you get your genes somebody is going to pick a bunch of them to investigate further, so what you really need is a ranking, not a cutoff.


Hi,

Should I be worried about overfitting in the context of Bayesian analysis when adding too many predictors to the model?

You can get overfitting if your model is badly misspecified, e.g. using a thin-tailed observation model when the data distribution is thick-tailed, or using a bad prior for the predictor weights. But if you use good models and priors, then there is no such thing as too many predictors (although there are computational limits).

See the following paper (and the references therein) on how to set a prior when you have many more predictors than observations:

Juho Piironen and Aki Vehtari (2017). On the Hyperprior Choice for the Global Shrinkage Parameter in the Horseshoe Prior. Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, PMLR 54:905-913.

The paper has Stan code, and the rstanarm and brms packages have support for easily defining these priors.
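For example, here is a minimal sketch using rstanarm’s hs() prior, with the global scale set from a prior guess p0 of the number of relevant predictors, as recommended in the paper (the data frame `d` and outcome `y` are placeholders):

```r
library(rstanarm)

p  <- 20       # total number of predictors
p0 <- 5        # prior guess for the number of relevant predictors
n  <- nrow(d)  # number of observations
tau0 <- p0 / (p - p0) / sqrt(n)  # global scale suggested by the paper

# Logistic regression with a regularized horseshoe prior on the coefficients
fit_rhs <- stan_glm(y ~ ., data = d, family = binomial(),
                    prior = hs(df = 1, global_df = 1, global_scale = tau0))
```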

I am looking for an adequate Bayesian framework to do feature selection before fitting a final model with the most relevant predictors.

See the following paper, which illustrates what happens if you “do feature selection before fitting a final model with the most relevant predictors.” The paper also describes the projection predictive approach, which uses decision theory to do the correct thing: it is able to do the selection while retaining the important part of the information in the full model.

Juho Piironen and Aki Vehtari (2017). Comparison of Bayesian predictive methods for model selection. Statistics and Computing, 27(3):711-735. doi:10.1007/s11222-016-9649-y.
The code is available at https://github.com/stan-dev/projpred (projection predictive variable selection).
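A minimal usage sketch (function names as in the projpred package; `fit_rhs` is the hypothetical regularized-horseshoe reference model from the sketch above):

```r
library(projpred)

vs <- cv_varsel(fit_rhs)       # cross-validated search over submodels
plot(vs, stats = "elpd")       # predictive performance vs. submodel size
nsel <- suggest_size(vs)       # suggested number of terms to keep
# Project the reference model's posterior onto the selected submodel
proj <- project(vs, nterms = nsel)
```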

Aki


Thanks a lot for this input! :-)

I would also like to add that the brms package is really helpful!