Help with lasso example in brms

#1

I normally use `glmnet` for variable selection, following the tutorial here. The brms documentation says that there is a `lasso` prior, but I am struggling to get a working example. I get this error:

Error: Defining priors for single population-level parameters is not allowed when using horseshoe or lasso priors (except for the Intercept).

Could someone show a simple working example of variable selection using `lasso` with `brms`?

#2

Please provide the code you want to get working. Also, I suggest using the `horseshoe` prior rather than `lasso`, since the former provides much better shrinkage.
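For reference, a horseshoe prior is set the same way as any other coefficient-level prior in `brms`. A minimal sketch (`mydata` and `y` are placeholders, not from the thread):

```r
library(brms)

# Minimal sketch: put a horseshoe prior on all population-level
# ("b") coefficients at once. With horseshoe/lasso priors, brms does
# not allow priors on individual coefficients, only on class = "b".
fit <- brm(
  y ~ .,
  data  = mydata,
  prior = prior(horseshoe(), class = "b")
)
summary(fit)
```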

#3

This is a data set from "An Introduction to Statistical Learning". It should be reproducible and relevant to variable selection via lasso.

```r
library(ISLR)
library(tidyverse)
library(brms)

hitters <- Hitters %>% na.omit()

for_lasso <- brm(Salary ~ ., data = hitters)

summary(for_lasso)
```

#5

You can set a lasso prior as follows:

`for_lasso <- brm(Salary ~ ., data = hitters, prior = prior(lasso(), class = "b"))`
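The error in the original post comes from trying to set priors on individual coefficients, which brms forbids when using `lasso()` or `horseshoe()`. You can see which prior slots a model has with `get_prior()`. A sketch, assuming `hitters` is defined as above:

```r
library(ISLR)
library(tidyverse)
library(brms)

hitters <- Hitters %>% na.omit()

# List every prior slot brms recognises for this model. With lasso()
# or horseshoe(), only the overall class = "b" slot (and the
# Intercept) may be filled, not the per-coefficient rows.
get_prior(Salary ~ ., data = hitters)
```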

#6

How can I be more or less aggressive about setting coefficients equal to 0? I assumed `df` was the argument for this, but maybe I am wrong. None of the following models set any covariates to 0.

Here is a more complete example:

```r
library(ISLR)
library(tidyverse)
library(brms)

# define a function to scale variables
my_scale <- function(...) as.numeric(scale(...))

hitters <- Hitters %>%
  na.omit() %>%
  # remove non-numeric columns before scaling
  select(-NewLeague, -League, -Division) %>%
  # scale all remaining columns
  mutate_all(my_scale)

for_lasso1 <- brm(Salary ~ ., data = hitters,
                  prior = prior(lasso(df = 1), class = "b"),
                  iter = 500, chains = 3)
for_lasso2 <- brm(Salary ~ ., data = hitters,
                  prior = prior(lasso(df = 10), class = "b"),
                  iter = 500, chains = 3)
for_lasso3 <- brm(Salary ~ ., data = hitters,
                  prior = prior(lasso(df = 100), class = "b"),
                  iter = 500, chains = 3)

summary(for_lasso1)
summary(for_lasso2)
summary(for_lasso3)
```

#7

That's because you are in a Bayesian framework: there is no absolute shrinkage to exactly zero. See the paper about the Bayesian lasso cited in the documentation of `?lasso`.

In fact, the lasso prior is a poor shrinkage prior. I suggest using the horseshoe prior instead.

#8

This is the code with horseshoe priors. After glancing at the paper, it seems the Bayesian lasso is a compromise between the lasso and ridge, but as you mentioned the coefficients don't shrink to 0. The paper also uses the double-exponential prior.

What is the justification of the horseshoe prior?

Also, is it true that the smaller the `df` the more regularization with `df = 1` being the most regularized?

```r
for_lasso1 <- brm(Salary ~ ., data = hitters,
                  prior = prior(horseshoe(df = 1), class = "b"),
                  iter = 500, chains = 3)
for_lasso2 <- brm(Salary ~ ., data = hitters,
                  prior = prior(horseshoe(df = 10), class = "b"),
                  iter = 500, chains = 3)
for_lasso3 <- brm(Salary ~ ., data = hitters,
                  prior = prior(horseshoe(df = 100), class = "b"),
                  iter = 500, chains = 3)

summary(for_lasso1)
summary(for_lasso2)
summary(for_lasso3)
```

#9

https://projecteuclid.org/euclid.ejs/1513306866

I don't think so. I would say the regularization is mostly governed by the expected number of non-zero coefficients. Still, you are not going to obtain exact zeros, although you can use the ideas in the projpred package to obtain a model with fewer coefficients that is expected to predict future data about as well.
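That expectation can be encoded directly: the regularized horseshoe in brms takes a `par_ratio` argument (the assumed ratio of non-zero to zero coefficients), and projpred can then search for a smaller submodel. A hedged sketch (the 3-out-of-19 guess is purely illustrative, and projpred function names may differ across versions):

```r
library(ISLR)
library(tidyverse)
library(brms)
library(projpred)

hitters <- Hitters %>% na.omit()

# Regularized horseshoe: par_ratio encodes the expected ratio of
# non-zero to zero coefficients. Here we assume roughly 3 relevant
# predictors out of ~19 -- an illustrative assumption only.
fit <- brm(Salary ~ ., data = hitters,
           prior = prior(horseshoe(par_ratio = 3 / 16), class = "b"))

# Projection predictive variable selection: search for a submodel
# expected to predict future data about as well as the full model.
vs <- cv_varsel(fit)
suggest_size(vs)  # suggested number of terms to keep
```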

#10

I'll add to Ben's post: for getting coefficients equal to 0, see http://link.springer.com/article/10.1007/s11222-016-9649-y, and for several examples and a video on projpred, see https://github.com/avehtari/modelselection_tutorial

#11

See also Betancourt's case study comparing the "lasso" prior and the horseshoe: https://betanalpha.github.io/assets/case_studies/bayes_sparse_regression.html