Variable selection (varsel/cv_varsel) in models that include spline functions

Dear all,

Relationships between some of the continuous predictors and the binary outcome in my data are known (or suspected) to be nonlinear. So far, I have been using restricted cubic splines (function rcs() from the ‘rms’ package), natural splines (function ns() from ‘splines’), or other spline or polynomial functions to relax the linearity assumption and allow consideration of possible nonlinear shapes of these relationships.

When using rstanarm’s function stan_glm(), models including rcs() spline terms seem to work well (regardless of whether I specify exact knot positions or just the number of knots). So, for instance, models specified as follows seem to run okay:

model <- stan_glm(outcome ~ sex + age + rcs(age, 4) + … rcs(speed, 4), data = data, family = binomial(link = "logit"), prior = hs_prior, seed = 12345)
model <- stan_glm(outcome ~ sex + age + rcs(age, c(5, 10, 15, 20)) + … rcs(speed, c(5, 10, 15, 20)), data = data, family = binomial(link = "logit"), prior = hs_prior, seed = 12345)

However, when I subsequently try to perform projection predictive variable selection using the varsel() and cv_varsel() functions, I receive the following error message, and the variable selection does not run.

model_varsel <- varsel(model)
Error in out$smooth.spec[[1]] : subscript out of bounds
model_cv_varsel <- cv_varsel(model, method = "forward", cv_method = "LOO", nloo = n)
Error in out$smooth.spec[[1]] : subscript out of bounds

What does this error message mean? And, most importantly, is there any way I can make varsel()/cv_varsel() work with models specified as above?

I use R version 4.1.0 on macOS Catalina 10.15.7. Package versions (possibly relevant to my enquiry): bayesplot (1.8.0); brms (2.15.0); Hmisc (4.5-0); loo (2.4.1); projpred (2.0.2); rms (6.2-0); rstan (2.21.2); rstanarm (2.21.2); rstantools (2.1.1); StanHeaders (2.21.0-7).

Thank you very much for your advice.

Best wishes,

Tom

Hi Tomas, welcome to the Stan Forums!

I don’t think projpred supports rms::rcs() or splines::ns() spline terms. At least according to this line, it seems like projpred only supports mgcv::s() and mgcv::t2() spline terms.
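For illustration, here is a minimal sketch of what switching to mgcv-style smooths might look like. It rests on several assumptions that are not from the posts above: the data are simulated (only the column names come from the question), the basis dimension k = 4 is arbitrary, default priors are used instead of hs_prior, the helper fit_and_select() is hypothetical, and the installed projpred version is assumed to accept stan_gamm4() fits. It has not been checked against the poster's data.

```r
# Simulated stand-in for the poster's data. Column names follow the post;
# the values are made up purely for illustration.
set.seed(12345)
n <- 200
sim_data <- data.frame(
  sex   = factor(sample(c("f", "m"), n, replace = TRUE)),
  age   = runif(n, 5, 20),
  speed = runif(n, 5, 20)
)

# Binary outcome with a nonlinear effect of age on the log-odds.
eta <- -1 + 0.5 * (sim_data$sex == "m") + sin(sim_data$age / 3)
sim_data$outcome <- rbinom(n, size = 1, prob = plogis(eta))

# Smooth terms written with s() instead of rcs()/ns().
# k = 4 is an arbitrary basis dimension chosen for this sketch.
gam_formula <- outcome ~ sex + s(age, k = 4) + s(speed, k = 4)

# Hypothetical helper: fit the additive model with rstanarm, then run
# projpred's variable selection on the fitted reference model.
# Requires the rstanarm and projpred packages; sampling can take a while.
fit_and_select <- function(data, formula = gam_formula, seed = 12345) {
  fit <- rstanarm::stan_gamm4(
    formula,
    data   = data,
    family = binomial(link = "logit"),
    seed   = seed
  )
  projpred::varsel(fit)
}
```

Calling fit_and_select(sim_data) would then run the fit and the selection in one go. Note that s() gives a penalized smooth, so the fitted curves will not match restricted cubic splines from rcs() exactly.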

Hi Frank,

Thanks very much for your welcome and helpful reply! I will use some of the functions you suggest.

Best wishes,

Tom

Tagging @AlejandroCatalina so that he is aware of the issue. I assume we will not add support for these at the moment, but the documentation could mention the limitation.

Yes, indeed, we don’t support other spline terms at the moment. We’ll make sure to add some documentation about that.

Thanks for reporting!


Hi Aki and Alejandro,

Thanks for your messages. Also, many thanks for the great packages!

Best wishes,

Tom
