Projpred error

I fit a model with 144 predictors to 749 observations (N = 749) using rstanarm.

library(rstanarm)
library(projpred)

# Number of predictors (columns 15 to 158 of df3)
D <- ncol(df3[, 15:158])

# Prior guess for the number of relevant predictors
p0 <- 5
# Number of observations
N <- nrow(df3)

# Global scale for the horseshoe prior
tau0 <- p0 / (D - p0) * 1 / sqrt(N)

predictor_names <- colnames(df3)[15:158]
y <- df3$age_nmr
# Create the formula dynamically
formula <- reformulate(predictor_names, response = "y")

# Fit the reference model with the generated formula
refm_fit <- stan_glm(
  formula = formula,
  family = gaussian(),
  data = df3,
  prior = hs(global_scale = tau0),
  chains = 4, iter = 4000,
  QR = TRUE, refresh = 0
)

refm_obj <- get_refmodel(refm_fit)

cvvs_fast <- cv_varsel(
  refm_obj,
  validate_search = TRUE,
  method = "L1",
  refit_prj = FALSE,
  nterms_max = 144,
  verbose = TRUE
)

Everything ran fine up to this point, but the following call produced an error:

cv_fits <- run_cvfun(
  refm_obj,
  K = 10
)

Error in `[.data.frame`(d, , all_vars, drop = FALSE) : 
  undefined columns selected

I appreciate any guidance.

Hi Shabnam,

This sounds a bit like rstanarm issue #551, so I would try writing out the formula explicitly (it will be very long in your case), or at least avoid reformulate() and use dot notation instead (e.g., y ~ .). I’m not sure this will actually help, though, given that rstanarm issue. Alternatively, it could be something like the thread “Error message from projpred with binomial regression model fitted in rstanarm”, so do you have NAs in your dataset?
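A minimal sketch of the two checks mentioned above (dot notation and missing values), written in base R only. The toy data frame and its columns `id`, `x1`, and `x2` are made up for illustration and merely stand in for the real df3; only `age_nmr` and `predictor_names` come from the original post.

```r
# Toy stand-in for df3 (assumption: the real data has 144 predictors
# in columns 15:158; here there are just two).
set.seed(1)
df3 <- data.frame(
  id      = 1:20,        # a column that should NOT enter the model
  age_nmr = rnorm(20),   # response
  x1      = rnorm(20),
  x2      = rnorm(20)
)
predictor_names <- c("x1", "x2")

# Keep only the response and the predictors, so that `.` expands to exactly
# the intended predictors and the response is a real column of the data.
dat <- df3[, c("age_nmr", predictor_names)]

# Check for missing values, overall and per column.
any_missing <- anyNA(dat)
na_per_column <- colSums(is.na(dat))

# Dot-notation formula; terms() expands `.` against the columns of `dat`.
f <- age_nmr ~ .
expanded <- attr(terms(f, data = dat), "term.labels")

print(any_missing)
print(na_per_column)
print(expanded)
```

With the real data, `f` and `dat` could then be passed to stan_glm() in place of the reformulate()-based formula and df3. Whether that avoids the error is not confirmed anywhere in this thread.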

In any case, I currently think this is related to rstanarm, so using brms or something else should avoid this.


@fweber144 Hi, thanks for your response. There are no NAs in the data, and this is a Gaussian family, not binomial.

Thanks. Could you try with brms?

This does seem like it could be related to that rstanarm issue, which I now see is many years old and which I never saw (sorry @fweber144!). @ssalimi Are you on Windows? I’m pretty sure that particular rstanarm issue should only affect Windows users.


@jonah Yes, I use Windows. Shall I use brms instead?

It would at least be helpful to know if the issue disappears using brms (brms uses a different package to handle the parallelization). It could potentially be a different issue than the one @fweber144 pointed out, but I’m guessing he’s right. Either way, we need to fix that bug in rstanarm (although as someone who doesn’t use Windows I’m not exactly sure what the fix is yet). Sorry for the hassle!


@fweber144 Using brms solved the issue. Thanks for your guidance.
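The brms code that was actually used is not shown in the thread. As a rough sketch only, the switch might look like the following, assuming df3, predictor_names, and tau0 are defined as in the original post. The helper name `fit_brms_refmodel` is hypothetical, and the brms horseshoe prior is not guaranteed to be parameterized identically to rstanarm's hs().

```r
# Hypothetical helper (not from the thread): fit the reference model with
# brms instead of rstanarm. Uses the real response column name rather than
# a separate vector `y`, so the response is found in `dat` itself.
fit_brms_refmodel <- function(dat, response, predictor_names, tau0) {
  f <- reformulate(predictor_names, response = response)
  # Build the horseshoe prior as a string so the numeric value of tau0
  # is written into the prior specification.
  hs_prior <- brms::set_prior(
    paste0("horseshoe(scale_global = ", tau0, ")"),
    class = "b"
  )
  brms::brm(
    formula = f,
    data = dat,
    family = gaussian(),
    prior = hs_prior,
    chains = 4, iter = 4000,
    refresh = 0
  )
}

# Usage (not run here: needs brms, projpred, a Stan toolchain, and df3):
# refm_fit <- fit_brms_refmodel(df3, "age_nmr", predictor_names, tau0)
# refm_obj <- projpred::get_refmodel(refm_fit)
# cv_fits  <- projpred::run_cvfun(refm_obj, K = 10)
```

The remaining projpred calls (get_refmodel(), cv_varsel(), run_cvfun()) stay the same as in the rstanarm version.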
