Priors set up: Combine horseshoe prior with knowledge about noise in response variable

That is challenging data.

You need to use the init_refmodel() function. See the projpred documentation page "Reference model and more general information" (refmodel-init-get) and the full example code using spca from the paper "Using reference models in variable selection" (Computational Statistics):
https://github.com/fpavone/ref-approach-paper/blob/a15f821d76a05d6934b672865332a643a53ac8dd/code/minimal_subset.R

With that many variables you want to use either method='L1' or, if using method='forward', limit the search, e.g. with nterms_max=20. First try validate_search=FALSE, just to test how much time one search path takes (see more in "Robust and efficient projection predictive inference", arXiv:2306.15581). If that works, rerun with validate_search=TRUE, and possibly with cv_method='kfold', K=10.
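As a rough sketch of that two-step workflow: the simulated data, the small rstanarm fit with a horseshoe prior, and the object names (dat, fit, refm, vs_fast, cvvs) are only assumptions for illustration. With your own reference model you would pass the object returned by init_refmodel() in place of refm.

```r
library(rstanarm)
library(projpred)

# Toy data (assumption): 100 observations, 30 predictors, 3 of them relevant.
set.seed(1)
n <- 100
p <- 30
X <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("x", 1:p)))
dat <- data.frame(y = as.numeric(X[, 1:3] %*% c(1, -1, 0.5) + rnorm(n)), X)

# Stand-in reference model (assumption): a small horseshoe regression.
# Few iterations only to keep the sketch quick; use more for real work.
fit <- stan_glm(y ~ ., data = dat, family = gaussian(), prior = hs(),
                chains = 2, iter = 500, seed = 1, refresh = 0)
refm <- get_refmodel(fit)

# Step 1: one L1 search path without cross-validating the search,
# timed to see how expensive a single search is.
t_fast <- system.time(
  vs_fast <- cv_varsel(refm, method = "L1", nterms_max = 20,
                       validate_search = FALSE)
)
print(t_fast)

# Step 2: if that is fast enough, cross-validate the search as well,
# here with 10-fold CV (this refits the reference model K times).
cvvs <- cv_varsel(refm, method = "L1", nterms_max = 20,
                  validate_search = TRUE, cv_method = "kfold", K = 10,
                  seed = 1)
plot(cvvs, stats = "elpd")
suggest_size(cvvs)
```

The time of step 1 gives a rough lower bound for step 2, since with validate_search=TRUE the search is repeated in every fold.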

You may also try a simpler and faster approach, like loc.fdr combined with the reference model (see Section 4.2 in "Using reference models in variable selection", Computational Statistics).
