Variable selection with a StudentT(3,0,1) prior

I’m trying to estimate the contribution of genomics (with >100k variables) to a phenotype, where I put a horseshoe prior on the coefficients with global shrinkage parameter \tau and local shrinkage parameters \lambda. While I notice that StudentT(3,0,1) was suggested, my priors were set as follows:

\tau \sim N(0,1) \\ \lambda \sim \mathrm{StudentT}(3,0,1)

With half-Cauchy priors for the standard horseshoe, \kappa=1/(1+\lambda^2) \sim Beta(1/2,1/2) is commonly used as a criterion for variable selection. However, could anyone give me some suggestions on how I should decide which variables to select under my prior settings?
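For concreteness, here is a minimal sketch of how \kappa could be computed from posterior draws and thresholded. The Beta(1/2,1/2) result is specific to the half-Cauchy prior, but \kappa itself is still a well-defined function of \lambda under any other prior, so the same computation applies. The array names `lambda_draws` and `tau_draws`, the function names, and the 0.5 cutoff on 1-\kappa are assumptions for illustration, not something taken from the model above; including \tau in the formula also assumes unit noise scale and standardized predictors.

```python
import numpy as np

def shrinkage_factors(lambda_draws, tau_draws=None):
    """Per-draw shrinkage factors kappa = 1 / (1 + tau^2 * lambda^2).

    lambda_draws: array of shape (n_draws, n_vars), local scales (hypothetical input).
    tau_draws: optional array of shape (n_draws,), global scale; if omitted,
               kappa = 1 / (1 + lambda^2) as in the question.
    """
    lam2 = np.asarray(lambda_draws, dtype=float) ** 2
    if tau_draws is not None:
        lam2 = lam2 * (np.asarray(tau_draws, dtype=float) ** 2)[:, None]
    return 1.0 / (1.0 + lam2)

def select_variables(lambda_draws, tau_draws=None, threshold=0.5):
    """Flag variable j when the posterior mean of 1 - kappa_j exceeds `threshold`.

    The 0.5 default is a heuristic cutoff (an assumption here), not a calibrated rule.
    """
    kappa_mean = shrinkage_factors(lambda_draws, tau_draws).mean(axis=0)
    return np.flatnonzero(1.0 - kappa_mean > threshold), kappa_mean

# Toy draws: three variables, only the middle one has a large local scale.
lambda_draws = np.full((4, 3), 0.1)
lambda_draws[:, 1] = 5.0
selected, kappa_mean = select_variables(lambda_draws)
print(selected, kappa_mean)
```

With >100k variables, `lambda_draws` can get large, so it may be worth computing the posterior means in chunks of columns rather than holding every draw in memory at once.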

With >100k variables you are probably in an ultra-sparse case, and StudentT(3,0,1) is not very sparse. It would be better to use Cauchy (StudentT(1,0,1)), or horseshoe+ might be an even more suitable prior. As sampling with horseshoe(+) and >100k variables is going to be very slow, you might also consider using an SPCA-based linear model and projection predictive variable selection, as shown in "Projective inference in high-dimensional problems: Prediction and feature selection".
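To illustrate the point about tails, here is a small sketch comparing how much prior mass each choice puts on weak shrinkage (small \kappa). It assumes \lambda is constrained to be positive (so the priors are half-Cauchy and half-StudentT(3)) and fixes \tau=1; both are simplifying assumptions for the comparison. It uses the closed-form CDFs, so no sampling is needed.

```python
import math

def half_cauchy_tail(x):
    """P(lambda > x) for lambda ~ half-Cauchy(0, 1)."""
    return 1.0 - (2.0 / math.pi) * math.atan(x)

def half_t3_tail(x):
    """P(lambda > x) for lambda ~ half-StudentT(3, 0, 1), via the closed-form t(3) CDF."""
    u = x / math.sqrt(3.0)
    cdf = 0.5 + (u / (1.0 + u * u) + math.atan(u)) / math.pi
    return 2.0 * (1.0 - cdf)

# kappa = 1 / (1 + lambda^2), so kappa < k  <=>  lambda > sqrt(1/k - 1).
for k in (0.5, 0.1, 0.01):
    x = math.sqrt(1.0 / k - 1.0)
    print(f"P(kappa < {k}): half-Cauchy {half_cauchy_tail(x):.4f}, "
          f"half-t3 {half_t3_tail(x):.4f}")
```

The half-StudentT(3) prior puts noticeably less mass near \kappa=0 than the half-Cauchy, so large coefficients are less likely to escape shrinkage a priori.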
