I have been tasked with making a predictive model for a set of ~80 observations, each of which has ~250 associated measurements.

I have skimmed *Using Stacking to Average Bayesian Predictive Distributions*. If I understand correctly, this technique takes a group of already-estimated models and finds the optimal weighting of their predictions.
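
To make sure I understand the mechanics, here is a minimal sketch of what I take the stacking step to be doing (not the paper's exact procedure; in particular, the authors use leave-one-out predictive densities, and the function name is mine): given a matrix of per-observation log predictive densities, one column per model, find simplex weights that maximize the log score of the weighted mixture.

```python
import numpy as np
from scipy.optimize import minimize


def stacking_weights(lpd):
    """Stacking of predictive distributions, as I understand it.

    lpd: (N, K) array of per-observation log predictive densities
         (e.g. from leave-one-out cross-validation), one column per model.
    Returns simplex weights w maximizing sum_n log(sum_k w_k p_k(y_n)).
    """
    n, k = lpd.shape

    def neg_log_score(z_free):
        # Softmax parameterization keeps w on the simplex; the last
        # logit is pinned to 0 so the parameterization is identified.
        z = np.append(z_free, 0.0)
        w = np.exp(z - z.max())
        w /= w.sum()
        return -np.sum(np.log(np.exp(lpd) @ w))

    res = minimize(neg_log_score, np.zeros(k - 1), method="BFGS")
    z = np.append(res.x, 0.0)
    w = np.exp(z - z.max())
    return w / w.sum()
```

So if one model dominates on every held-out point, stacking should push essentially all the weight onto it, while genuinely complementary models each keep nonzero weight.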

In contrast, *On the Hyperprior Choice for the Global Shrinkage Parameter in the Horseshoe Prior* offers another path to good predictive performance, by restricting how many variables are allowed to have a real influence.
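
If I read that paper right, its practical recommendation is to set the scale of the half-Cauchy hyperprior on the global shrinkage parameter τ from a prior guess p₀ of the number of relevant predictors: τ₀ = p₀/(D − p₀) · σ/√n. A quick helper (the function name is mine; it also assumes the noise scale σ is known or estimated separately):

```python
import numpy as np


def horseshoe_tau0(p0, D, n, sigma=1.0):
    """Suggested scale tau0 for the tau ~ half-Cauchy(0, tau0) hyperprior.

    p0:    prior guess at the number of relevant predictors
    D:     total number of predictors
    n:     number of observations
    sigma: noise scale (assumed known here for illustration)
    """
    return p0 / (D - p0) * sigma / np.sqrt(n)
```

With my numbers (D ≈ 250, n ≈ 80) and a guess of, say, 10 relevant predictors, this gives a very small τ₀, i.e. heavy shrinkage by default, which seems consistent with the paper's message that the default scale of 1 is far too loose in sparse problems.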

Now, in principle I could fit several thousand models on different subsets and transformations of my variables and then let model averaging aggregate them however it sees fit. Alternatively, I could run a single model with the full set of variables while applying shrinkage. Or I could apply shrinkage within each submodel before averaging.

I believe model averaging might still be relevant if I wish to consider different error structures and other variations not readily accommodated by a shrinkage paradigm. But sticking to this more “variable selection”-y situation, I would like to know which approach is the more reasonable (and whether there are others I have not considered).