Horseshoe prior on distributional parameters

I’m looking for guidance on prior selection for distributional parameters in the Beta() family. I have more than 50 outcomes to predict, with sample sizes ranging from n = 200 to n = 40,000, all bounded on (0, 1), so I’m using a Beta family. For each outcome there are 300 minimally correlated predictors; they are something of a “black box”, and only some are expected to be relevant for any given outcome. This seems like a good candidate for a horseshoe prior. However, there is also good reason to think that both the mean and phi will change with the predictors.
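For concreteness, a minimal sketch of the kind of model I mean in brms looks like the following (the data frame `dat` and the predictor names `x1`, …, `x300` are placeholders, not my actual data):

```r
library(brms)

# Placeholder data: 'dat' holds an outcome y in (0, 1) and predictors x1, ..., x300.
preds <- paste0("x", 1:300)
rhs   <- paste(preds, collapse = " + ")

fit <- brm(
  bf(
    as.formula(paste("y ~", rhs)),   # linear predictor for the mean (mu)
    as.formula(paste("phi ~", rhs))  # linear predictor for the precision (phi)
  ),
  data   = dat,
  family = Beta(),
  prior  = c(
    set_prior("horseshoe(1)", class = "b"),               # horseshoe on mean coefficients
    set_prior("horseshoe(1)", class = "b", dpar = "phi")  # horseshoe on phi coefficients
  )
)
```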

I’ve tried fitting some models on a couple of outcomes with smaller n, using the brms defaults for the horseshoe prior on all 300 predictors for both the mean and phi, and the results are encouraging: on several held-out validation datasets the posterior means have lower RMSE than point estimates from both random forests and the frequentist lasso, and the posteriors themselves are interpretable and useful.
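The held-out comparison I have in mind is roughly the following (sketch only; `valid` is a placeholder validation data frame and `fit` is the model from the sketch above):

```r
# Posterior-mean predictions on held-out data, compared to the observed outcomes via RMSE.
pred_mean <- fitted(fit, newdata = valid)[, "Estimate"]
rmse <- sqrt(mean((valid$y - pred_mean)^2))
rmse
```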

I’m wondering what effect non-default choices for the horseshoe prior in brms might have, how to justify a particular choice, and what to expect, if that is even known, when a horseshoe prior is placed on both parameters of the Beta distribution. I understand that one way of choosing the prior is via the expected ratio of zero to non-zero coefficients. I also know that there is a limit to how many parameters I can reasonably estimate from small data, so choosing more shrinkage for the smaller datasets is probably wise, but how does this interact with horseshoe priors on both the mean and phi? What would be a sensible ratio of zero to non-zero coefficients to set, depending on the size of the data?
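If it helps frame the question: in brms the expected sparsity can be supplied through the `par_ratio` argument of `horseshoe()`, which (per the brms documentation) sets the global scale to `par_ratio / sqrt(N)` following Piironen and Vehtari. The sketch below uses a purely illustrative guess of roughly 10 relevant predictors out of 300, applied to both parameters:

```r
library(brms)

# Illustrative guess only: ~10 non-zero vs. ~290 zero coefficients.
# In principle the ratio could differ between the mean and phi.
hs <- "horseshoe(par_ratio = 10 / 290)"

sparse_prior <- c(
  set_prior(hs, class = "b"),               # mean coefficients
  set_prior(hs, class = "b", dpar = "phi")  # phi coefficients
)
# When par_ratio is given, brms ignores scale_global and computes it from the
# number of observations, so the same ratio implies different global scales
# for outcomes with different n.
```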

  • Operating System: Windows
  • brms Version: 2.8

It’s great to hear that the horseshoe priors worked out so well in your example. I am not an expert myself in setting the hyperparameters of the horseshoe prior, but perhaps @avehtari has some ideas.