Horseshoe regression implementation with brms

I have recently learned about horseshoe regression as an attractive alternative to PLSR for problems with many predictors. In my case I have a hyperspectral dataset with many (over 2k) spectral bands. I have attempted to start fitting a model using brms

set_prior(horseshoe(df = 1, scale_slab=2.5))
fit <- brm(Y~., family = gaussian() , data =data2)

however in addition to spectral information, I have additional information such as sampling day, as well as the varietal of each plant I sampled and the ID of the sensor used to measure the response I’m modeling. I want to include these variables in the model, but have the following questions (forgive me if they are silly questions):

  1. Is there a reason to set different priors on my non spectral covariates? e.g normal(0, 1)

and

  1. Is there a way to include varietal and machine ID as random effects in such a model? Would it be as simple as adding them to the model as if this were a standard linear regression i.e. + (1 | varietal) + ( 1 | machine), or is there a particular principled way to do this in horseshoe regression?

If you don’t think these covariates are as likely to be sparse as the spectral covariates, then you’d want a less strongly regularizing prior.

There’s nothing wrong with what you have here. It encodes the assumption that varietal and machine have non-sparse effects. Note that you would lose the sparsity of the spectral coefficients if you used random slopes for these covariates; in that case you would assume that the means are sparse but that the group-level effects are not.

1 Like

Much appreciated!

I have a follow up question regarding implementation. I’m slightly confused by the nonlinear syntax in brms. In order to have both the random effects and have only a subset of covariates have different priors, is the proper syntax

wavelengths <- as.formula(paste0("a ~ ", paste0("wavelength.", 350:2500, collapse = " + ")))

fit3 <- brm(
  bf(Brix.Value ~ a + b + (1 | Tree.Species) + ( 1 | Machine), nl = TRUE) +
    lf(a ~  wavelengths, center = TRUE) +
    lf(b ~ 0 + Day, cmc = FALSE), 
  prior = 
    set_prior(horseshoe(df = 1, scale_slab=2.5), class = "b", nlpar = "a") +
    set_prior("normal(0,.5)", class = "b", nlpar = "b"),
  family = Beta(link = "logit"), data =data2,
  iter = 4000, save_pars = save_pars(all = TRUE),
  backend = "cmdstanr")

Or

wavelengths <- as.formula(paste0("a ~ ", paste0("wavelength.", 350:2500, collapse = " + "), "+(1 | Tree.Species) + ( 1 | Machine)"))

fit3 <- brm(
  bf(Brix.Value ~ a + b, nl = TRUE) +
    lf(a ~  wavelengths, center = TRUE) +
    lf(b ~ 0 + Day + (1 | Tree.Species) + ( 1 | Machine), cmc = FALSE), 
  prior = 
    set_prior(horseshoe(df = 1, scale_slab=2.5), class = "b", nlpar = "a") +
    set_prior("normal(0,.5)", class = "b", nlpar = "b"),
  family = Beta(link = "logit"), data =data2,
  iter = 4000, save_pars = save_pars(all = TRUE),
  backend = "cmdstanr")

And finally, what is the consequence of , center = TRUE in this case? My first few experiments with this model lead to a large number of divergences. Would setting center = FALSE help with this? Or is there a better way to reduce divergences when working with horseshoe priors aside from increasing adapt_delta?