Projpred projected model with horseshoe prior - hypothesis test? Tutorial?

Hi all,

I was wondering which prior I need to use in order to perform inferential tests about the coefficients of a projected model after selecting variables with projpred and a horseshoe prior? (Is there a tutorial that I missed?) I’m working with the horseshoe prior for the first time, so please excuse me if this is obvious for you more experienced folks.

Sample Code
Say we perform a variable selection as per the tutorial here: https://htmlpreview.github.io/?https://github.com/stan-dev/projpred/blob/master/vignettes/quickstart.html

… and as per the tutorial, we do the projection like so:

proj <- project(vs, nterms = 3, ns = 500)

Given the projection, I would like to compute relative evidence/Bayes factors for, say, the coefficient of x1 in the projected model being greater than zero. How should I specify the prior?

library(bayestestR)  # provides distribution_normal() and bayesfactor_parameters()

# Posterior draws of the projected coefficients (draws x coefficients)
m <- as.matrix(proj)
# I can specify a normal prior easily
# my question: what is the correct prior to use here
# given the horseshoe prior that was used in the variable selection?
prior <- distribution_normal(nrow(m), mean = 0, sd = .1)

# Once we get the prior right, we can just test a hypothesis
bayesfactor_parameters(m[, "x1"], prior, direction = ">")

I just cannot figure out how to draw from the horseshoe, basically. I feel like this should be simple :p

Thank you for your time!


There are several videos and case studies on the topic at https://avehtari.github.io/modelselection/
Start with "Use of reference models in variable selection" from the Laplace's Demon seminar series.

Relevant papers are



Given the projection, you don't want to compute relative evidence/Bayes factors. In the projection, the horseshoe prior is also projected, so there is no need to specify priors for the submodels.

In the submodel, some coefficients are fixed to zero as a decision, so there is no probability attached to them. The rest of the coefficients are continuous-valued, so the probability that they are exactly zero is 0. The projection predictive approach tells you how good the model is if you fix some of the coefficients to zero, and how to make the optimal inference for the remaining non-zero coefficients.
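To make the idea of "fixing some coefficients to zero and projecting" concrete, here is a minimal base-R sketch for a Gaussian linear model without an intercept. It is not projpred's internal code: the data, the stand-in "reference posterior draws" (`beta_ref`, `sigma_ref`) and the choice of submodel (`keep`) are all assumptions made up for illustration. For a Gaussian model, the draw-by-draw projection reduces to a least-squares fit of the reference model's fitted values on the submodel's columns.

```r
set.seed(1)
n <- 100; p <- 5; S <- 400
X <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("x", 1:p)))

# Stand-in for reference-model posterior draws (S draws x p coefficients).
# In a real analysis these would come from the fitted reference model.
beta_true <- c(1, 0.5, 0, 0, 0.2)
beta_ref <- matrix(rnorm(S * p, mean = rep(beta_true, each = S), sd = 0.1),
                   S, p, dimnames = list(NULL, colnames(X)))
sigma_ref <- rep(1, S)

# Submodel: keep x1 and x2; all other coefficients are fixed to exactly zero.
keep <- c("x1", "x2")
Xs <- X[, keep, drop = FALSE]

# Project each draw: least-squares fit of the reference fitted values on Xs.
mu_ref <- X %*% t(beta_ref)                                   # n x S
beta_proj <- t(solve(crossprod(Xs), crossprod(Xs, mu_ref)))   # S x 2
colnames(beta_proj) <- keep
mu_proj <- Xs %*% t(beta_proj)                                # n x S

# The projected noise scale absorbs what the submodel cannot explain.
sigma_proj <- sqrt(sigma_ref^2 + colMeans((mu_ref - mu_proj)^2))

head(beta_proj)
```

Note that no prior is specified for the submodel anywhere: the projected draws are a deterministic function of the reference-model draws, which is why the reference model's prior is carried along implicitly.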


Hi Avehtari!

Interesting answer, thanks!

Hm. Maybe my question wasn’t clear: how to draw from this very horseshoe prior to compare it to the projected posterior of the coefficients? (I mentioned in my post that the specified normal prior is not correct but maybe this was not clear enough.)

The question was about the evidence for the coefficient being greater than 0. But never mind.

More importantly: you seem to suggest that inferences via BFs about the size of the included coefficients are never relevant after projection. True?

Cheers
Jana

Sorry. For the non-zero coefficients, you can compute the probability that a coefficient is larger than 0, but after projection this answers a question along the lines of: if I fix some coefficients to 0 (e.g. because I don't want to measure those variables in the future and want to drop them from the model), how do I make the optimal inference (optimal in the sense of predictive performance), and for that optimal inference, how much probability mass should I put on positive values of this coefficient? So it's not necessarily what you had in mind at the beginning.
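As a rough sketch of that computation: the probability that a projected coefficient is positive is just the share of projected draws above zero. The matrix `m` below is simulated as a stand-in for `as.matrix(proj)` so that the snippet runs on its own, and it assumes the draws are equally weighted (no clustering of draws in the projection).

```r
set.seed(2)

# Stand-in for as.matrix(proj): rows are projected draws, columns are coefficients.
m <- cbind(x1 = rnorm(500, mean = 0.3, sd = 0.15),
           x2 = rnorm(500, mean = -0.1, sd = 0.2))

# Share of projected draws in which the x1 coefficient is positive.
p_pos <- mean(m[, "x1"] > 0)

# Central 95% interval of the projected draws, for context.
ci <- quantile(m[, "x1"], probs = c(0.025, 0.975))

p_pos
ci
```

The resulting number should be read conditionally on the decision to drop the other terms, as described above, and not as evidence for or against an effect in the full model.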

BF and the projection predictive approach are both answers to certain decision-analysis questions. Once we've selected which question to answer, decision theory tells us which approach provides the optimal answer to that question. See the specific questions and the explanation of the answers in A survey of Bayesian predictive methods for model assessment, selection and comparison (specifically 3.1.2 Expected utility and optimal decisions, 3.3.2 Projection predictive model selection: M_*-projected prediction, and 3.4.3 Zero-one utility on the model space). So I think it doesn't make sense to first answer one question and then say that you actually wanted to answer the other one; at the very least, the interpretation of a BF after the projection is very different.

Related to BF, you might also enjoy reading When are Bayesian model probabilities overconfident? and Using stacking to average Bayesian predictive distributions.

Great, thanks, I will check the literature!
