Include an extra coefficient multiplied by the regression weight (“inclusion weight” à la Kruschke)

Hi all,

I’m trying to recreate an example from Kruschke’s DBDA in brms. The example incorporates a so-called “inclusion weight” for a regression coefficient: a multiplicative indicator sampled from a Bernoulli distribution. When it’s 0, the coefficient is excluded; when it’s 1, it is included.

Here’s the accompanying JAGS code; the relevant part is marked in the comments:

model {
  for ( i in 1:Ntotal ) {
    zy[i] ~ dt( zbeta0
                + sum( delta[1:Nx]      # delta is inclusion indicator <--- THIS IS THE RELEVANT PART
                       * zbeta[1:Nx]    # "normal" regression coefficient
                       * zx[i,1:Nx] ) , # standardized predictor
                1/zsigma^2 , nu )
  }
  # Priors vague on standardized scale:
  zbeta0 ~ dnorm( 0 , 1/2^2 )
  for ( j in 1:Nx ) {
    zbeta[j] ~ dnorm( 0 , 1/2^2 )
    delta[j] ~ dbern( 0.5 )
  }
}

Is it possible to build such a model in brms?
(Or maybe this is a more basic question: Is it possible to build the model in Stan?)

It is also described in this blog post: http://doingbayesiandataanalysis.blogspot.com/2014/01/bayesian-variable-selection-in-multiple.html

Thanks, Felix

In Stan you can probably code this up using log_sum_exp to marginalise over the possible values of the latent Bernoulli r.v. See Chapter 15 in the Stan manual.

The problem is that Stan cannot handle discrete parameters. Instead, you have to marginalize over them, as explained in the Stan manual.
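
To make that concrete, here is a minimal Stan sketch for a single predictor; with Nx predictors you would need one mixture component per subset of predictors, i.e. 2^Nx terms, which is where the approach becomes impractical. The data names and the priors on sigma and nu are illustrative, not Kruschke’s:

data {
  int<lower=1> N;
  vector[N] x;    // standardized predictor
  vector[N] y;    // standardized outcome
}
parameters {
  real beta0;
  real beta;
  real<lower=0> sigma;
  real<lower=1> nu;
}
model {
  beta0 ~ normal(0, 2);
  beta ~ normal(0, 2);
  sigma ~ exponential(1);   // illustrative
  nu ~ gamma(2, 0.1);       // illustrative
  // Marginalize delta ~ Bernoulli(0.5) out of the likelihood:
  // log_mix(0.5, lp1, lp0) == log_sum_exp(log(0.5) + lp1, log(0.5) + lp0)
  target += log_mix(0.5,
                    student_t_lpdf(y | nu, beta0 + beta * x, sigma),  // delta = 1
                    student_t_lpdf(y | nu, beta0, sigma));            // delta = 0
}
generated quantities {
  // Per-draw posterior probability that the predictor is included
  real lp1 = log(0.5) + student_t_lpdf(y | nu, beta0 + beta * x, sigma);
  real lp0 = log(0.5) + student_t_lpdf(y | nu, beta0, sigma);
  real p_incl = exp(lp1 - log_sum_exp(lp1, lp0));
}

Averaging p_incl over the posterior draws recovers the inclusion probability that the JAGS model reports as the posterior mean of delta.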

OK, thanks. Is that something that could be done in brms, or do I have to use base Stan for that?

You could try brms’ non-linear syntax, but even if it worked, it would likely feel a little awkward. You can instead extract the Stan code from brms via make_stancode, amend it, and run it with Stan directly.

Thanks. I will try that and post the code here if I succeed …

This is equivalent to a spike-and-slab model with a point-mass spike and a Gaussian slab. If you have only a few variables, you can marginalize over the discrete parameters as @paul.buerkner suggested, but as the number of variables grows, the combinatorial explosion makes this infeasible. You can use the regularized horseshoe prior as a continuous alternative, with the benefit of improved mixing since gradients are available. See Sparsity information and regularization in the horseshoe and other shrinkage priors for more.
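
For reference, here is a minimal Stan sketch of the regularized horseshoe along the lines of that paper, assuming a Gaussian likelihood; all data names are illustrative, and the hyperparameters scale_global, slab_scale, and slab_df are passed in as data:

data {
  int<lower=1> N;
  int<lower=1> K;
  matrix[N, K] X;              // standardized predictors
  vector[N] y;                 // outcome
  real<lower=0> scale_global;  // e.g. p0 / ((K - p0) * sqrt(N)) for p0 expected nonzeros
  real<lower=0> slab_scale;    // e.g. 2
  real<lower=0> slab_df;       // e.g. 4
}
parameters {
  real beta0;
  vector[K] z;
  vector<lower=0>[K] lambda;   // local shrinkage
  real<lower=0> tau;           // global shrinkage
  real<lower=0> caux;
  real<lower=0> sigma;
}
transformed parameters {
  real<lower=0> c = slab_scale * sqrt(caux);  // slab width
  vector<lower=0>[K] lambda_tilde
    = sqrt(square(c) * square(lambda) ./ (square(c) + square(tau) * square(lambda)));
  vector[K] beta = z .* lambda_tilde * tau;   // regularized-horseshoe coefficients
}
model {
  z ~ std_normal();
  lambda ~ student_t(1, 0, 1);                  // half-Cauchy via lower bound
  tau ~ student_t(1, 0, scale_global * sigma);  // half-Cauchy
  caux ~ inv_gamma(0.5 * slab_df, 0.5 * slab_df);
  beta0 ~ normal(0, 2);
  sigma ~ exponential(1);                       // illustrative
  y ~ normal(beta0 + X * beta, sigma);
}

Note that brms supports this prior family directly via its horseshoe() prior on the population-level coefficients (see ?horseshoe in brms), so no hand-written Stan is needed if you go this route.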

Hey @felix.schoenbrodt, did you ever succeed at this? My solution was to give up on Kruschke’s approach, fit multiple models with different predictors, and compare them with LOO and model weights.
