I believe your spike-and-slab code isn’t implemented correctly as is. At the top right of page 882 of http://www.snn.ru.nl/~bertk/machinelearning/2290777.pdf (George and McCulloch 1993), they describe the induced prior on the regression coefficients as a mixture of multivariate normals. What you’ve implemented is a mixture of regression models (I’m not exactly sure what the right description is here). I think that to implement it correctly you need to put the mixture at the level of the regression coefficients. So if there are p covariates and M possible sparsity patterns, you need M*p parameters representing the regression coefficients for each sparsity pattern, and then marginalize over the sparsity patterns to induce a prior on the p regression coefficients.
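
To make "mixture at the level of the regression coefficients" concrete, here is a minimal Python sketch of the marginalized prior density, not your model and not the paper's exact setup. It assumes independent Bernoulli inclusion indicators with probability `w` and an identity prior correlation matrix, so each coefficient is normal with a small spike scale or a large slab scale. The names `w`, `sd_spike`, and `sd_slab` are mine, chosen for illustration. It evaluates the prior on a single p-vector by summing over all M = 2^p sparsity patterns, which is one way to read the marginalization step; under these assumptions the sum factorizes per coefficient, and the second function checks that.

```python
import itertools
import numpy as np


def normal_logpdf(x, sd):
    """Log density of N(0, sd^2) evaluated at x (elementwise)."""
    return -0.5 * np.log(2.0 * np.pi) - np.log(sd) - 0.5 * (x / sd) ** 2


def log_prior_enumerated(beta, w, sd_spike, sd_slab):
    """Log prior on beta, marginalizing over all 2^p sparsity patterns.

    Assumes independent Bernoulli(w) inclusion and independent normals
    (identity prior correlation); both are illustrative assumptions.
    """
    p = len(beta)
    terms = []
    for gamma in itertools.product([0, 1], repeat=p):
        g = np.array(gamma)
        # log p(gamma): product of Bernoulli probabilities
        log_p_gamma = np.sum(g * np.log(w) + (1 - g) * np.log1p(-w))
        # log p(beta | gamma): slab scale where included, spike scale otherwise
        sd = np.where(g == 1, sd_slab, sd_spike)
        terms.append(log_p_gamma + np.sum(normal_logpdf(beta, sd)))
    # log-sum-exp over the sparsity patterns
    return np.logaddexp.reduce(terms)


def log_prior_factorized(beta, w, sd_spike, sd_slab):
    """Same prior, using the per-coefficient two-component mixture."""
    spike = np.log1p(-w) + normal_logpdf(beta, sd_spike)
    slab = np.log(w) + normal_logpdf(beta, sd_slab)
    return np.sum(np.logaddexp(spike, slab))


beta = np.array([0.1, -2.0, 0.5])
full = log_prior_enumerated(beta, w=0.3, sd_spike=0.05, sd_slab=3.0)
fast = log_prior_factorized(beta, w=0.3, sd_spike=0.05, sd_slab=3.0)
```

With these assumptions `full` and `fast` agree up to floating-point error, so the enumeration over patterns is only needed when the prior on the patterns or the prior correlation does not factorize.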