Bayesian kernel machine regression in Stan


I am trying to model the effects of multiple pollutants on county-level cancer incidence with a Poisson regression model that also accounts for spatial correlation. I was planning to use Bayesian kernel machine regression (BKMR) to account for non-linear and multiplicative (interaction) effects among the pollutants. Is there an implementation of BKMR in Stan?

I know that a Gaussian process is a specific form of Bayesian kernel machine regression and that Gaussian processes are available in brms. I have seen an example where the gp() function is used to model spatial correlation. Can it also be used to model the correlations among the pollutants, as BKMR does, and are there any examples of this?
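To make the connection between a GP and BKMR concrete, here is a minimal sketch of a Gaussian kernel over the pollutant exposures with one weight per pollutant. It assumes the kernel form usually described for BKMR, K(z, z') = exp(-sum_m r_m (z_m - z'_m)^2), where setting r_m = 0 removes pollutant m from the exposure-response surface. The function name, the simulated exposures, and the weights are all hypothetical; this is not brms or Stan code.

```python
import numpy as np

def ard_gaussian_kernel(Z, r):
    """Kernel matrix with K[i, j] = exp(-sum_m r[m] * (Z[i, m] - Z[j, m])**2).

    Z: (n, M) array of exposures (n counties, M pollutants).
    r: (M,) array of non-negative per-pollutant weights (assumed form).
    """
    Z = np.asarray(Z, dtype=float)
    r = np.asarray(r, dtype=float)
    diff = Z[:, None, :] - Z[None, :, :]       # pairwise differences, shape (n, n, M)
    sq = np.einsum("ijm,m->ij", diff ** 2, r)  # weighted squared distances
    return np.exp(-sq)

# Hypothetical data: 5 counties, 3 standardized pollutants.
rng = np.random.default_rng(0)
Z = rng.normal(size=(5, 3))
r = np.array([0.5, 0.0, 2.0])  # r[1] = 0 drops the second pollutant from the kernel
K = ard_gaussian_kernel(Z, r)
```

Because all pollutants enter a single kernel, the resulting GP can represent non-linear effects and interactions jointly, and the size of each weight indicates how much the surface varies along that pollutant.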

Hi, @Theresa_U: I haven’t heard of Bayesian kernel machine regression, and as far as I know, there’s no discussion of it in our doc or case studies. We do have a lot of doc in Stan around Gaussian processes and also several tutorials outside of our doc.

If all you need to do is introduce some spatial smoothing, you can use a much more efficient intrinsic conditional autoregressive (ICAR) model. Here’s a case study with Poisson observations:

It’s not nearly as general as a GP, but it scales better. For GPs, you can start with our user’s guide:

and then check out a bunch of the case studies:
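For reference, the ICAR prior mentioned above can be sketched through its pairwise-difference form: the unnormalized log density is -0.5 times the sum of squared differences between the spatial effects of neighboring counties. The snippet below is a minimal illustration in Python rather than Stan; the four-county adjacency chain and the effect values are made up, and in a real model a sum-to-zero constraint on the effects is also needed because this prior is improper.

```python
import numpy as np

def icar_log_density(phi, node1, node2):
    """Unnormalized ICAR log density: -0.5 * sum over edges of (phi[i] - phi[j])**2.

    phi: (n,) spatial random effects, one per county.
    node1, node2: index arrays listing each pair of neighbors exactly once.
    """
    phi = np.asarray(phi, dtype=float)
    d = phi[node1] - phi[node2]
    return -0.5 * np.dot(d, d)

# Hypothetical adjacency: four counties in a chain 0-1, 1-2, 2-3.
node1 = np.array([0, 1, 2])
node2 = np.array([1, 2, 3])

smooth = np.array([0.10, 0.00, -0.05, -0.05])  # neighbors have similar effects
rough = np.array([1.0, -1.0, 1.0, -1.0])       # neighbors differ sharply

lp_smooth = icar_log_density(smooth, node1, node2)
lp_rough = icar_log_density(rough, node1, node2)
```

The prior only needs one term per pair of neighbors, which is why it scales better than a GP with a dense covariance matrix over all counties.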

Thank you, @Bob_Carpenter !

I will take a look at the literature and examples and see if I can adapt them to my case.

BKMR also adds group-wise variable selection to the model and can capture complex interactions. In my dataset, I want to model the relationships among many highly correlated variables and also do group-wise variable selection, that is, identify groups of variables that are relevant to the outcome. Which approach would you suggest for this?
I read that projection predictive feature selection is generally a good method for variable selection, but does it also work with many correlated variables, and would you include all two-way interactions as a starting point?