Comments on the workflow for model and variable selection for BYM2 modelling

Hi there,

I would love to have some comments regarding the workflow of BYM2 modelling that involves variable selection (25 variables) and model selection.

I attempted to do variable selection directly in a BYM2 model (Poisson) using brms, but I couldn’t make it work. Is it possible, and if so, how?

Therefore, my approach is:

  1. Conduct variable selection in a Poisson model in brms, using a regularized horseshoe prior and the projpred package, without including the spatial element.
  2. Include the selected variables in BYM2 models with different families (i.e., Poisson, negative binomial, zero-inflated Poisson and zero-inflated negative binomial) and use loo for model selection. However, I also encounter difficulties in reloo for the BYM2 models.
  3. Examine the fit of the selected model and run posterior predictive checks.
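Step 1 above could be sketched roughly as follows. This is only an illustration under assumptions: `build_selection_formula` and `run_projpred_selection` are hypothetical helper names, `x1` and `x2` stand in for the real covariates, and `horseshoe(1)` is just the default-style horseshoe setting in brms rather than a tuned choice. The wrapper needs brms and projpred installed and is slow, so only the formula helper is actually called here.

```r
# Hypothetical helper: build `obs ~ x1 + ... + offset(log(exp))` from covariate names.
build_selection_formula <- function(covariates, response = "obs", exposure = "exp") {
  reformulate(c(covariates, sprintf("offset(log(%s))", exposure)),
              response = response)
}

# Hypothetical wrapper for step 1 (non-spatial Poisson model + projpred).
# Assumes brms and projpred are installed; not run in this sketch.
run_projpred_selection <- function(data, covariates) {
  fit <- brms::brm(
    build_selection_formula(covariates),
    data = data, family = poisson(),
    prior = brms::set_prior("horseshoe(1)", class = "b")  # shrinkage prior on slopes
  )
  vs <- projpred::cv_varsel(fit)  # cross-validated variable selection
  list(fit = fit, varsel = vs, size = projpred::suggest_size(vs))
}

# Only the formula construction is executed here.
f <- build_selection_formula(c("x1", "x2"))
print(f)
```

The variables retained at the suggested size would then be carried into the BYM2 models of step 2.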

Thank you in advance :).
Cheers
Kendy

I’m very much interested in the same problem. Is there a reason why you can’t do projpred with BYM2?

  1. projpred currently supports only a narrow class of models (GLMs), and a BYM2 model falls outside its scope. This may change at some point in the future, though.
  2. This approach could be sensible, but I don’t know what “I also encounter difficulties in reloo for the BYM2 models.” means without further details.
  3. Definitely helpful as a supporting method for (2), I would say.

Thank you Paul for the reply.

This is the error I got when I ran reloo:
Error: Dimensions of ‘W’ must be equal to the number of observations.

I checked the dimensions of W and it is 43x43, which matches the number of observations. It runs OK when I specify reloo = F, although some models have quite a few observations with pareto_k > 0.7…
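The dimension check described above can be scripted. Below is a minimal sketch in base R; `check_adjacency` is a hypothetical helper (not part of brms), and the toy matrix and data frame are made up for illustration. The second call mimics what presumably happens during reloo, where the model is refit with an observation left out while W keeps its full size.

```r
# Hypothetical helper: basic sanity checks on a neighbourhood matrix W
# against the data frame used for fitting. Returns a character vector of problems.
check_adjacency <- function(W, data) {
  W <- as.matrix(W)
  problems <- character(0)
  if (nrow(W) != ncol(W)) {
    problems <- c(problems, "W is not square")
  } else if (!isSymmetric(unname(W))) {
    problems <- c(problems, "W is not symmetric")
  }
  if (nrow(W) != nrow(data)) {
    problems <- c(problems, "nrow(W) differs from the number of observations")
  }
  problems
}

# Toy example: three areas in a row (1-2-3), made-up data.
W_toy <- matrix(c(0, 1, 0,
                  1, 0, 1,
                  0, 1, 0), nrow = 3, byrow = TRUE)
d_toy <- data.frame(obs = c(2, 0, 5), exp = c(1.5, 2, 3))

full_check    <- check_adjacency(W_toy, d_toy)         # full data
dropped_check <- check_adjacency(W_toy, d_toy[-1, ])   # one observation left out
print(full_check)
print(dropped_check)
```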

Cheers

Can you provide a minimal reproducible example for me to try out?

Yes.

I have attached the dataset and the neighbourhood matrix. In the dataset:
obs: outcome
exp: log(exp) is the offset
the other two columns are covariates

Thanks a lot!

data.csv (2.0 KB)
neighbourhood.csv (3.8 KB)

Thanks. Can you also send me the code that creates the error? Basically, and ideally, a minimal reproducible example is something that I can just copy and paste into my editor and then it shows the problem you describe.

Never mind, I found the problem. It should now work in the GitHub version of brms. At the very least, it will give an informative error message when exact LOO cannot be done for CAR models for statistical reasons, which is the case if you try to make predictions for new locations.

My code is as follows:

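library(brms)

# Poisson BYM2 model; W is the 43x43 neighbourhood matrix
test <- brm(obs ~ ind_intmx_recriaD + ind_intmx_cerdasD + offset(log(exp)),
            data = D.paul,
            family = poisson(), autocor = cor_car(W, type = "bym2"),
            chains = 1, iter = 1000,
            save_all_pars = TRUE, seed = 19871002,
            control = list(max_treedepth = 12))
standata(test)
# Same model with a zero-inflated Poisson likelihood
test1 <- update(test, family = zero_inflated_poisson())
# Compare by approximate LOO; the error above appears with reloo = TRUE
loo(test, test1, reloo = FALSE)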

So to use the code on GitHub, I just need to read the updated functions into RStudio, is that correct? I must have done something wrong, because R keeps asking me to debug…

I am not sure what you mean. Please see the README at https://github.com/paul-buerkner/brms for details on how to install brms from GitHub.
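For reference, installing a package from GitHub usually means installing the whole development version rather than sourcing individual function files. A minimal sketch, assuming the remotes package is acceptable and a working C++ toolchain is already set up (the README remains the authoritative source for the exact steps):

```r
# Assumption: remotes is used as the installer; the README may recommend otherwise.
if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}

# Install the development version of brms from its GitHub repository.
remotes::install_github("paul-buerkner/brms")

# Restart the R session afterwards, then confirm which version is loaded.
packageVersion("brms")
```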

Thanks a lot!!