This is not a good process. The main reason to use PSIS-LOO is that it tells you when its assumptions are not satisfied. If they are not satisfied, the next course of action is to reconsider the model, not to try another model comparison tool that does not even tell you when its assumptions are violated.
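For example, with a fit from **rstanarm** (the model and data below are just placeholders for your own), the `loo` method reports the Pareto k diagnostics that flag observations for which the PSIS-LOO approximation is unreliable:

```r
library(rstanarm)
library(loo)

# hypothetical negative binomial fit; substitute your own model and data
fit <- stan_glm.nb(y ~ x1 + x2, data = mydata, seed = 123)

# PSIS-LOO with its built-in diagnostics
loo_fit <- loo(fit)
print(loo_fit)  # the printout flags any problematic Pareto k values

# observations whose leave-one-out predictive density is poorly approximated
which(pareto_k_values(loo_fit) > 0.7)
```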
Also, the point about bias when many models are being compared applies to PSIS-LOO and Bayes factors just as much as to WAIC. Comparing models by Bayes factors is completely different from comparing models by PSIS-LOO (or by a less robust information criterion), so you should generally decide which route to take before even doing the analysis. PSIS-LOO attempts to answer the question “Which model is expected to best predict new data, given the existing data?” The Bayes factor, by contrast, is largely driven by the pre-data expectation of the likelihood function with respect to the prior distribution, so it is in effect a contest over which model has the better priors attached to it.
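As a sketch of the PSIS-LOO route (here `fit1` and `fit2` are assumed to be two rstanarm fits of the same outcome), the comparison is in terms of estimated expected log predictive density for new data:

```r
# fit1 and fit2 are hypothetical rstanarm models of the same outcome
loo1 <- loo(fit1)
loo2 <- loo(fit2)

# models are ordered by estimated ELPD; elpd_diff (with its standard error)
# is reported relative to the best model in the comparison
loo_compare(loo1, loo2)
```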
In your case, where you “have a total of 659 candidate models including all possible subsets of predictors”, none of the above approaches really makes sense. There is a procedure for this situation outlined in, but the implementation in currently only works for models estimated by stan_glm in the rstanarm R package, and not for the negative binomial. You could try it with a Poisson likelihood and dummy variables for all the intercept shifts, but you are going to need a prior distribution that regularizes heavily.
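If you do go the Poisson route, a sketch of such a fit might look like the following; the formula, data, and the `global_scale` value are placeholders you would need to adapt, and the regularized horseshoe prior (`hs()` in rstanarm) is just one choice of heavily regularizing prior:

```r
library(rstanarm)

# hypothetical: all candidate predictors plus dummy variables for the
# intercept shifts, with a regularized horseshoe prior shrinking most
# coefficients toward zero
fit_pois <- stan_glm(
  y ~ .,                                    # placeholder formula
  family = poisson(link = "log"),
  data = mydata,                            # hypothetical data frame
  prior = hs(df = 1, global_scale = 0.01),  # tune to expected sparsity
  prior_intercept = normal(0, 2.5),
  seed = 123
)
```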