Sorry, I might not have expressed myself properly…
I have run such models in brms, and some of them have 100 to 200 binary $\beta_{categorical}$ parameters.
I ran the test for practical equivalence from the bayestestR package to decide, for each parameter, whether the null hypothesis should be accepted or rejected (that is, whether or not the HDI of the parameter's posterior distribution falls within the ROPE).
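As a rough illustration of that decision rule, here is a minimal Python sketch. It is not bayestestR's implementation: the function names `hdi` and `rope_decision`, the 0.95 interval mass, the ROPE of [-0.1, 0.1], the decision labels, and the simulated draws are all assumptions chosen for the example.

```python
import math
import random


def hdi(draws, mass=0.95):
    """Narrowest interval containing `mass` of the draws (assumes a unimodal posterior)."""
    s = sorted(draws)
    n = len(s)
    k = max(1, math.ceil(mass * n))  # number of draws the interval must cover
    # Slide a window of k consecutive sorted draws and keep the narrowest one.
    start = min(range(n - k + 1), key=lambda i: s[i + k - 1] - s[i])
    return s[start], s[start + k - 1]


def rope_decision(draws, rope=(-0.1, 0.1), mass=0.95):
    """Compare the HDI with the ROPE; the three labels are illustrative."""
    lo, hi = hdi(draws, mass)
    if rope[0] <= lo and hi <= rope[1]:
        return "accept null"   # HDI entirely inside the ROPE
    if hi < rope[0] or lo > rope[1]:
        return "reject null"   # HDI entirely outside the ROPE
    return "undecided"         # HDI and ROPE partially overlap


# Simulated posterior draws standing in for real model output (assumption).
rng = random.Random(1)
near_zero = [rng.gauss(0.0, 0.02) for _ in range(4000)]
far = [rng.gauss(1.0, 0.1) for _ in range(4000)]
wide = [rng.gauss(0.1, 0.2) for _ in range(4000)]

for name, draws in [("near_zero", near_zero), ("far", far), ("wide", wide)]:
    print(name, rope_decision(draws))
```

The sketch shows why the decision is sensitive to anything that moves or widens a posterior: a shift or a change in spread can flip a parameter between the three outcomes.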
I got warnings about possible multicollinearity between some of my parameters. The problem is that I then cannot trust the results, because multicollinearity may shift the posterior distributions towards or away from the ROPE.
How can I properly assess whether these correlations inflate the estimates in my models? Which threshold for pairwise correlations should I consider intolerable (e.g. > 0.9)?
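To make the question concrete, the following Python sketch screens pairwise Pearson correlations between posterior draws of the coefficients and flags those above a cutoff. The helper names `pearson` and `flag_pairs`, the parameter names `b1` to `b3`, and the simulated draws are hypothetical, and the 0.9 cutoff is only the example value from the question, not a recommended threshold.

```python
import math
import random
from itertools import combinations


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def flag_pairs(draws, threshold=0.9):
    """draws: dict mapping parameter name -> list of posterior draws.

    Returns (name_a, name_b, r) for every pair with |r| >= threshold.
    """
    flagged = []
    for a, b in combinations(sorted(draws), 2):
        r = pearson(draws[a], draws[b])
        if abs(r) >= threshold:
            flagged.append((a, b, round(r, 3)))
    return flagged


# Simulated draws (assumption): b1 and b2 trade off against each other,
# as coefficients of collinear predictors often do; b3 is independent.
rng = random.Random(42)
z = [rng.gauss(0, 1) for _ in range(2000)]
posterior = {
    "b1": [0.5 + v for v in z],
    "b2": [0.3 - v + rng.gauss(0, 0.2) for v in z],
    "b3": [rng.gauss(0, 1) for _ in range(2000)],
}

flagged = flag_pairs(posterior, threshold=0.9)
print(flagged)
```

With 100 to 200 parameters this loop covers several thousand pairs, so in practice it would be more useful to list the flagged pairs than to inspect the full correlation matrix.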
Should I reconsider my model design and perform a univariate analysis instead?
For correlated predictors I recommend projpred. We have a pull request which brings support for categorical variables and “random” effects, so if you are in a hurry and feeling brave you can test it right now, or wait a little and look for the announcement when it’s merged. projpred works very well with correlated predictors, although it answers a slightly different question: “What is the minimal set of predictors providing the same predictive performance as the full model?”. You can find case studies and videos on multicollinearity and projpred at https://avehtari.github.io/modelselection/. If you need to find all predictors with some predictive information, then univariate approaches seem to be a good choice, and we’ll soon have a paper out with more recommendations.
Thank you @avehtari. The resources you linked are really useful, comprehensive, and interesting for learning more along these lines.
However, as you say, projpred answers a slightly different question from mine and is best suited to a prediction point of view. I am instead more interested in evaluating the effect sizes that all the regressors have on my outcome of interest.
I guess a univariate approach will do, and/or a multivariate approach that accepts some degree of inflation for the variables that are correlated.