Calculate VIF?

Hi! I am quite new to Bayesian statistics and I am having quite a hard time understanding how to deal with collinearity between parameters.

Is there a way I can calculate collinearity between my regressors as you would with VIF or an equivalence test?

I have a model with random effects and multiple categorical regressors with two levels each (0/1):

Outcome ~ Categorical1 + Categorical2 + Categorical3 + (1|person)
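
To make the VIF idea concrete, this is the kind of check I would run in a frequentist setting (just a sketch, assuming my data sit in a data frame called df with these columns):

library(car)   # for vif()

# Classical VIF on the fixed effects only (dropping the random intercept)
fit_lm <- lm(Outcome ~ Categorical1 + Categorical2 + Categorical3, data = df)
vif(fit_lm)    # one variance inflation factor per regressor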

Is there a way I can get around this in a Bayesian setup?
And any advice on how to solve multicollinearity?

Thank you very much in advance for your help.

I’m gonna go ahead and tag @avehtari and @andrewgelman on this, as they can provide much more complete answers than I could.

What is VIF?

I guess it’s the Variance Inflation Factor.

Yes, VIF as in Variance Inflation Factor.

You should be able to just fit this model directly in rstanarm or brms.
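
For example, something along these lines should work (a sketch, assuming your data are in a data frame df with the columns from your formula and a roughly continuous outcome, so adjust the family if needed):

library(brms)

fit <- brm(
  Outcome ~ Categorical1 + Categorical2 + Categorical3 + (1 | person),
  data   = df,           # assumed data frame with these columns
  family = gaussian()    # change e.g. to bernoulli() if the outcome is binary
)
summary(fit)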

Sorry, I might not have expressed myself properly…
I have run such models in brms, and for some of them I have up to 100-200 binary \beta_{categorical} parameters.

I have done the test for practical equivalence using the package bayestestR to check whether my parameter values should be accepted or rejected against the null hypothesis, i.e. whether or not the HDI of the posterior distribution of each parameter falls within a ROPE region.
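
Concretely, the check was roughly this (a sketch, with fit being a brms model like the one above):

library(bayestestR)

# Test for practical equivalence: is the HDI of each coefficient
# inside, outside, or only partly inside the ROPE?
equivalence_test(fit)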

I got warnings about possible multicollinearity between some of my parameters. The problem is that I then should not trust the results, because multicollinearity may shift the posterior distributions towards or away from the ROPE.

How can I properly estimate whether there is inflation in my models due to these correlations? What threshold for pairwise correlations should I consider intolerable (e.g. > 0.9)?
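
So far the only thing I could think of is looking at the pairwise correlations between the posterior draws of the coefficients, something like this (a sketch, and the 0.9 cutoff is just a guess on my part):

# Correlations between the posterior draws of the fixed effects (columns named b_...)
draws <- as.matrix(fit)
betas <- draws[, grep("^b_", colnames(draws)), drop = FALSE]
beta_cor <- cor(betas)

# Flag pairs above a tentative threshold, e.g. |r| > 0.9
which(abs(beta_cor) > 0.9 & upper.tri(beta_cor), arr.ind = TRUE)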

Should I reconsider my model design and perform a univariate analysis instead?

For correlated predictors I recommend projpred. We have a pull request which brings support for categorical variables and “random” effects, so if you are in a hurry and brave you can test it right now, or wait for a moment and look for the announcement when it’s merged. projpred works very well with correlated predictors, although it answers a slightly different question: “What is the minimal set of predictors providing the same predictive performance as the full model?”. You can find case studies and videos on multicollinearity and projpred at https://avehtari.github.io/modelselection/. If you need to find all predictors with some predictive information, then univariate approaches seem to be a good choice, and we’ll soon have a paper out with more recommendations.
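
A rough sketch of the workflow, assuming a brms fit called fit (interface details may change a bit once the pull request is merged):

library(projpred)

vs <- cv_varsel(fit)        # cross-validated projection predictive variable selection
plot(vs, stats = "elpd")    # predictive performance vs. number of predictors
suggest_size(vs)            # suggested number of predictors to keep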

Thank you @avehtari. The links to your resources are really useful, comprehensive and interesting for learning more along these lines.
However, as you say, projpred seems to answer a slightly different question from mine, and fits very well from a prediction point of view. Instead, I am more interested in evaluating the effect sizes that all the regressors have on my outcome of interest.
I guess that a univariate approach will do, and/or a multivariate approach assuming some degree of inflation in the variables showing correlations.
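
For the univariate route I was thinking of simply refitting the model one predictor at a time, roughly like this (a sketch, reusing the hypothetical df and predictor names from above):

# One model per predictor, keeping the random intercept
predictors <- c("Categorical1", "Categorical2", "Categorical3")
fits_uni <- lapply(predictors, function(p) {
  brm(reformulate(c(p, "(1 | person)"), response = "Outcome"), data = df)
})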