Compare influence of covariates in model

tester · March 15, 2021, 1:54pm

I have fitted a Bayesian (ordered logit) regression model using brms, with a set of covariates (with different scales/units), say:

formula = y ~ A + B

I would now like to infer some sort of (qualitative) importance/influence of the covariate on the model. I saw that it is possible to standardize covariates directly in the brms formula:

A_s = scale(A)
B_s = scale(B)

…

formula = y ~ A_s + B_s

Is it a sound approach to standardize all covariates and then compare their estimated parameter distributions? For instance, if the parameter estimate (e.g., mean) of the standardized covariate A is larger than B, infer that A seems to be more “important” than B.
Or are there preferable ways, e.g., using posterior predictions/proportion of variance explained?

Operating System: Windows 10
brms Version: 2.14.4

torkar · March 15, 2021, 2:30pm

Generally speaking, I think it’s worthwhile to scale your predictors (if they are not categorical etc. of course…)

If, when you sample the model, you will see that A is further away from 0 than, say, B, that only tells you indirectly how large an effect A has compared to B (the effect can be negative or positive). It is also indicative of how much predictive power A has compared to B (see the library projpred for examples).

Did that answer your question? If so, then a good start is to sample the model and show the parameter estimates using mcmc_areas() in the bayesplot library.

tester · March 16, 2021, 10:09am

Thanks for the fast response and suggestions. I will check out the projpred library.

To clarify, is this because of the location shift when standardizing, i.e., subtracting by the mean? Is there a way to preserve/extract the information about the effect direction?

I wondered if there is some more advanced method, for instance, using posterior predictions, that can then also tell whether A and B have a positive or negative effect. Standardizing the data by means of subtracting the mean and dividing by the standard deviation seemed to me like a perhaps too simple way, given that A and B might not be normally distributed. It also seems more intuitive to me to interpret effects on the response rather than the predictor scale.

Btw., I saw a related topic, and it seems that variance explained might not be a better alternative.

torkar · March 16, 2021, 10:18am

What I mean is that the model might estimate an effect (a parameter) to be negative or positive. You will immediately see the direction when plotting or looking at the summary of a model.

Yes, what you should do after looking at plots etc. is to look at the size of the effect and how it varies depending on setting values on other predictors. One simple way to get a rough feeling is to plot the conditional effects.

tester · March 16, 2021, 5:02pm

Okay, so the direction of the effects for a model based on standardized data would be the same as when fitting with non-standardized data, just the magnitude changes? I might have misunderstood this before.

Yes, using conditional_effects gives a nice understanding of the effect of a predictor on the outcome. But my understanding is that this is under the condition that the other predictors are kept at their mean values. Are you aware of some kind of measure, like a single value, that can be calculated that expresses what one can “see” (visually) in the plots from conditional_effects?

torkar · March 16, 2021, 5:10pm

Yes, it’s a standardized measure, but you can always get it back to its unstandardized format.

Indeed, their mean values or the first level for factors.

Well, you can ask any questions you want and the posterior will provide a reply (sometimes the reply is gibberish due to asking the wrong question ;) If you do posterior_predict(), setting the covariates to their means (and factors to their first level) you will basically get what conditional_effects() will give you.

I think this post by @jonah summarize things in a very nice way:

tester · March 16, 2021, 5:54pm

All clear, thanks for the clarifications.

Extending the idea of comparison on the predictor scale, is it possible in a similar manner to compare a categorical with a continuous (standardized) predictor, e.g., if I have a C in the model that is binary (true/false)? Of course, the categorical predictor would not be standardized as you mentioned earlier.

torkar · March 16, 2021, 6:16pm

Aah, yes, that is very easy. You give posterior_predict(..., newdata=df) a new data.frame with what you try to predict, e.g.,

df <- data.frame(
A = 0, # if you have scaled the variable
B = 0, # same here
C=c(1,2) # i.e., your two levels (notice you should avoid using 0/1 in order to set sane priors)
)

and then posterior_predict will spit out a large data frame where you can compare the output when C=1 and C=2 :)

tester · March 16, 2021, 6:41pm

Thanks for the example on using posterior_predict.

What if I would like to make a comparison of the predictors on the parameter level? Can I, as suggested earlier, simply look at the plot from, for instance, mcmc_areas and compare how far from zero A, B, and C are?

torkar · March 16, 2021, 6:55pm

Sure you can do that and you’ll get a good feeling for how it looks like. In the end querying the posterior will give you more since you will be able to contrast differences etc.

Good luck now and come back once you have sampled you model! :)

tester · March 17, 2021, 10:48am

The sampling is complete. ;-)

One last thing I would like to ask regarding this topic is whether I could even compare standardized predictors across data sets. Say when I have fitted two models with the same formula (= A + B + C), but on two different data sets. Both data sets were separately standardized. I could then compare the mcmc_areas from both models in the same plot. Is that a sound method, too?

torkar · March 17, 2021, 10:52am

Yes you could, but why not make the two datasets part of a multilevel model? In short, they will be part of the same model, e.g.,

y ~ A + B + C + (1 | D)

Where D is an indicator of dataset (i.e., 1 or 2), which will affect the estimation of the intercept, e.g., This is only an example, and the actual formula depends on what question you want to answer.

tester · March 17, 2021, 1:33pm

Nice idea, thanks.

I missed to add, in my case the responses are on two slightly different scales (one item-response scale goes from 1 to 5, the other from 1 to 6), while the predictors are the same. Does that inhibit me from being able to compare the predictors across data sets?

torkar · March 17, 2021, 1:59pm

Aha, you’ll model the outcome as cumulative, adjacent-category or stopping ratio? Well the same thing applies basically :)

tester · March 17, 2021, 4:26pm

Cumulative, with the logit link function.

But in this case, I cannot combine both data sets to make a multi-level model (with 1|D), right? So I wonder if I could still compare the estimated parameters of the standardized predictors while fitting both models separately. They would only have different cut-point estimates due to the different scales.

torkar · March 17, 2021, 4:30pm

Why wouldn’t you be able to combine the datasets? Is the outcome the same? Are the predictors the same? Then one dataset where you have a column indicating (1 or 2), if it is dataset 1 or 2, should do?

tester · March 17, 2021, 4:39pm

One data set has possible outcomes [1, 2, 3, 4, 5, 6], and the other one [1, 2, 3, 4, 5]. That is why I thought it is not possible to combine them.

torkar · March 17, 2021, 4:41pm

No, if you have different outcomes then it’s not possible. BUT, the other dataset [1-5], is it because no-one answered 6, or could they only answer 1-5?

tester · March 17, 2021, 5:05pm

All right.

They could only answer 1-5.

torkar · March 17, 2021, 5:06pm

Aaah, then I think your toast :(

Topic		Replies	Views
Undo standardization after fit with brms brms	4	634	April 1, 2025
Brms automatically scaling response variables in a multivariate model? brms fitting-issues , brms	5	816	October 6, 2022
Interpreting beta estimates from ordinal brms brms	7	375	August 15, 2024
Scale Continuous Predictors General brms	6	1879	September 10, 2021
Results from monotonic effect versus dummy variables in brms Modeling	27	1533	March 14, 2020

Compare influence of covariates in model

Related topics