I am currently performing an exploratory analysis on an EMA dataset.
I planned to compare several models containing combinations of several theoretically motivated predictors.
All predictors are continuous on a 100-point scale; I standardized them and am using the same prior for all of them. I made sure that each model is predicting the same values of my DV.
Using the model_weights() command with both waic and loo, there seems to be one clear ‘winning’ model that gets assigned >99% of the weight. One thing that confuses me is that the two predictors in this model (judging by the estimate + CI) appear to be much weaker predictors of my DV than other variables present in models that get assigned virtually no weight. Does anybody have advice on how I could better understand why the model including these two predictors seems to have superior out-of-sample prediction compared to models of similar complexity but with what appear to be stronger predictors?
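In case it helps, a minimal sketch of my setup (formulas, data, and model names here are simplified placeholders, not my actual models):

```r
library(brms)

# Hypothetical setup: standardized EMA predictors, the same prior on all
# coefficients, and every model fit to the same rows of the DV.
b_prior <- prior(normal(0, 1), class = "b")

m1 <- brm(dv ~ pred_a + pred_b + (1 | id), data = ema, prior = b_prior)
m2 <- brm(dv ~ pred_c + pred_d + (1 | id), data = ema, prior = b_prior)

# Pseudo-BMA weights based on loo (and analogously with weights = "waic")
model_weights(m1, m2, weights = "loo")
```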
Thanks in advance,
this is hard to judge very well without seeing what models you used and what exactly the results are. Several possible causes (beyond a bug in your code, which is unfortunately always an option) I can imagine:
- The “strength” of the predictors you deem stronger is unwarranted and driven by a small set of influential observations, and loo is correctly penalizing those models.
- There is a strong correlation between the model coefficients of the two predictors, i.e. the coefficients are not well informed separately (hence a wide CI), but their combination - and hence the model prediction - is actually very tightly constrained. This could be checked by inspecting the joint posterior of the two coefficients, e.g. with a pairs plot.
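A sketch of that check (`fit` and the coefficient names are hypothetical placeholders for your actual model):

```r
library(brms)

# Joint posterior of the two coefficients: a narrow diagonal ridge
# means they are only jointly, not individually, well constrained.
pairs(fit, variable = c("b_pred_a", "b_pred_b"))

# The posterior correlation of the draws quantifies this directly
draws <- as_draws_matrix(fit, variable = c("b_pred_a", "b_pred_b"))
cor(draws[, "b_pred_a"], draws[, "b_pred_b"])
```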
- While we can learn a lot about the sign/CI of the other predictors from the data, they actually explain only a small part of the total variance, while the predictors in the highest-ranked model are less well informed, but explain a lot of the variance. This should be visible by comparing the residuals and/or the posterior for the sigma parameter.
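For the last point, the comparison could look roughly like this (`m_win` and `m_alt` are hypothetical names for your highest-weighted model and one of the near-zero-weight models):

```r
library(brms)

# If m_win explains more variance, its residual sd should be clearly smaller
posterior_summary(m_win, variable = "sigma")
posterior_summary(m_alt, variable = "sigma")

# Quick visual check of residuals against fitted values
plot(fitted(m_win)[, "Estimate"], residuals(m_win)[, "Estimate"])
```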
It can also be a combination of the above or something completely different I didn’t think of - the possibility space is IMHO quite large…
Best of luck with your model!
Thank you for your reply, Martin!
I will look into these pointers and try to learn more about them, so that I can better understand my modeling in the future.
Although you marked this as solved, I’ll add that you can see collinearity examples at Model assesment, selection and inference after selection | avehtari.github.io. All these examples show collinear covariates that cause the marginal posterior (estimate + CI) to overlap zero.
In the case of collinearity, the marginal posteriors are not reliable for assessing whether some covariate has useful predictive information. On that web site there are also links to videos explaining these issues.
Also, if you are comparing many models because you are testing all combinations of covariates, then I recommend using projpred instead of computing LOO and model weights for each model.
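A minimal sketch of that workflow (assuming a single reference model `fit_ref` containing all candidate covariates):

```r
library(projpred)

# Fit one reference model with all covariates, then run a cross-validated
# search over submodels instead of fitting every combination by hand.
vs <- cv_varsel(fit_ref)
plot(vs, stats = "elpd")   # predictive performance vs. submodel size
suggest_size(vs)           # smallest size with near-reference performance
```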
Also, there is no need to compute both waic and loo, as they estimate the same thing, but waic is more fragile. See the links at CV-FAQ for more.
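For example, in brms a single call gives the comparison, and loo additionally reports Pareto-k diagnostics that flag unreliable estimates, which waic cannot do (model objects hypothetical):

```r
library(brms)

# loo() alone suffices; check the Pareto-k diagnostics in the output
loo(m1, m2)
```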