Collinearity issues arise in multilevel multivariate models (but not in submodels)



I am new to Bayesian statistics, Stan, and brms - so I have a long way to go, but all this seems very useful for my data, so I am trying to see how far I get.

I am trying to analyze the effect of parents on their offspring, taking into account (co-)dependencies between parents, maternal effects, and developmental processes, using phenotypic data from a lab pedigree. Individual models (lm and gam) I ran before using brm indicate significant correlations between some variables (e.g. Clutchsize_Lengthmom or Pigmentation_Length, albeit with low R²). Now my goal is to conduct a path analysis with brm; it seems more meaningful to have one model that reflects the biology at work than five separate regression analyses.

However, in my brm fit (see below) the credible intervals for these estimates are extremely wide and overlap zero, and I am not sure what to make of this - I trust my visual inspections of the data, as well as the results I got from the individual models.

Why do my individual regression models detect these effects while the brm fit does not? Is this simply too little data for such a complex model?

And, more philosophically, given that the estimates have the right sign and a meaningful magnitude, just how bad are wide, zero-overlapping intervals?


EDIT: I changed the title of this post after realizing the real problem over the course of the discussion.

Model specs (all data were scaled and divided by 2):

path_mod_mates_clutch_juv <- brm(
  bf(Lengthmom ~ Lengthdad) +
    bf(Pigmentationdad ~ Lengthdad) +
    bf(Pigmentationmom ~ Lengthmom + Pigmentationdad) +
    bf(Clutchsize ~ Lengthdad + Lengthmom + Pigmentationdad + Pigmentationmom) +
    bf(Length ~ Clutchsize + Lengthmom + Lengthdad) +
    bf(Pigmentation ~ Length + Clutchsize + Pigmentationmom + Pigmentationdad),
  data = mod_data, family = gaussian(),
  chains = 4, cores = 8, seed = 17, iter = 10000, warmup = 500, thin = 10,
  inits = 0, silent = FALSE,
  prior = prior(normal(0, 1), class = b), save_all_pars = TRUE,
  control = list(adapt_delta = 0.99, max_treedepth = 20))


Family: MV(gaussian, gaussian, gaussian, gaussian, gaussian, gaussian) 
  Links: mu = identity; sigma = identity
         mu = identity; sigma = identity
         mu = identity; sigma = identity
         mu = identity; sigma = identity
         mu = identity; sigma = identity
         mu = identity; sigma = identity 
Formula: Lengthmom ~ Lengthdad 
         Pigmentationdad ~ Lengthdad 
         Pigmentationmom ~ Lengthmom + Pigmentationdad 
         Clutchsize ~ Lengthdad + Lengthmom + Pigmentationdad + Pigmentationmom 
         Length ~ Clutchsize + Lengthmom + Lengthdad 
         Pigmentation ~ Length + Clutchsize + Pigmentationmom + Pigmentationdad 
   Data: mod_data (Number of observations: 62) 
Samples: 4 chains, each with iter = 10000; warmup = 500; thin = 10;
         total post-warmup samples = 3800

Population-Level Effects: 
                                Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Lengthmom_Intercept                 0.06      0.05    -0.04     0.16       3738 1.00
Pigmentationdad_Intercept           0.18      0.06     0.06     0.30       3940 1.00
Pigmentationmom_Intercept           0.11      0.09    -0.06     0.29       3918 1.00
Clutchsize_Intercept                0.13      0.13    -0.13     0.40       3618 1.00
Length_Intercept                    0.12      0.11    -0.11     0.34       3799 1.00
Pigmentation_Intercept              0.23      0.14    -0.05     0.52       3815 1.00
Lengthmom_Lengthdad                 0.52      0.09     0.33     0.70       3834 1.00
Pigmentationdad_Lengthdad           0.22      0.11    -0.01     0.43       3818 1.00
Pigmentationmom_Lengthmom           0.19      0.23    -0.27     0.63       3442 1.00
Pigmentationmom_Pigmentationdad     0.27      0.39    -0.54     1.02       3509 1.00
Clutchsize_Lengthdad                0.13      0.31    -0.46     0.77       3864 1.00
Clutchsize_Lengthmom                0.34      0.54    -0.75     1.39       3841 1.00
Clutchsize_Pigmentationdad          0.15      0.50    -0.84     1.15       4052 1.00
Clutchsize_Pigmentationmom         -0.05      0.55    -1.10     1.10       3803 1.00
Length_Clutchsize                  -0.34      0.52    -1.30     0.74       3659 1.00
Length_Lengthmom                    0.03      0.56    -1.12     1.15       3707 1.00
Length_Lengthdad                   -0.20      0.31    -0.82     0.40       3751 1.00
Pigmentation_Length                 0.39      0.43    -0.49     1.27       3813 1.00
Pigmentation_Clutchsize            -0.27      0.40    -1.06     0.53       3814 1.00
Pigmentation_Pigmentationmom       -0.30      0.50    -1.30     0.69       3739 1.00
Pigmentation_Pigmentationdad       -0.48      0.45    -1.34     0.44       3769 1.00

Family Specific Parameters: 
                      Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sigma_Lengthmom           0.39      0.04     0.32     0.47       3512 1.00
sigma_Pigmentationdad     0.46      0.04     0.38     0.56       3476 1.00
sigma_Pigmentationmom     0.41      0.07     0.31     0.58       3899 1.00
sigma_Clutchsize          0.52      0.11     0.38     0.81       3993 1.00
sigma_Length              0.54      0.10     0.41     0.80       3622 1.00
sigma_Pigmentation        0.46      0.12     0.31     0.76       3927 1.00

Residual Correlations: 
                                        Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
rescor(Lengthmom,Pigmentationdad)           0.03      0.12    -0.21     0.27       3684 1.00
rescor(Lengthmom,Pigmentationmom)           0.17      0.21    -0.27     0.56       3607 1.00
rescor(Pigmentationdad,Pigmentationmom)     0.09      0.36    -0.59     0.73       3516 1.00
rescor(Lengthmom,Clutchsize)                0.08      0.34    -0.58     0.69       3679 1.00
rescor(Pigmentationdad,Clutchsize)          0.01      0.36    -0.65     0.68       3817 1.00
rescor(Pigmentationmom,Clutchsize)         -0.03      0.34    -0.67     0.62       3757 1.00
rescor(Lengthmom,Length)                    0.03      0.34    -0.61     0.65       3821 1.00
rescor(Pigmentationdad,Length)             -0.19      0.13    -0.43     0.08       3879 1.00
rescor(Pigmentationmom,Length)             -0.21      0.16    -0.50     0.15       3748 1.00
rescor(Clutchsize,Length)                  -0.05      0.32    -0.64     0.56       3784 1.00
rescor(Lengthmom,Pigmentation)              0.09      0.20    -0.33     0.45       3959 1.00
rescor(Pigmentationdad,Pigmentation)        0.09      0.34    -0.56     0.68       3760 1.00
rescor(Pigmentationmom,Pigmentation)        0.04      0.34    -0.62     0.67       3841 1.00
rescor(Clutchsize,Pigmentation)             0.16      0.29    -0.45     0.66       3830 1.00
rescor(Length,Pigmentation)                -0.07      0.33    -0.65     0.59       3717 1.00

Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
is a crude measure of effective sample size, and Rhat is the potential 
scale reduction factor on split chains (at convergence, Rhat = 1).
  • Operating System: Ubuntu 18.04
  • brms Version: 2.7.0


In the case of correlated covariates, the marginal posteriors can be misleading. See the collinear and bodyfat examples in


Thanks for sharing these case studies - I’ve tried to digest the information they contain over the weekend, but I have to admit that I didn’t get that far.

The take-home message I got so far (please correct me if I’m wrong) is to select the variables that go into the model more carefully (e.g. with cross-validation, loo, etc.) and potentially drop some from the final model to narrow down the parameters that really matter.
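As a sketch of what such variable selection could look like with projpred (assuming a fitted single-response brmsfit; the model and data names are illustrative, not from the thread):

```r
library(brms)
library(projpred)

# hypothetical reference model with all candidate covariates
fit <- brm(Clutchsize ~ Lengthdad + Lengthmom + Pigmentationdad + Pigmentationmom,
           data = mod_data, family = gaussian())

# projection predictive variable selection with cross-validation:
# ranks covariates and evaluates submodels by predictive performance
vs <- cv_varsel(fit)
suggest_size(vs)  # smallest submodel with comparable predictive performance
plot(vs)          # predictive performance vs. number of covariates
```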

Another thing I realized is that my variables scatter broadly around the trend, so my models explain only around 25% of the variation (and twice as much if I add random effects). Is that simply too blurry a signal to be picked up by a multilevel Bayesian model?

GAM fit of some of the data that I think are correlated, but that return a wide marginal:


UPDATE: So, I have been building simpler models that I can compare using model_weights with loo. These simpler models with one or two terms make sense on their own, and putting some of them back together into the multilevel model works fine. Then, as you said, when I add another (correlated) covariate the marginals again become less informative. Now that I have identified the problem - is there a way out of it? In the collinear example you say:

“If we are interested just in good predictions we can do continuous model averaging by using suitable priors and by integrating over the posterior. If we are interested in predictions, then we don’t first average the weights (i.e. the posterior mean), but use all weight values to compute predictions and do the averaging of the predictions. All this is automatic in the Bayesian framework.”

Automatic sounds good - so how does this work? Can I really integrate over the posterior, of different models? Or am I getting the wrong idea here…


Great, this was a good experiment to do.

What do you want: a) good predictions, b) minimal set of covariates which give good predictions, c) all covariates which contain some predictive information, d) causal interpretation?

If the models are nested, all you need is the encompassing model that includes all submodels. Depending on the number of covariates, the amount of data, and the sparsity assumption, you may want to think more about the prior on the coefficients. If the models are not nested, consider making a supermodel that includes the models as special cases, or use Bayesian model averaging or stacking (but check the encompassing-model option first).
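For the stacking/averaging route, a minimal sketch with brms (assuming two already-fitted, non-nested brmsfit objects `fit1` and `fit2`; the names are placeholders):

```r
library(brms)

# stacking weights based on approximate leave-one-out cross-validation
w <- model_weights(fit1, fit2, weights = "stacking")

# average the *predictions*, not the parameter estimates:
# pp_average() combines posterior predictions using the stacking weights
pred <- pp_average(fit1, fit2, weights = "stacking")
```

Note that this averages predictive distributions, which matches the quote above: the weights are never applied to posterior means of coefficients.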


I think I’d be happy with c) and d) in this case, as I am trying to carefully approximate the biological process in my study. Good predictions are less important for now because of the way these data were collected. The minimal set I think I already have (the nested model with the most covariates that still has dense marginals).

Ordered somewhat by importance, I would like to i) be able to show the direction of causality (i.e. the sign of the effect), ii) its size, iii) be able to say which of the (co)variates has the bigger predictive power, and iv) which covariates contain at least some predictive information. i) and ii), and possibly iii), are possible with just the submodels and their comparison. It would still be important to communicate that I have considered all covariates, also in the context of each other, so iii) and iv) as an encompassing model would be ideal.

With priors on coefficients, do you mean separate priors for each submodel? How does model averaging over non-nested models work?

Just to illustrate, this is the full model, where Lengthdad and Lengthmom are correlated and should have a measurable effect on Clutchsize (though not equally big), but they don’t (gray arrows indicate zero overlap of the posterior distributions, red a negative and green a positive estimate, width = effect size).


d) causal interpretation is much more difficult, and I recommend reading, e.g., Gelman & Hill (2007), Data Analysis Using Regression and Multilevel/Hierarchical Models.

b) the minimal set is smaller than c) all covariates with predictive information, which in turn should be a set equal to or smaller than the nested model with all covariates. I don’t understand what you mean by dense marginals.

There seems to be confusion of correlation with causation, and confusion in the terminology related to the direction of causality and the sign of the effect. I recommend reading, e.g., Gelman & Hill (2007), Data Analysis Using Regression and Multilevel/Hierarchical Models.


I just had a look at the respective chapters, and I definitely used “causal” in a different context: I meant that later, when I interpret and discuss the results, I would say a caused b because I know b causing a is biologically meaningless. But at this stage I don’t need to explicitly test for causality in my data (although there is one case where I may need to try causal inference, because I would like to know what the actual biological process is, but I’ll open a separate post for that). Thanks for pointing this out.

OK, I am still confused about this one. How can I construct an encompassing model containing all variables with predictive information when the covariates are correlated?

I hope to better illustrate what I mean with this code example (see problem 1 and problem 2). Thanks for all your help so far, and sorry if I am being unclear.

## separate models with all variables: each gives me good predictions, 
## and tells me which covariates are important: 
brm(y1 ~ a * b)
brm(y2 ~ a * b * y1)
brm(y3 ~ y1 * y2 + a * b)

## using cv_varsel() or model_weights(), I can construct simpler models
## that still give me good predictions:
brm(y1 ~ b)
brm(y2 ~ y1)
brm(y3 ~ y2)

## from these models I construct the minimal encompassing model (= b)):
brm(bf(y1 ~ b) +
    bf(y2 ~ y1) +
    bf(y3 ~ y2))

## [problem 1] However, now I dropped variables that contain predictive
## information (based on cv_varsel()), e.g. variable a.
## so I go back and include them again, to have my model with
##  all variables containing predictive information (= c)):
brm(bf(y1 ~ a * b)+
    bf(y2 ~ a * y1)+
    bf(y3 ~ y1 * y2 + a))

## [problem 2] after constructing the model with all variables containing 
## predictive information, the marginals give me bad predictions because 
## of correlating covariates


This clarifies the issue, as I didn’t understand this network structure before. You use the term encompassing in a different meaning. I meant by it the model which includes all possible smaller models as special cases (e.g. by setting some parameters to 0). I need to think about this, but meanwhile check whether this would give you useful ideas.


ok thanks! I will have a look at Gaussian Graphical Models.

I also now realize that the actual problem is that the collinearity in my data is only a problem in the multivariate “network model” that contains all my submodels, so I changed the title of this thread accordingly.


Your model problem is interesting, so please report back if you figure out something or come up with new questions.


OK, will do, but I would also like to check with @paul.buerkner whether it can be expected that correlated variables (cor. coef. ~ 0.5-0.7) lead to misleading marginals in multivariate brm models, but not in separate single-response models:

works fine:

brm(y1 ~ a * b)
brm(y2 ~ a * b * y1)
brm(y3 ~ y1 * y2 + a)

marginals are vastly spread:

brm(bf(y1 ~ a * b) +
    bf(y2 ~ a * b * y1) +
    bf(y3 ~ y1 * y2 + a))


Using bf(y3 ~ y1 * y2 + a) poses a problem, since you estimate the correlation twice (more or less): (a) via the regression coefficients of y1 and y2, and (b) via the residual correlations. If you want to turn off the latter, add set_rescor(FALSE).
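For concreteness, a sketch of the multivariate model with residual correlations disabled (variable names taken from the example above; `mod_data` is assumed to hold them):

```r
library(brms)

# set_rescor(FALSE) is added to the combined formula, so the dependence
# between responses is modeled only through the regression coefficients,
# not a second time through residual correlations
fit <- brm(
  bf(y1 ~ a * b) +
    bf(y2 ~ a * b * y1) +
    bf(y3 ~ y1 * y2 + a) +
    set_rescor(FALSE),
  data = mod_data, family = gaussian()
)
```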


This solved it - thanks! And thanks @avehtari for getting me to think more about collinearity; this was a valuable lesson.

In case anybody finds this, the resources that helped most were: