Interpretation of cor term from multivariate animal models

Hi all,

I’m using brms to fit animal models estimating heritability and genetic correlations of different traits in a wild mouse species. It’s my first foray into both animal models and Bayesian modeling, so my question is extremely simple, but one I need to ask to make sure I’m understanding how to interpret my results: Can I directly interpret the ‘cor’ parameter from multivariate models as the genetic correlation between two traits?

For models fitted using MCMCglmm, one must calculate genetic correlations from covariances, so there is some post-processing of results. I haven’t found a way to do this in brms, but the output from multivariate models does provide a “cor” term between the intercepts of the two traits. This topic was mentioned, and responded to, in a post from a few years ago (msg/brms-users/Edc_dA7uWxM/JebX7Xh6AQAJ), but I’m not 100% sure I’m interpreting it correctly. After reading that post and then Paul’s recent tutorial on multivariate modeling, I believe I can interpret this term as I have described above, but I definitely want to make sure I’ve understood this well.

For example, in Paul’s vignette, tarsus and back are estimated to have a cor of -0.53:

Group-Level Effects: 
~dam (Number of levels: 106) 
                                     Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(tarsus_Intercept)                     0.48      0.05     0.40     0.58 1.00     1011     1276
sd(back_Intercept)                       0.25      0.08     0.10     0.40 1.01      301      564
cor(tarsus_Intercept,back_Intercept)    -0.53      0.22    -0.94    -0.09 1.01      457      405

Does this mean that the genetic correlation is = -0.53 (i.e. no further calculations are necessary, unlike in MCMCglmm)?

Much thanks in advance,


The cor terms indeed give you the correlation between the effects associated with different levels of the factor, without the need for further post-processing. Whether you can interpret this correlation as the genetic correlation is a much trickier question, and if you want us to help you figure it out, we would need more details: what exactly was the experimental design? What does the full formula look like? Though the problem of interpretation is almost certainly the same whether you use MCMCglmm or brms.
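As a side note, the full posterior of that parameter can be pulled out directly for summaries or derived quantities like heritability. A minimal sketch, assuming a fitted model `fit` like the one from Paul's vignette; the parameter names below follow brms naming conventions but are my guess, so verify them against `parnames(fit)`:

```r
# posterior draws of the dam-level correlation (name guessed; check parnames(fit))
cor_draws <- posterior_samples(fit, pars = "cor_dam__tarsus_Intercept__back_Intercept")
quantile(cor_draws[[1]], probs = c(0.025, 0.5, 0.975))

# a per-draw heritability-style variance ratio for tarsus could be derived similarly
sd_g  <- posterior_samples(fit, pars = "sd_dam__tarsus_Intercept")[[1]]
sigma <- posterior_samples(fit, pars = "sigma_tarsus")[[1]]
h2    <- sd_g^2 / (sd_g^2 + sigma^2)
```

Working with the draws like this keeps the full posterior uncertainty, rather than just the point estimate printed in the summary.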

Best of luck with your model!


Thank you so much for your quick reply, and yes, I would very much appreciate the additional help of understanding genetic correlations.

I have a population of wild singing mice for which I have taken a variety of morphometric and behavioral measurements. Sample sizes for each trait vary, but are typically around 100. I also genotyped animals to infer pairwise genome-wide relatedness. One basic question I am asking with these data is whether there are genetic correlations between pairs of morphometric and behavioral traits, that is, do certain aspects of body condition predict characteristics of mouse song?

For example, does body size (response trait 1) predict song length (response trait 2)? I am specifying sex and trapping site (N = 5) as fixed covariates in the model, which looks like this:

fit <- brm(
  mvbind(Resp1, Resp2) ~ Site + Sex + (1 | p | gr(ID, cov = A)),
  data = mice,                # phenotypic data
  data2 = list(A = covmat),   # genetic covariance matrix
  family = skew_normal(),
  control = list(adapt_delta = 0.9999999999, stepsize = 0.001, max_treedepth = 40),
  warmup = 1000, iter = 5000,
  chains = 4, cores = 1, thin = 1
)

And the output looks like this:

 Family: MV(skew_normal, skew_normal) 
  Links: mu = identity; sigma = identity; alpha = identity
         mu = identity; sigma = identity; alpha = identity 
Formula: Resp1 ~ Site + Sex + (1 | p | gr(ID, cov = A)) 
         Resp2 ~ Site + Sex + (1 | p | gr(ID, cov = A)) 
   Data: mice (Number of observations: 100) 
Samples: 4 chains, each with iter = 5000; warmup = 1000; thin = 1;
         total post-warmup samples = 16000

Group-Level Effects: 
~ID (Number of levels: 100) 
                                     Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Resp1_Intercept)                      0.18      0.12     0.01     0.44 1.00     3371     6218
sd(Resp2_Intercept)                      0.56      0.29     0.03     1.07 1.02      234      271
cor(Resp1_Intercept,Resp2_Intercept)     0.07      0.53    -0.92     0.95 1.00     1575     3371

Population-Level Effects: 
                Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Resp1_Intercept    -0.56      0.20    -0.95    -0.14 1.00    20661    12457
Resp2_Intercept    -0.03      0.25    -0.53     0.46 1.00    16102    11927
Resp1_SiteMH        0.04      0.22    -0.36     0.49 1.00    18503    11470
Resp1_SitePC       -0.05      0.96    -1.51     2.19 1.00    19064    10293
Resp1_SitePH        0.36      0.41    -0.38     1.24 1.00    17898    13069
Resp1_SexM          0.61      0.21     0.16     1.00 1.00    21675    12054
Resp2_SiteMH       -0.40      0.36    -1.12     0.32 1.00    11391     9665
Resp2_SitePC       -1.12      1.09    -3.19     1.05 1.00    15034    12324
Resp2_SitePH        0.55      0.55    -0.50     1.64 1.00    16617    12890
Resp2_SexM          0.18      0.28    -0.36     0.74 1.00    13215    11074

Family Specific Parameters: 
            Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma_Resp1     0.94      0.08     0.79     1.11 1.00     8919     9139
sigma_Resp2     0.82      0.22     0.29     1.14 1.02      241      180
alpha_Resp1    -5.56      1.84   -10.00    -2.81 1.00     9967     8847
alpha_Resp2    -0.81      2.69    -6.89     4.95 1.00     4015     2424

Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Warning message:
There were 21 divergent transitions after warmup. Increasing adapt_delta above 0.9999999999 may help. See 

(A separate issue: some of these models produce divergent transitions (20-100), even though the univariate models for the same traits produce none, despite the adapt_delta value being as high as it is and despite using skew normal to better accommodate the skewness of the data.)

Again, much thanks for the help, and please let me know if I have neglected to include important information.


This is worrisome, and it is quite dangerous to interpret the results without figuring out the original cause. A likely candidate is either that the data do not contain enough information to inform the model, or that you need somewhat more relevant priors for your parameters.

  1. Do the problems persist if you put narrow priors on everything (check get_prior for more info on priors in brms)?
  2. Can you fit the model without divergences (and without extreme adapt_delta and max_treedepth) if you omit one of the terms? Or by taking only a single response and avoiding modelling correlations (i.e. using 1||gr(ID, cov = A))? This might help narrow down where the problem lies.

Back to the main question: I’ll admit I have only a very shallow understanding of these phylogenetic models, but I’d like to take this as an opportunity to understand them better, and I hope that ends up helping you as well :-) But please check whether what I say makes sense. I’ll also ask @maxbiostat if he has time to check my reasoning.

My starting point is: what if there was only non-genetic correlation (for example, mice that ate more are bigger, have larger stamina, and can thus sing longer), while both traits still have a genetic background? What would the (1|p|gr(ID, cov = A)) term do? The varying intercepts for both Resp1 and Resp2 are constrained to covary with the genetic similarity. Since the correlation we posited is independent of genotype, this will tend to push the estimated correlation down, as the non-genetic variation goes against the covariance in A. However, there is correlation in the responses, which will push the estimated correlation up. I think that if the information in A is weak, the sample size is low, and the non-genetic correlation is large, you might get quite high correlation estimates even if there is actually no genetic correlation.

I am not sure you can easily disentangle these two sources of correlation. If you had a lot of data, you might be able to add an additional (1|p|ID) term to model the non-genetic correlations, but I don’t think both terms would be well informed by the data in most cases… Alternatively, you might have domain knowledge indicating that the non-genetic correlation should be small, or try to determine to what extent the enforced phenotypic covariance suppresses non-genetic (i.e. independent of phylogeny) correlations (this could probably be done with a simulation study).

A further pedantic note: I think estimating the correlation and determining whether “body condition predicts characteristics of mouse song” are two slightly different questions. If you tried to predict mouse song with body condition (i.e. with Resp1 ~ Site + Sex + Resp2 + (1|gr(ID, cov = A))), you could IMHO get different conclusions than from examining the correlation in your model. One example I can think of: if Resp2 was not uniformly distributed across sites and sex, then Resp2 could turn out to be the strongest predictor and render Site and Sex weak predictors, which is less likely to happen in the formulation you’ve used. (Once again, this is just my guess, so please double check.) Also, unless “genetic correlation” has some unintuitive technical meaning I missed, the way one trait predicts another can be totally non-genetic - maybe I am just confused about something basic here? Also not sure this is very relevant to your inquiry.

Does that make sense to you?

BTW: cool dataset, I’ve never heard of mouse song before and I am amazed such a phenomenon exists!



Unfortunately, I’m short on both time and knowledge about this sort of model. My phylogenetics expertise relates to actually building them (phylogenies), not necessarily to comparative methods. People like @dpascall, @diogro and @Ax3man might be able to help.


@martinmodrak Thank you mille fois for your kind and thoughtful response. Your time is much appreciated!

I will address all your points below to the best of my abilities:

  1. Yes, these problems persist with different prior specifications.
  2. By omitting one of the terms, do you mean one of the fixed covariates (i.e. Site or Sex)? If so, omitting these covariates did not mitigate the divergences.
     Re: the second question, do you mean “a single response” as in a univariate model (see below)? If so, both the Resp1 and Resp2 univariate models have no divergent transitions. Both are specified with the same priors and skew-normal families.
# univariate example
fit <- brm(
  Resp1 ~ Sex + Site + (1 | gr(ID, cov = A)),
  data = mice, data2 = list(A = covmat),
  family = skew_normal()
)

Regarding my originally stated research question(s):

I realized today that I had stated it rather inaccurately. As you correctly pointed out in your reply, the question “does body condition predict mouse song?” is indeed different from “what proportion of the covariance between the traits is explained by genetics?”. The former I have modeled sans genetic covariance matrices (basic GLMMs); after identifying those pairs of phenotypes with significant relationships, I next wanted to ask whether these phenotypically correlated traits also share a genetic correlation, i.e. the latter question. This is what I should have written in my original post. I hope I am now expressing myself less ambiguously.
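In brms terms, a rough sketch of how I see the two questions mapping onto models (using the same variable names as my earlier code; the exact formulas are my own paraphrase, so treat them as illustrative):

```r
# Q1 (phenotypic): does body size (Resp1) predict song length (Resp2)?
# A plain GLMM with no relatedness matrix.
fit_pheno <- brm(
  Resp2 ~ Site + Sex + Resp1,
  data = mice, family = skew_normal()
)

# Q2 (genetic): do the traits share genetic (co)variance?
# The multivariate animal model with the relatedness matrix A.
fit_gen <- brm(
  mvbind(Resp1, Resp2) ~ Site + Sex + (1 | p | gr(ID, cov = A)),
  data = mice, data2 = list(A = covmat), family = skew_normal()
)
```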

I’m afraid I do not completely understand this part of your response. Does this mean that despite modeling genetic correlations between the two traits with the (1|p|ID) term, existing phenotypic correlations (i.e. non-genetic correlations) can muddle the results, and I will therefore not be able to discriminate the origins of my estimates? I would like to include this estimation because it is interesting biologically, but it sounds like it may be difficult–or impossible–with sparse data?

Also: what does it mean exactly for information in A to be weak? Would an example be a pedigree that is not highly connected? How might intermediate or weak non-genetic correlations influence these estimates?

Yes, I agree this is different from a genetic correlation (and not what I am trying to ask but indeed how i originally posed my question!).

Thanks! They are pretty cool animals. All mice vocalize, but most do it at frequencies inaudible to human ears. Ours (Scotinomys teguina) make audible trilling songs, sort of like cricket trills.

And thanks again for such a thorough response and your time thus far!

@maxbiostat thanks as well for passing along this post!


Hi Tracy.

The correlation in the random effects with the pedigree is indeed the genetic correlation. But I never had any luck fitting these models in brms, and have relied on asreml, MCMCglmm (tutorial here) and the custom stan scripts you can find here. This last one is pretty experimental, but I would be glad to help you if you want to try to use it.

One useful thing might be to use simulated data to check if your model is behaving properly. I do this here, including residual correlations that @martinmodrak suggested might be a problem. You should definitely include residual correlations in the final model (the (1|p|ID) term).
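A minimal version of such a simulation-based check (my own sketch, not the linked scripts: draw breeding values with a known genetic correlation from your relatedness matrix `covmat`, then see whether the model recovers it) could look like:

```r
library(MASS)  # for mvrnorm

set.seed(1)
n <- 100
A <- covmat                         # your n x n genetic relatedness matrix
G <- matrix(c(1, 0.5, 0.5, 1), 2)   # true genetic (co)variance, cor = 0.5

# breeding values for both traits jointly: u ~ N(0, G %x% A);
# the first n elements belong to trait 1, the next n to trait 2
u <- mvrnorm(1, mu = rep(0, 2 * n), Sigma = kronecker(G, A))

mice_sim <- data.frame(
  ID    = rownames(A),
  Resp1 = u[1:n]           + rnorm(n, 0, 1),  # independent residual noise
  Resp2 = u[(n + 1):(2*n)] + rnorm(n, 0, 1)
)
# refit the animal model to mice_sim and check whether the posterior for
# cor(Resp1_Intercept, Resp2_Intercept) covers the true value of 0.5
```

If the posterior from simulated data with your actual sample size and your actual A matrix cannot recover a known correlation, that tells you something important before interpreting the real estimate.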

Regarding the divergent transitions, do they occur if you just remove the individual-level random effect? Probably not, right? Does brms let you change the correlation prior? Did you explore that? I don’t know what the default is for a single correlation, but this would be the prior to change in order to check whether the problem is in the correlation estimate. An LKJ prior with a high η should remove the divergent transitions if there is too little information in the pedigree to estimate the genetic correlation. This appears to be the case, given the [-0.92, 0.95] posterior interval.

Also, are the data scaled in any way? Scaling the responses to have unit variance can also help with the computation.
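For example (a trivial sketch, assuming the `mice` data frame from above):

```r
# center each response and scale it to unit variance before fitting
mice$Resp1 <- as.numeric(scale(mice$Resp1))
mice$Resp2 <- as.numeric(scale(mice$Resp2))
```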

Hope this helps, cheers!


I have fiddled about with multivariate animal models before. But never in brms and I’m not hugely certain about how the syntax for multivariate models works, if I have a chance I’ll take a look at it later and see if I can work something out.


Hi Diogo,

Thanks so much for your input! If you have additional time, I have follow up questions for you…

If the cor term is indeed the genetic correlation term, I guess my main question would be something that @martinmodrak alluded to: how can you interpret it? I suppose that is partially a biological question, but I am concerned about naively ignoring mathematical considerations. If the models are appropriate (i.e. no divergent transitions etc), are there other considerations I should make for the interpretation of the estimate? From what I understood, Martin described situations in which the correlation might get inflated (or deflated). Is this something you consider when interpreting the genetic correlation estimate?

Was there a reason you could not implement multivar models in brms? I did originally explore MCMCglmm, which I found easy to implement, and I found Pierre de Villemereuil’s tutorial extremely helpful. (And thanks for your tutorial suggestion–I had somehow not encountered that one!) However, with MCMCglmm, I had problems with prior sensitivity and ‘zero attraction’ of the posterior distributions, so Pierre suggested I try brms, which would allow me to use “gentler” priors and potentially circumvent these problems. I found that my brms estimates were indeed more robust than MCMCglmm to prior selection for data of both small sample size and low heritability.

I have not yet done this–in your opinion, is this necessary? I have done posterior-predictive checks on all univariate models and the suite of diagnostic tests as outlined in Martin’s and Jonah Gabry’s vignette, but I am not sure what is considered the best practice for publishable results. I’ve never done data simulations before (and it seems daunting!), but I will take a look at your scripts, thank you.

You’re correct: divergent transitions are gone if I remove individual-level random effect.

Taking your advice, I started exploring changes to the correlation prior. The default is lkj(1), and as I started increasing the shape parameter, the divergent transitions decreased. They do not, however, go completely away. I don’t have a profound understanding of prior specification, but I believe my other priors are generally reasonable:

priors <- c(
  set_prior("normal(0, 1)", class = "sd", coef = "Intercept", group = "ID", resp = "Resp1"),  # also tried normal(0, .5) for sd and sigma classes
  set_prior("normal(0, 1)", class = "sd", coef = "Intercept", group = "ID", resp = "Resp2"),
  set_prior("normal(0, 1)", class = "sigma", resp = "Resp1"),
  set_prior("normal(0, 1)", class = "sigma", resp = "Resp2"),
  set_prior("normal(0, 10)", class = "Intercept", resp = "Resp1"),
  set_prior("normal(0, 10)", class = "Intercept", resp = "Resp2"),
  set_prior("lkj(20)", class = "cor")  # played with values > 1, transitions persist at all values
)

Yes, they are centered at 0 and scaled so that SD=1.

Thanks again so much to everyone who is chiming in.


Yeah, I’m also curious whether there’s a reason why MCMCglmm would be preferable. I can’t think of any reason why brms couldn’t do the same thing but with additional flexibility, as you point out, but maybe there’s something that’s not occurring to me. Certainly Stan itself will give you more flexibility than MCMCglmm, but brms doesn’t expose all of Stan’s functionality, so I suppose there could be things that require writing the Stan code yourself rather than using brms.


MCMCglmm is custom-made for these types of models, so for one thing it has efficient implementations for several routines that are necessary. So speed is one thing, although it might just be that for this example both MCMCglmm and brms run in similar time.


That’s basically what I was interested in. But it is still kind of hard to guess the cause. Just to fiddle around a bit more: what if you use both responses, but play with the correlation term (I would try at least (1 || ID), (1 |p| ID) and (1 | gr(ID, cov = A)))? Which of those give you divergences?
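Concretely, the variants I have in mind would be (same covariates throughout, only the group-level term changes):

```r
# no cross-response correlation, no relatedness matrix
mvbind(Resp1, Resp2) ~ Site + Sex + (1 || ID)

# correlated intercepts between responses, but ignoring relatedness
mvbind(Resp1, Resp2) ~ Site + Sex + (1 | p | ID)

# relatedness matrix, but no cross-response correlation
mvbind(Resp1, Resp2) ~ Site + Sex + (1 | gr(ID, cov = A))
```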

I am really working from my intuitions and not from any rigorous math/simulations, so it is quite likely I am wrong. I’ll still try to explain my intuitions as best I can. The problem, IMHO, is (as I think @diogro alluded to) that you need a (1 | p | ID) term to catch phenotypic correlations (assuming those are random across the population, not correlated with genes). If you only include (1 | p | gr(ID, cov = A)), the correlation coefficient might “soak up” some of the phenotypic correlation, as there is no other part of the model where it could be reflected. If you have a lot of individuals and the correlations in A are generally strong, then (1 | p | gr(ID, cov = A)) will IMHO be less likely to “soak up” phenotypic correlation, as that would violate the correlation structure within each response (i.e. if the correlation is non-genetic and there is phenotypic variation in Resp1, you would have mice that are genetically closely related but phenotypically dissimilar in Resp2; this goes against the correlations in A, and that variance might get attributed to the overall sigma instead). If, on the other hand, you have few individuals and the correlations are weak, there is little penalty for phenotypic correlations contaminating the genetic ones.

A hopefully related example: if you examined only monozygotic and dizygotic twins, the correlations within monozygotic twins and within dizygotic twins would each be a mix of phenotypic and genetic correlations. But if you can assume the phenotypic correlations are the same in both cases, you can interpret the difference between the two correlations as related to the genetic component. (1 | p | ID) could represent correlation that is the “same” regardless of genotype, and so (1 | p | gr(ID, cov = A)) should capture the “rest” of the correlation. (Once again, I hope I am not misunderstanding something fundamental here.)

The problem is that having both (1 | p | gr(ID, cov = A)) (genetic correlation) and (1 | p | ID) (phenotypic correlation) in the same model sounds like a recipe for fitting issues unless you have tons of data (and maybe even if you have tons of data). Moreover, you already have fitting issues when using just one of those terms and adding more terms rarely makes a model better behaved…

EDIT: When using two correlation terms it would be necessary to let them differ, so the hypothetical full formula would be Sex + Site + (1 | p1 | gr(ID, cov = A)) + (1 | p2 | ID).


I’m afraid I’m currently awash with deadlined COVID stuff, so I still can’t commit to digging into this too deeply. But on the genetic/non-genetic correlations, this is worth looking at:

Tutorial 2 has a bivariate animal model example, and I agree with everyone else that modelling the covariance/correlations in the residuals is important.


brms definitely takes a lot longer than MCMCglmm to run, for my models at least!

Hi Tracy,

100 observations is very few for quantitative genetic analyses. As a very vague rule of thumb, in pig populations we used to reckon on about 1000 for heritability estimation (i.e. univariate) and closer to 2000 for correlations. And in those commercial populations, for most traits, the same animals would be recorded for at least a large block of the traits.

Are the same mice recorded for more than one trait? All traits? Some subset? That would influence the existence of covariances at the residual level.

Single observations per animal for each trait, or repeated?

And doing frequentist (ASREML, DFREML) analyses we would also build up any analyses - use univariate results as starting values for bivariates, bivariates as starting values for multivariate (just so you’re starting near to where you want to end up). I’m not sure how to do that in a Bayesian analysis framework.

I suspect you will struggle just because of data quantity. Sorry.



…They all give me divergences haha.

Ok–I think I understand what you and @diogro are talking about now with the phenotypic and genetic correlation terms, thanks for clarifying! However, it looks like I can’t model a repeated group-level effect?
brms version 2.13.0
Error: Duplicated group-level effects are not allowed. Occured for effect 'Intercept' of group 'ID'.
Looks similar to a previously posed (unsolved) question here.

As the general consensus seems to be, the data seem too few to be able to fit these models really well. An alternative I will consider if I cannot build an appropriate bivariate model would be to report how large a sample I would need for this analysis to be possible, and I’ll use advice from this post as a starting point to understanding how to do this in a Bayesian framework.


Hi Ron, thanks for the input. Yes, ours is an extremely small dataset compared to human genetics and laboratory studies. We are interested in adapting quantitative genetics methods to study wild populations. Because of the logistics of studying natural populations, researchers interested in ecologically meaningful traits and behaviors will generally have scarce data. There are beginning to be quant. genetics studies in small and wild populations, including heritability studies by Charmantier, Perrier, de Villemereuil and Morrissey.

As for the phenotypes, the mice included in the models are only ones who have both traits 1 and 2, and these are single observations (or collapsed into single observations).


It’s not clear to me why you are using the random effect term (1|p|gr(ID, cov=A)) instead of the simpler (1|gr(ID, cov=A)). You have only one random-effect term, so you don’t need a specific identifier for it.

Can you please post the code you used for the MCMCglmm model?

I think those reveal the source of the divergences: for many families, individual-level random effects can easily be non-identified when you have only one observation per individual. For the case of (1 || ID), you have a Gaussian likelihood and add an i.i.d. Gaussian to each row separately. But since Gaussian + Gaussian is another Gaussian, there is no way to distinguish between the variance of the varying intercept and the variance of the likelihood; only their sum (and hence the sum of squares of the two standard deviations) is identified. The additional covariance structure is then probably insufficient to constrain the posterior and avoid this degeneracy.

If I am right, you would expect the posterior to show a strong negative correlation between the fitted sd of the varying intercept and the sigma of the likelihood (as can be seen, e.g., via pairs(fit, pars = ...)).
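A quick way to check this (the parameter names here are my guess from brms naming conventions; verify them against parnames(fit)):

```r
# look for a ridge/banana shape: a strong negative relationship between the
# group-level sd and the residual sigma would indicate the degeneracy
pairs(fit, pars = c("sd_ID__Resp1_Intercept", "sigma_Resp1"))
```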

And if this is indeed the case, I fear the problem is insurmountable unless you would have multiple measurements for at least some of the individuals to let you distinguish between the measurement variability and between-mice variability. MCMCglmm has to encounter exactly the same problem, but (I’ve heard that) Gibbs samplers simply lack a good way to diagnose the problem, while with NUTS you get divergences that signal the issue.

Does that make sense?

Since this is a multivariate formula, it will make the effect correlated between the two responses (because mvbind(A,B) ~ something is basically a shorthand for bf(A ~ something) + bf(B ~ something) so the p is actually mentioned in both the formulae) - does that make sense?
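Spelled out (a sketch of the expansion, using the formula from your model):

```r
# the multivariate shorthand...
mvbind(Resp1, Resp2) ~ Site + Sex + (1 | p | gr(ID, cov = A))

# ...is basically equivalent to:
bf(Resp1 ~ Site + Sex + (1 | p | gr(ID, cov = A))) +
  bf(Resp2 ~ Site + Sex + (1 | p | gr(ID, cov = A)))
# the shared 'p' appearing in both formulas is what induces the
# cross-response correlation of the varying intercepts
```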


Since this is a multivariate formula, it will make the effect correlated between the two responses (because mvbind(A,B) ~ something is basically a shorthand for bf(A ~ something) + bf(B ~ something) so the p is actually mentioned in both the formulae) - does that make sense?

Yes, thank you. Silly me, I forgot to consider the translation into a sum of two bf calls.