Why LKJ prior for correlation between random effects? Can we set asymmetrical priors? e.g. nonnegative

I’m doing a project that uses brms (logit) for a Bayesian multilevel model with three levels (time, people, and company); both people and company have random intercepts and random slopes. For this particular setting, we have a prior belief that the correlation between the intercept and slope should be at least weakly positive, but the default in brms is the LKJ prior, which is symmetric around zero no matter what shape parameter is specified. Can anyone help me understand:

  1. what is the rationale for brms to use LKJ prior?
  2. what can I do if I want to impose a prior distribution for correlation to be >0?

I’ve been searching for answers for a while but haven’t found any conclusion yet. Any hint is appreciated!


I think it is because it appears to work reasonably well as a weakly informative prior for many use cases. Nothing more, nothing less. In fact, the LKJ prior has some counterintuitive properties (e.g. LKJ(1) is uniform over all correlation matrices, but the implied marginal distribution of an individual correlation is not uniform over (-1, 1)). Also, there don’t appear to be many known prior distributions for correlation matrices with “neat” mathematical properties.

The best treatment I’ve seen of this topic is: Informative priors for correlation matrices: An easy approach | Stephen R. Martin, PhD. It feels like a hack (because, IMHO, it is a bit hacky), but I am not aware of anything better.
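For intuition about what a nonnegative prior on individual correlations induces, here is a minimal rejection-sampling sketch in plain NumPy (not brms/Stan, and not Martin’s actual method, which works inside the model; the 3x3 size and the uniform(0, 1) priors are illustrative assumptions). It puts a nonnegative prior on each pairwise correlation and keeps only the draws that form a valid PSD matrix:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_positive_corr(n_draws=50_000):
    """Draw candidate 3x3 correlation matrices whose off-diagonal
    entries are uniform(0, 1) a priori, keeping only the PSD ones."""
    kept = []
    for _ in range(n_draws):
        r12, r13, r23 = rng.uniform(0, 1, size=3)  # nonnegative "priors"
        R = np.array([[1.0, r12, r13],
                      [r12, 1.0, r23],
                      [r13, r23, 1.0]])
        if np.linalg.eigvalsh(R).min() >= 0:       # PSD check
            kept.append((r12, r13, r23))
    return np.array(kept)

draws = sample_positive_corr()
print(f"accepted {len(draws)} of 50000 candidate draws")
print(f"induced marginal mean of one correlation: {draws[:, 0].mean():.3f}")
```

Note that some candidates are rejected even though every entry is in (0, 1): the PSD constraint still couples the correlations, so the marginal prior you actually get is not quite the marginal prior you wrote down.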

Best of luck with your model!


Is it possible to have it both ways, though? Can you have a uniform probability measure over all of the correlation matrices and uniform marginals?


I don’t understand this stuff well, but it appears that those two properties are contradictory.


Does “a uniform probability measure over all of the correlation matrices” not define a unique probability distribution, which is the one given by LKJ(1)?

Yep, that’s what I was trying to get at.

Think so, good point. So the answer to my question is ‘Nope, you can’t have your uniform cake and eat it uniformly too’.


Stephen Martin provides a good explanation of why a uniform prior over correlation matrices cannot lead to a uniform marginal distribution for any correlation coefficient (unless we’re dealing only with 2 dimensions):

The answer lies in the constraints of the correlation matrix. Correlation matrices must be symmetric and positive semi-definite (PSD). That PSD constraint alters where probability mass can exist, and where it will accumulate marginally. When you only have two variables, this is not an issue. But when you have more than two variables, one correlation’s value constrains what the other correlations can be, if the PSD constraint is to be met. I.e., you cannot create a correlation matrix, fill the off-diagonals with uniform(-1, 1) values, and expect it to be PSD. As K increases, the probability that a correlation matrix built from uniformly distributed elements is also PSD rapidly decreases toward zero.
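The claim in the quote is easy to check numerically. Here is a small NumPy simulation (the draw counts are arbitrary choices) that fills the off-diagonals of a K x K matrix with independent uniform(-1, 1) values and records how often the result is PSD:

```python
import numpy as np

rng = np.random.default_rng(1)

def psd_fraction(K, n_draws=10_000):
    """Fraction of symmetric matrices with unit diagonal and
    uniform(-1, 1) off-diagonals that are positive semi-definite."""
    hits = 0
    for _ in range(n_draws):
        R = np.eye(K)
        iu = np.triu_indices(K, k=1)
        vals = rng.uniform(-1, 1, size=len(iu[0]))
        R[iu] = vals
        R[(iu[1], iu[0])] = vals        # keep the matrix symmetric
        if np.linalg.eigvalsh(R).min() >= 0:
            hits += 1
    return hits / n_draws

for K in (2, 3, 5, 7):
    print(f"K={K}: PSD fraction ~ {psd_fraction(K):.3f}")
```

For K = 2 every draw is PSD (the eigenvalues are 1 ± r), but the acceptable fraction falls off quickly as K grows, exactly as the quote describes.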

You may want to know what the marginal priors for each correlation are, given a uniform LKJ and K variables. A useful result is provided, as always, by Ben Goodrich. He stated:
In general, when there are K variables, then the marginal distribution of a single correlation is Beta on the (-1,1) interval with both shape parameters equal to K / 2.
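To see concretely what Goodrich’s result implies, here is a small SciPy check (the variance simplification is my own algebra, so treat it as an illustration): if r = 2x − 1 with x ~ Beta(K/2, K/2), then Var(r) = 4 · Var(x) = 1 / (K + 1).

```python
import numpy as np
from scipy import stats

# Under LKJ(1) with K variables, a single correlation r equals 2x - 1
# where x ~ Beta(K/2, K/2), so Var(r) = 4 * Var(x) = 1 / (K + 1).
for K in (2, 5, 10, 50):
    a = K / 2.0
    sd_r = np.sqrt(4 * stats.beta(a, a).var())
    print(f"K = {K:>2}: prior sd of a single correlation = {sd_r:.3f}")
```

For K = 2 this recovers the uniform(-1, 1) marginal (Beta(1, 1), sd ≈ 0.577); as K grows, the marginal concentrates ever more tightly around zero, which is one more way to see that “uniform over matrices” and “uniform marginals” cannot both hold beyond two dimensions.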