Partial-pooling of correlation (or covariance) matrices?

I’m working on a model with a nested hierarchy and it’s straightforward to apply partial-pooling over things like location and scale parameters, but is there a way to apply the partial-pooling concept to correlation matrices too? I’m guessing it would actually have to be implemented in terms of covariance instead of correlations, but beyond that I have no intuitions.

1 Like

There has been some success in situations like this if you make the relevant correlation matrix a convex combination of a global correlation matrix and a deviation correlation matrix and estimate the weight on the global correlation matrix.


Posting here because it seems like my question is essentially the same, and there might have been updates in the year since this topic went cold.Is there an established way to deal with correlation matrices in nested hierarchical models? Alternatively, what are the pros and cons of the different possibilities? I cannot seem to find this explained in any of the stan manuals, or BDA3 for that matter.

I’m building a 3-level (individual/country/world) nested model describing an implementation of prospect theory. Each individual has 9 parameters describing their choices (think optimism, valuation of money, and loss aversion), and ideally I want to implement a correlation matrix over those parameters at the country level plus some way of pooling the information from those correlations on the world level.

For @bgoodri’s approach of the convex combination, does that largely equate to estimating an individual correlation matrix for each country? Is the difference in using the weight of the group-specific deviation matrix to encourage the deviation to be small? (as you mentioned here)

As for @Charles_Driver’s approach, I came across your discussion with spinkney and Stephen_Martin in Hierarchical Correlation? from earlier this year, but I’m not familiar enough with the Stan functions yet to really understand what’s happening in the code. Is there somewhere other than the article where it’s elaborated a bit more?

My model should be fine-ish with either a single world-level correlation matrix, or separate ones for each country, but both feel like leaving useful information on the table.

The main update from my side is a) the approach in the paper ‘works’ but turns out to have local minima issues in higher dimensions. b) to get around this I rolled a complete hack job that looks horrendous and has no theoretical justification but produces clean results on all the tests I could imagine in a fairly short timeframe, and still satisfies the requirements of 1 parameter per correlation and index independent priors. To get a correlation matrix from this function you pass it the vector of ‘unconstrained’ correlation parameters and dimension, then tcrossprod the matrix it returns. Writing some of these explorations up properly one day is vaguely on my to-do list, but quite a long way down unfortunately ;)

matrix constraincorsqrt(vector rawcor, int d){ //converts from unconstrained lower tri vec to cor sqrt
  int counter = 0;
  matrix[d,d] o;
  vector[d] ss = rep_vector(0,d);
  vector[d] s = rep_vector(0,d);
  real r;
  real r3;
  real r4;
  real r1;
  real r2;

  for(i in 1:d){ //set upper tri to lower
  for(j in 1:d){
    if(j > i){
      o[j,i] =  rawcor[counter];//inv_logit(rawcor[counter])*2-1; //divide by i for approx whole matrix equiv priors

  for(i in 1:d){
    for(j in 1:d){
      if(j > i) {
        ss[i] +=square(o[j,i]);
        s[i] +=o[j,i];
      if(j < i){
        ss[i] += square(o[i,j]);
        s[i] += o[i,j];

  for(i in 1:d){

    for(j in 1:d){
      if(j > i)  o[i,j]=o[j,i]/r;
      if(j < i) o[i,j] = o[i,j] /r;

  return o;

Thanks a lot!

I’ll definitely try and implement that.