# Hierarchical prior (for partial pooling) on correlation matrices?

#21

Does this work?

```stan
data {
  int N;
  row_vector[9] bfi[4, N];
}

parameters {
  real<lower=0, upper=1> alpha; // pooling coefficient
  cholesky_factor_corr[9] R_mu_L;
  cholesky_factor_corr[9] R_diff_L[4];
}

transformed parameters {
  corr_matrix[9] R_mu = multiply_lower_tri_self_transpose(R_mu_L);
  corr_matrix[9] R_group[4];
  for (i in 1:9) R_mu[i, i] = 1; // numerical correction
  for (g in 1:4) {
    R_group[g] = (1 - alpha) * R_mu
                 + alpha * multiply_lower_tri_self_transpose(R_diff_L[g]);
  }
}

model {
  R_mu_L ~ lkj_corr_cholesky(3);
  alpha ~ beta(2, 2);
  for (g in 1:4) {
    R_diff_L[g] ~ lkj_corr_cholesky(1 / alpha);
    bfi[g] ~ multi_normal(rep_vector(0.0, 9), R_group[g]);
  }
}
```


#22

I just tried something similar, but more specific to my case (where there are unequal numbers of correlations across groups/samples).

I estimated R_diff_L[3] (three samples/groups) as a 9x9, even though two of the three groups will only use an inner 6x6. That worked.

So the issue comes when you want to augment a matrix.

You CAN:

1. Estimate a 9x9 R_mu
2. Estimate three 9x9 R_diff matrices
3. Convex-combine each of the three R_diff matrices with the 9x9 R_mu
4. For 2 of the 3 samples, use the inner 6x6 of the result in multi_normal
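The CAN recipe is safe because a convex combination of two correlation matrices is itself a valid correlation matrix (unit diagonal, positive definite), and any leading principal submatrix of a correlation matrix is again a correlation matrix. A quick pure-Python sanity check on a made-up 3x3 example (the numbers are illustrative, not from the model):

```python
def det(m):
    # determinant by cofactor expansion along the first row
    n = len(m)
    if n == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(n))

def is_pos_def(m):
    # Sylvester's criterion: all leading principal minors positive
    return all(det([row[:k] for row in m[:k]]) > 0 for k in range(1, len(m) + 1))

# made-up 3x3 "global" and "diff" correlation matrices
R_mu   = [[1.0, 0.5, 0.3],
          [0.5, 1.0, 0.4],
          [0.3, 0.4, 1.0]]
R_diff = [[1.0, -0.2, 0.6],
          [-0.2, 1.0, 0.1],
          [0.6, 0.1, 1.0]]
assert is_pos_def(R_mu) and is_pos_def(R_diff)

alpha = 0.3
R_group = [[(1 - alpha) * R_mu[i][j] + alpha * R_diff[i][j]
            for j in range(3)] for i in range(3)]

assert all(abs(R_group[i][i] - 1.0) < 1e-12 for i in range(3))  # unit diagonal
assert is_pos_def(R_group)                                      # still PD
assert is_pos_def([row[:2] for row in R_group[:2]])             # inner block valid too
```

The last assertion is the step-4 trick: taking the inner block of the pooled matrix is always legal.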

You CANNOT:

1. Estimate a 9x9 R_mu
2. Estimate three 6x6 R_diff matrices
3. Convex-combine each of the three R_diff matrices with the inner 6x6 of R_mu
4. Use that 6x6 for the first 2 samples
5. Augment the 3rd sample’s 6x6 matrix to a 9x9 using the corresponding R_mu values; e.g., R_group_nine[7,8] = R_mu[7,8]

The latter produces non-positive definiteness.
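Here is a made-up 3x3 analogue of that failure (a 2x2 block standing in for the 6x6, with one borrowed row/column). Pooling can pull the inner correlation far from R_mu's value while the borrowed entries remain strongly correlated with both inner variables, so the assembled matrix has a negative determinant and cannot be positive definite:

```python
# pool only the inner 2x2, then borrow row/column 3 from R_mu
R_mu = [[1.0, 0.90, 0.95],
        [0.90, 1.0, 0.95],
        [0.95, 0.95, 1.0]]          # valid correlation matrix on its own
alpha = 0.9
rho_diff = -0.8                     # made-up inner 2x2 "diff" correlation
rho = (1 - alpha) * R_mu[0][1] + alpha * rho_diff  # pooled inner correlation ~ -0.63

# augmented matrix: pooled inner block, borrowed outer row/column
M = [[1.0,        rho,        R_mu[0][2]],
     [rho,        1.0,        R_mu[1][2]],
     [R_mu[0][2], R_mu[1][2], 1.0]]

def det3(m):
    # determinant of a 3x3 matrix by cofactor expansion
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
          - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
          + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

print(det3(R_mu))   # positive: R_mu is positive definite
print(det3(M))      # negative: the augmented matrix is not positive definite
```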

I’ll check to see whether the results make sense. Is it a problem that the correlations in rows/columns 7-9 are estimated from R_mu and the three R_group matrices, even though only group 3 makes use of those correlations?

#23

So, would this revision work?

```stan
data {
  int N;
  row_vector[9] bfi[4, N];
}

parameters {
  real<lower=0, upper=1> alpha; // pooling coefficient
  cholesky_factor_corr[9] R_mu_L;
  cholesky_factor_corr[6] R_diff_L[3];
  cholesky_factor_corr[9] R_diff_L2;
}

transformed parameters {
  corr_matrix[9] R_mu = multiply_lower_tri_self_transpose(R_mu_L);
  corr_matrix[6] R_group[3];
  corr_matrix[9] R_group2 = (1 - alpha) * R_mu
                            + alpha * multiply_lower_tri_self_transpose(R_diff_L2);
  for (g in 1:3) {
    R_group[g] = (1 - alpha) * R_mu[1:6, 1:6]
                 + alpha * multiply_lower_tri_self_transpose(R_diff_L[g]);
  }
}
```


#24

That worked.

OK, so adding to the CAN list, for future self:

CAN +=

1. Estimate a 9x9 R_mu
2. Estimate one 9x9 R_diff matrix and two 6x6 R_diff matrices
3. Convex-combine the 9x9 R_diff with R_mu
4. Convex-combine the 6x6 R_diff matrices with R_mu[1:6,1:6]

#25

If I did estimate all 9x9 R_diff matrices, even though for two, only the inner 6x6 is used, does that affect the estimates?
I know it’s inefficient, because it’s needless sampling.
However, it would be /cleaner/, for post-processing reasons.

#26

With proper priors, it is fine.

#27

Thank you for all your input; seriously, a lifesaver.

Also for future me (or anyone who wants to do this simply), I wrote two functions that help compute the per-group correlation matrices (or their Cholesky factors).
Each takes a global matrix, a local "diff" matrix, and alpha.
If the local matrix is smaller than the global matrix, the function uses the inner-most subset of the global matrix.
E.g., if you have a 9x9 global matrix but a 6x6 local matrix, it will use global[1:6,1:6]; this means the function requires that the local matrix correspond to the upper-left-corner subset of the global matrix.

```stan
matrix convex_combine_cholesky(matrix global_chol_cor, matrix local_chol_cor, real alpha) {
  int dim = rows(local_chol_cor);
  int global_dim = rows(global_chol_cor);
  matrix[global_dim, global_dim] global_cor = multiply_lower_tri_self_transpose(global_chol_cor);
  matrix[dim, dim] local_cor = multiply_lower_tri_self_transpose(local_chol_cor);
  matrix[dim, dim] global_cor_use;
  matrix[dim, dim] combined_chol_cor;

  if (dim < global_dim) {
    global_cor_use = global_cor[1:dim, 1:dim];
  } else {
    global_cor_use = global_cor;
  }
  combined_chol_cor = cholesky_decompose((1 - alpha) * global_cor_use + alpha * local_cor);
  return combined_chol_cor;
}

matrix convex_combine(matrix global_cor, matrix local_cor, real alpha) {
  int dim = rows(local_cor);
  matrix[dim, dim] global_cor_use;
  matrix[dim, dim] combined_cor;

  if (dim < rows(global_cor)) {
    global_cor_use = global_cor[1:dim, 1:dim];
  } else {
    global_cor_use = global_cor;
  }
  combined_cor = (1 - alpha) * global_cor_use + alpha * local_cor;
  return combined_cor;
}
```


To use them, declare a global matrix (same dimension as the largest group matrix), the group-specific "diff" matrices, and alpha.
It looks like this:

```stan
parameters {
  cholesky_factor_corr[9] R_global_L;
  cholesky_factor_corr[9] R_diff_L[4];
  real<lower=0, upper=1> alpha;
}

transformed parameters {
  cholesky_factor_corr[9] R_group_L[4];
  for (g in 1:4) {
    R_group_L[g] = convex_combine_cholesky(R_global_L, R_diff_L[g], alpha);
  }
}

model {
  alpha ~ beta(2, 2);
  R_global_L ~ lkj_corr_cholesky(3);
  for (g in 1:4) {
    R_diff_L[g] ~ lkj_corr_cholesky(1 / alpha);
    y[g] ~ multi_normal_cholesky(muVector, R_group_L[g]); // 'data' is reserved in Stan; the outcome array is named y here
  }
}
```
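For checking the combine logic outside of Stan, here is a pure-Python sketch of the same idea (my illustration, not part of the Stan model): `cholesky` is a textbook Cholesky–Crout routine, `convex_combine` mirrors the Stan function's upper-left-corner subsetting, and the matrices are made up.

```python
import math

def cholesky(a):
    # textbook Cholesky-Crout: lower-triangular L with L * L^T == a;
    # math.sqrt raises ValueError if a is not positive definite
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L

def convex_combine(global_cor, local_cor, alpha):
    # mirrors the Stan convex_combine(): if the local matrix is smaller,
    # use the upper-left corner of the global matrix
    dim = len(local_cor)
    g = [row[:dim] for row in global_cor[:dim]]
    return [[(1 - alpha) * g[i][j] + alpha * local_cor[i][j]
             for j in range(dim)] for i in range(dim)]

# made-up 3x3 global and 2x2 local correlation matrices
R_global = [[1.0, 0.5, 0.3],
            [0.5, 1.0, 0.4],
            [0.3, 0.4, 1.0]]
R_local = [[1.0, -0.2],
           [-0.2, 1.0]]

R_group = convex_combine(R_global, R_local, 0.5)
L = cholesky(R_group)  # succeeds only if R_group is positive definite
```

The Cholesky succeeding is exactly the positive-definiteness check: combining the upper-left corner of the global matrix with a valid local correlation matrix always passes, which matches the CAN list.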

#28

The problem is that you were using \BoldSymbol or something like that. I switched to \mathbf so it’d render.