Hierarchical Correlation?

Nice. Yeah, for various reasons I had it set up for matrix input, but in the MWE at least there’s no need.

I couldn’t quickly see where, but your code is broken. Edit, confusion resolved: it just needed a tcrossprod and not a crossprod.

If anyone figures out a nice way to get the Cholesky factor directly, instead of this dense matrix square root thing, I’d be very interested. Stan still complains about non-positive-definite matrices far more often than I’d like (when Cholesky decomposing), even when I construct the matrix using this form, which I think ‘should’ ensure positive definiteness.
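
To illustrate the kind of failure I mean, here’s a toy R sketch (my own contrived example, not taken from the model): a tcrossprod construction that is PSD in exact arithmetic can still sit on the numerical boundary and trip the decomposition when two rows of the factor are nearly identical.

# two nearly identical (row-normalised) rows: PSD in exact arithmetic,
# but numerically singular, so the Cholesky decomposition can still fail
L <- rbind(c(1, 0),
           c(1, 1e-9))
L <- L / sqrt(rowSums(L^2))
R <- tcrossprod(L)
eigen(R)$values  # smallest eigenvalue is ~0 (possibly slightly negative)
try(chol(R))     # may error: leading minor not positive definite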

It works fine for me. Did you copy all of it? In each case I changed the loop and the indexing.
Make sure the loop is exactly what I posted; I’m only traversing the lower triangle.

Edit: So the crossprod in the ‘nomat’ solution needed to be tcrossprod. Updated output and attachments.
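
For reference, a quick toy example in R (mine, not from the attached models) of why it has to be tcrossprod here: with the rows of the lower-triangular factor normalised to unit length, only tcrossprod gives a unit diagonal, i.e. a correlation matrix.

L <- matrix(c(   1,   0, 0,
               0.3,   1, 0,
              -0.5, 0.2, 1), 3, 3, byrow = TRUE)
L <- L / sqrt(rowSums(L^2))  # normalise each row to unit length
diag(crossprod(L))           # t(L) %*% L: diagonal is colSums(L^2), not all 1
diag(tcrossprod(L))          # L %*% t(L): diagonal is rowSums(L^2), all 1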

R output for the nonmatrix code (‘spinkney_scor_nomat.stan’)

> sp <- stan_model(file = "spinkney_scor_nomat.stan")
DIAGNOSTIC(S) FROM PARSER:
Info: integer division implicitly rounds to integer. Found int division: d * d - d / 2
 Positive values rounded down, negative values rounded up or down in platform-dependent way.
Info: integer division implicitly rounds to integer. Found int division: d * d - d / 2
 Positive values rounded down, negative values rounded up or down in platform-dependent way.

Info: Found int division at 'string', line 22, column 9 to column 20:
  (d * d - d) / 2
Values will be rounded towards zero. If rounding is not desired you can write
the division as
  (d * d - d) / 2.0
hash mismatch so recompiling; make sure Stan code ends with a blank line
> sp <- stan_model(file = "spinkney_scor_nomat.stan")

SAMPLING FOR MODEL 'spinkney_scor_nomat' NOW (CHAIN 1).
Chain 1: 
Chain 1: Gradient evaluation took 2.5e-05 seconds
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.25 seconds.
Chain 1: Adjust your expectations accordingly!
Chain 1: 
Chain 1: 
Chain 1: Iteration:    1 / 1200 [  0%]  (Warmup)
Chain 1: Iteration:  400 / 1200 [ 33%]  (Warmup)
Chain 1: Iteration:  401 / 1200 [ 33%]  (Sampling)
Chain 1: Iteration:  800 / 1200 [ 66%]  (Sampling)
Chain 1: Iteration: 1200 / 1200 [100%]  (Sampling)
Chain 1: 
Chain 1:  Elapsed Time: 0.013312 seconds (Warm-up)
Chain 1:                0.037642 seconds (Sampling)
Chain 1:                0.050954 seconds (Total)
Chain 1: 
> round(summary(sp_f, pars = "mcor")$summary, 2)
           mean se_mean   sd  2.5%   25%   50%  75% 97.5%  n_eff Rhat
mcor[1,1]  1.00    0.00 0.00  1.00  1.00  1.00 1.00  1.00 412.32    1
mcor[1,2]  0.00    0.02 0.59 -0.93 -0.53  0.01 0.56  0.91 723.59    1
mcor[1,3]  0.01    0.02 0.57 -0.92 -0.48  0.01 0.51  0.93 762.93    1
mcor[2,1]  0.00    0.02 0.59 -0.93 -0.53  0.01 0.56  0.91 723.59    1
mcor[2,2]  1.00    0.00 0.00  1.00  1.00  1.00 1.00  1.00 453.90    1
mcor[2,3] -0.01    0.02 0.54 -0.90 -0.46 -0.01 0.43  0.92 710.43    1
mcor[3,1]  0.01    0.02 0.57 -0.92 -0.48  0.01 0.51  0.93 762.93    1
mcor[3,2] -0.01    0.02 0.54 -0.90 -0.46 -0.01 0.43  0.92 710.43    1
mcor[3,3]  1.00    0.00 0.00  1.00  1.00  1.00 1.00  1.00 439.26    1

R output for the matrix code (‘spinkney_scor.stan’)

> sp <- stan_model(file = "spinkney_scor.stan")
DIAGNOSTIC(S) FROM PARSER:
Info: integer division implicitly rounds to integer. Found int division: d * d - d / 2
 Positive values rounded down, negative values rounded up or down in platform-dependent way.
Info: integer division implicitly rounds to integer. Found int division: d * d - d / 2
 Positive values rounded down, negative values rounded up or down in platform-dependent way.

Info: Found int division at 'string', line 40, column 9 to column 20:
  (d * d - d) / 2
Values will be rounded towards zero. If rounding is not desired you can write
the division as
  (d * d - d) / 2.0
recompiling to avoid crashing R session
> sp_f <- sampling(sp, data = list(d = 3),
+                  seed = 1234,
+                  chains = 1,
+                  iter = 1200,
+                  warmup = 400,
+                  refresh = 400)

SAMPLING FOR MODEL 'spinkney_scor' NOW (CHAIN 1).
Chain 1: 
Chain 1: Gradient evaluation took 2.3e-05 seconds
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.23 seconds.
Chain 1: Adjust your expectations accordingly!
Chain 1: 
Chain 1: 
Chain 1: Iteration:    1 / 1200 [  0%]  (Warmup)
Chain 1: Iteration:  400 / 1200 [ 33%]  (Warmup)
Chain 1: Iteration:  401 / 1200 [ 33%]  (Sampling)
Chain 1: Iteration:  800 / 1200 [ 66%]  (Sampling)
Chain 1: Iteration: 1200 / 1200 [100%]  (Sampling)
Chain 1: 
Chain 1:  Elapsed Time: 0.008973 seconds (Warm-up)
Chain 1:                0.024096 seconds (Sampling)
Chain 1:                0.033069 seconds (Total)
Chain 1: 
> round(summary(sp_f, pars = "mcor")$summary, 2)
           mean se_mean   sd  2.5%   25%   50%  75% 97.5%  n_eff Rhat
mcor[1,1]  1.00    0.00 0.00  1.00  1.00  1.00 1.00  1.00 412.32    1
mcor[1,2]  0.00    0.02 0.59 -0.93 -0.53  0.01 0.56  0.91 723.59    1
mcor[1,3]  0.01    0.02 0.57 -0.92 -0.48  0.01 0.51  0.93 762.93    1
mcor[2,1]  0.00    0.02 0.59 -0.93 -0.53  0.01 0.56  0.91 723.59    1
mcor[2,2]  1.00    0.00 0.00  1.00  1.00  1.00 1.00  1.00 453.90    1
mcor[2,3] -0.01    0.02 0.54 -0.90 -0.46 -0.01 0.43  0.92 710.43    1
mcor[3,1]  0.01    0.02 0.57 -0.92 -0.48  0.01 0.51  0.93 762.93    1
mcor[3,2] -0.01    0.02 0.54 -0.90 -0.46 -0.01 0.43  0.92 710.43    1
mcor[3,3]  1.00    0.00 0.00  1.00  1.00  1.00 1.00  1.00 439.26    1 

spinkney_scor.stan (1.3 KB)
spinkney_scor_nomat.stan (700 Bytes)

Cool, yes. My vague memory of it needing to be iterative (I guess I maybe fixed/improved that in the algorithm at some point…) combined with the crossprod mix-up to create confusion :)

Curious, are you referring to a case like this?

As in, you want a hierarchical correlation matrix, where there is a fixed and a random component to each group-specific correlation matrix?

For anyone finding their way here in the future: the approach discussed by myself and @spinkney gives consistent priors across indices of the correlation matrix and works well for generating data, and I had some success with it hierarchically. However, in some circumstances – I think when the correlation signal is relatively strong – it seems to cause difficulty with sampling and optimization, with lots of multimodality.

I’ve updated the code; it performs much better and copes with higher correlations with less bias (much better than the Stan LKJ approach too, it seems), but the downside is that the priors on the bivariate correlations are no longer exactly the same across indices of the matrix. It should be possible to correct this behaviour, but for the time being I’ll post the update with the ‘rough correction’ (dividing by the column index) that I guess-hacked. The code below compares my version to the LKJ approach. It doesn’t show the hierarchical form, but that is hopefully clear enough: just recompute the correlation matrix for whatever vector of unconstrained correlation parameters is relevant. I guess @mike-lawrence will be pretty interested here.

Also, the inability to edit old posts to point them to updated info is a bit annoying…

require(rstan)

smt <- '
functions{
   matrix constraincorsqrt(vector rawcor, int d){ // unconstrained vector -> lower-tri 'square root' of a correlation matrix
    matrix[d,d] o;
    int counter = 0;

    for(i in 1:d){ // fill the lower tri from rawcor, zero the upper tri
      for(j in 1:d){
        if(j > i){
          counter += 1;
          o[j,i] = rawcor[counter] / i; // rough correction: divide by the column index
          o[i,j] = 0;
        }
      }
      o[i,i] = 1; // smaller diagonal gives more prior weight to large correlations, larger gives less
      o[i,] = o[i,] / sqrt(sum(square(o[i,])+1e-8)); // normalise each row to (approximately) unit length
    }
    return o;
  }
}
data{
  int d;
  int n;
  vector[d] dat[n];
  real corpriorscale;
}
parameters{
  vector[(d * d - d) / 2] rawcor;
  vector[d] logsdvec;
  vector[d] mu;
}
transformed parameters{
  matrix[d,d] mcor;
  matrix[d, d] mcov;
  matrix[d, d] mcovchol;

  mcor = tcrossprod(constraincorsqrt(rawcor,d));
  mcov = quad_form_diag(mcor, exp(logsdvec) +1e-8);
  mcovchol = cholesky_decompose(mcov); 
}
model{
  rawcor ~ normal(0,corpriorscale);
  logsdvec ~ normal(0,10);
  mu ~ normal(0,10);
  dat ~ multi_normal_cholesky(mu,mcovchol);
}
'

sm <- stan_model(model_code = smt)


smtlkj <- '
data{
  int d;
  int n;
  vector[d] dat[n];
  real corpriorscale;
}
parameters{
  cholesky_factor_corr[d] cholcor;
  vector[d] logsdvec;
  vector[d] mu;
}
transformed parameters{
  matrix[d,d] mcor = tcrossprod(cholcor);
  matrix[d, d] mcov = quad_form_diag(mcor,exp(logsdvec));
  matrix[d, d] mcovchol = cholesky_decompose(mcov); 
}
model{
  cholcor ~ lkj_corr_cholesky(corpriorscale);
  logsdvec ~ normal(0,10);
  mu ~ normal(0,10);
  dat ~ multi_normal_cholesky(mu,mcovchol);
}
'

smlkj <- stan_model(model_code = smtlkj)



d=5
n=30
mcov <- diag(1,d)

#some high some low
mcov[lower.tri(mcov)][seq(1,(d^2-d)/2,d)] <- .8
# mcov[,1] <- mcov[,1]+3

#random
# mcov[lower.tri(mcov)]<- rnorm((d^2-d)/2)#.8

mcov[upper.tri(mcov)] <- t(mcov)[upper.tri(mcov)]
mcov <- cov2cor(mcov %*% t(mcov))
mcovchol <- t(chol(mcov))
mcor=cov2cor(mcov)
dat <- matrix(rnorm(d*n,0,1),n,d)
dat <- t(mcovchol %*% t(dat))
dat <- scale(dat)
round(cor(dat),3)
round(mcor,3)


sdat <- list(n=n,d=d, dat=dat,corpriorscale=10)
# sf <- ctsem:::stanWplot(sm,sdat,chains=4,cores=4,iter=2000)
ta<-Sys.time()
sf <- sampling(sm,sdat,chains=4,cores=4,iter=2000)
ta <- Sys.time()-ta
print(ta)
tb<-Sys.time()
# sflkj <- ctsem:::stanWplot(smlkj,sdat,chains=4,cores=4,iter=2000)
sflkj <- sampling(smlkj,sdat,chains=4,cores=4,iter=2000)
tb <- Sys.time()-tb
print(tb)

s=round(summary(sf)$summary,3)
s[grep('mcor',rownames(s)),]
slkj=round(summary(sflkj)$summary,3)
slkj[grep('mcor',rownames(slkj)),]
plot(s[grep('mcor',rownames(s)),1],pch=16,col=2,ylab='Corr.')
points(c(cor(dat)),pch=1,col='black')
# points(c(mcor),pch=16,col='blue')
points(slkj[grep('mcor',rownames(slkj)),1],pch=16,col='green')
points(s[grep('mcor',rownames(s)),4],col=2,pch=3)
points(s[grep('mcor',rownames(s)),8],col=2,pch=3)
points(slkj[grep('mcor',rownames(slkj)),4],pch=3,col='green')
points(slkj[grep('mcor',rownames(slkj)),8],pch=3,col='green')
legend('topright',c('True','CD','LKJ'),text.col=c(1,2,3),bty='n')

And in case anyone wants to do me a favour and figure out a proper solution, here is a simple script for testing:

ndim=5
mat=matrix(1,ndim,ndim)

corfunc = function(mat){
  o = mat
  d = nrow(o)
  for(i in 1:d){
    for(j in 1:d){
      if(j > i){
        o[j,i] = o[j,i] / i  # dividing by the column index was just a rough guess that semi works
        o[i,j] = 0
      }
    }
    o[i,i] = 1 # adjust this value to modify the propensity for high correlation strengths
    o[i,] = o[i,] / sqrt(sum(o[i,]^2)) # normalise each row to unit length
  }
  return(o)
}

m = tcrossprod(corfunc(mat))
mold  # result from the previous run (defined at the end of this script), for comparison
m
message('old mean = ', mean(mold[lower.tri(mold)]))
message('new mean = ', mean(m[lower.tri(m)]))
message('old diff = ', sqrt(mean((mold[lower.tri(mold)] - mean(mold[lower.tri(mold)]))^2)))
message('new diff = ', sqrt(mean((m[lower.tri(m)] - mean(m[lower.tri(m)]))^2)))
mold = m  # store this attempt for comparison on the next run

The simple script doesn’t work yet; mold needs to be defined first.

That gets defined the first time you run it; it’s just there to compare old vs. new attempts while fiddling around :)

If I use the spherical parameterization, I can get the right “denominator” and the Cholesky factor directly, without needing a cholesky_decompose.

Can you see if this does what you want?

functions{
  matrix angle_vec2mat (vector angle, int K) {
    int N = num_elements(angle);
    matrix[K, K] mat = add_diag(identity_matrix(K), rep_vector(-1, K));
    int count = K;
    int pos = 1;
    for (k in 1:K - 1){
      count -= 1;
      mat[k + 1:K, k] = segment(angle, pos, count);
      pos += count;
    }
    return mat;
  }
  
matrix angle2cor (vector angle, int N){
  matrix[N, N] angle_mat = angle_vec2mat(angle, N);
  matrix[N, N] inv_mat = identity_matrix(N);
  inv_mat[, 1] = cos(angle_mat[, 1]);
  
  for (i in 2:N) {
    inv_mat[i, i] = prod(sin(angle_mat[i, 1:(i - 1)]));
    if (i > 2) {
      for (j in 2:i - 1 ) {
        inv_mat[i, j] = cos(angle_mat[i, j]) * prod(sin(angle_mat[i, 1:(j - 1)]));
      }
    }
  }
  return inv_mat;
  }
}
data{
  int d;
  int n;
  vector[d] dat[n];
  real corpriorscale;
}
parameters{
  vector[(d * d - d) / 2] rawcor;
  vector[d] logsdvec;
  vector[d] mu;
}
transformed parameters{
  cholesky_factor_corr[d] mcor = angle2cor(rawcor,d);
}
model{
  rawcor ~ normal(0,corpriorscale);
  logsdvec ~ normal(0,10);
  mu ~ normal(0,10);
  dat ~ multi_normal_cholesky(mu, diag_pre_multiply(exp(logsdvec), mcor));
}
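
To see why this gives the Cholesky factor directly, here’s the d = 3 case written out by hand in R (my own sketch of what angle2cor computes; the angles are drawn in (0, pi) just for the illustration): each row has unit length by construction, so tcrossprod of it is a correlation matrix and the matrix itself is already the Cholesky factor.

a <- runif(3, 0, pi)  # angle_mat[2,1], angle_mat[3,1], angle_mat[3,2]
L <- rbind(c(1,          0,                     0),
           c(cos(a[1]),  sin(a[1]),             0),
           c(cos(a[2]),  cos(a[3]) * sin(a[2]), sin(a[2]) * sin(a[3])))
rowSums(L^2)             # all 1, so tcrossprod(L) has a unit diagonal
round(tcrossprod(L), 3)  # a correlation matrix, with L as its Cholesky factor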

Here’s @Charles_Driver’s graph with this model

What do you guys think of the idea whereby no multivariate normal is modelled, and instead the target is incremented by the sampling distribution of each observed bivariate correlation separately? That way, we can model each bivariate correlation hierarchically in a relatively straightforward manner. Here’s a rough-but-compiles implementation of what I mean (the one thing I didn’t have time to work out is how to add my toggle trick for centered/non-centered parameterization, so all hierarchical quantities are currently centered-parameterized):

data{

	// num_subj: number of subj
	int<lower=1> num_subj ;

	// num_var: number of variables modelled as intercorrelated
	int<lower=2> num_var ;

	// num_obs: number of observations per variable and subject
	int<lower=1> num_obs ;

	// subj_obs: array of matrices, one per subject
	matrix[num_obs,num_var] subj_obs[num_subj] ;

}
transformed data{

	// num_cors: number of correlations implied by the number of variables
	int num_cors = ((num_var-1)*num_var)/2;

	// corz_se: standard error for correlations on Fisher's Z scale implied by the number of observations
	real corz_se = 1/sqrt(num_obs-3) ;

	// sqrt_num_obs: pre-computing this for use in likelihood
	real sqrt_num_obs = sqrt(num_obs) ;
	// sd_gamma_shape: pre-computing this for use in likelihood
	real sd_gamma_shape = (num_obs - 1) / 2.0 ; // 2.0 so this is real division, not integer division

	// compute the empirical means, SDs, variances, and correlations on Fisher's Z scale
	vector[num_var] obs_means[num_subj] ;
	vector[num_var] obs_sds[num_subj] ;
	vector[num_var] obs_variances[num_subj] ;
	vector[num_cors] obs_z[num_subj] ;
	for(i_subj in 1:num_subj){
		matrix[num_obs,num_var] this_subj_obs = subj_obs[i_subj] ; //temp variable
		for(i_var in 1:num_var){
			obs_means[i_subj][i_var] = mean(this_subj_obs[,i_var]);
			obs_sds[i_subj][i_var] = sd(this_subj_obs[,i_var]);
			obs_variances[i_subj][i_var] = sd(this_subj_obs[,i_var])^2;
		}
		int k = 0 ; //temp variable
		vector[num_cors] obs_r ; //temp variable
		// looping over lower-tri to compute correlation between each column in this_subj_obs
		for(this_var in 1:(num_var-1)){
			for(this_other_var in (this_var+1):num_var){
				k += 1 ;
				obs_r[k] = (
					sum(
						( this_subj_obs[,this_var] - obs_means[i_subj][this_var] )
						.* ( this_subj_obs[,this_other_var] - obs_means[i_subj][this_other_var] )
					)
				) / (
					(num_obs-1) * obs_sds[i_subj][this_var] * obs_sds[i_subj][this_other_var]
				) ;
			}
		}
		obs_z[i_subj] = atanh(obs_r) ;
	}
}
parameters{

	// mean_mean: mean (across subjects) mean
	vector[num_cors] mean_mean ;
	// mean_sd: sd (across subjects) mean
	vector<lower=0>[num_cors] mean_sd ;
	// mean_subj: by-subject mean
	vector[num_cors] mean_subj[num_subj] ;

	// logsd_mean: mean (across subjects) log-sd
	vector[num_cors] logsd_mean ;
	// logsd_sd: sd (across subjects) log-sd
	vector<lower=0>[num_cors] logsd_sd ;
	// sd_subj: by-subject sd
	vector<lower=0>[num_cors] sd_subj[num_subj] ;

	// corz_mean: mean (across subjects) correlations on Fisher's Z scale
	vector[num_cors] corz_mean ;
	// corz_sd: sd (across subjects) correlations on Fisher's Z scale
	vector<lower=0>[num_cors] corz_sd ;
	// corz_subj: by-subject correlations on Fisher's Z scale
	vector[num_cors] corz_subj[num_subj] ;

}
model{

	////
	// Priors
	////

	// Peaked-at-zero normal(0,1) prior on mean of means
	mean_mean ~ std_normal() ;
	// Peaked-at-zero normal(0,1) prior on sd of means
	mean_sd ~ std_normal() ;

	// Peaked-at-zero normal(0,1) prior on mean of logsds
	logsd_mean ~ std_normal() ;
	// Peaked-at-zero normal(0,1) prior on sd of logsds
	logsd_sd ~ std_normal() ;

	// normal(0,.74) priors on the mean correlations
	// yields flat prior on abs(cor)<.7, and diminishing to zero by abs(cor)==1
	// view in R via: hist(tanh(abs(rnorm(1e7)*.74)),br=1e4,xlim=c(0,1))
	corz_mean ~ normal(0,.74) ;
	// Peaked-at-zero normal(0,1) prior on variability of correlations
	// (would want to do PPC for this!)
	corz_sd ~ std_normal() ;

	// hierarchical structure:
	for(i_subj in 1:num_subj){
		mean_subj[i_subj] ~ normal(mean_mean,mean_sd) ;
		sd_subj[i_subj] ~ lognormal(logsd_mean,logsd_sd) ;
		corz_subj[i_subj] ~ normal(corz_mean,corz_sd) ;
	}

	////
	// Likelihood
	////

	for(i_subj in 1:num_subj){
		obs_means[i_subj] ~ normal( mean_subj[i_subj] , sd_subj[i_subj]/sqrt_num_obs ) ;
		obs_variances[i_subj] ~ gamma( sd_gamma_shape , sd_gamma_shape ./ pow(sd_subj[i_subj],2) ) ;
		obs_z[i_subj] ~ normal(corz_subj[i_subj],corz_se) ;
	}

}
generated quantities{

	// cor_mean: the upper-tri of the group-level-mean correlation matrix, flattened to a vector for efficient storage
	vector[num_cors] cor_mean = tanh(corz_mean) ;

}

Now, that’s encoding a model with independent hierarchies for everything, but it’d also be possible to combine things with multivariate normal hierarchical structures, from just within-quantity-type (e.g. currently the subject-level mean on each variable is modelled as independent from their means on the other variables, but the set of subject-level means could be treated as MVN) to doing everything as one big MVN (capturing things like: subjects with a high mean on var1 also tend to have a high correlation between var2 and var3).
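
For reference, the likelihood above leans on three standard sampling-distribution results for the per-subject summary statistics; here’s a quick simulation check in R (my own, not part of the model):

#   sample mean     ~ normal( mu , sigma / sqrt(n) )
#   sample variance ~ gamma( (n-1)/2 , rate = (n-1) / (2 * sigma^2) )
#   atanh(r)        ~ approx. normal( atanh(rho) , 1 / sqrt(n - 3) )  (Fisher's Z)
set.seed(1)
n <- 50 ; rho <- 0.5 ; sigma <- 2
sims <- replicate(5e3, {
  x <- matrix(rnorm(2 * n), n, 2)
  x[, 2] <- rho * x[, 1] + sqrt(1 - rho^2) * x[, 2]  # induce correlation rho
  x <- x * sigma
  c(m = mean(x[, 1]), v = var(x[, 1]), z = atanh(cor(x)[1, 2]))
})
c(sd(sims["m", ]), sigma / sqrt(n))  # ~ equal
c(mean(sims["v", ]), sigma^2)        # ~ equal
c(sd(sims["z", ]), 1 / sqrt(n - 3))  # ~ equal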

Can you run that through Charles’ simulation? Then we can take a look at the output from all the methods.

Sure, though probably won’t have time to look back at this until the weekend.

(Note: I edited my code post above with a postscript.)

@spinkney It’s fast and looks fancy, but the priors over the bivariate correlations differ quite a lot depending on the column index. R code showing this is below – passing in a vector of 1s results in different correlations in each column.

@mike-lawrence It might be nice for some situations, but it doesn’t work when you actually need a correlation matrix without explicitly computing a correlation from something (as when using it for integration or for non-centered parameterizations). I’m also vaguely worried that the non-generative nature of it implies weird things that are hard to track down ;)

require(rstan)

smt<-'
functions{
  matrix angle_vec2mat (vector angle, int K) {
    int N = num_elements(angle);
    matrix[K, K] mat = add_diag(identity_matrix(K), rep_vector(-1, K));
    int count = K;
    int pos = 1;
    for (k in 1:K - 1){
      count -= 1;
      mat[k + 1:K, k] = segment(angle, pos, count);
      pos += count;
    }
    return mat;
  }
  
matrix angle2cor (vector angle, int N){
  matrix[N, N] angle_mat = angle_vec2mat(angle, N);
  matrix[N, N] inv_mat = identity_matrix(N);
  inv_mat[, 1] = cos(angle_mat[, 1]);
  
  for (i in 2:N) {
    inv_mat[i, i] = prod(sin(angle_mat[i, 1:(i - 1)]));
    if (i > 2) {
      for (j in 2:i - 1 ) {
        inv_mat[i, j] = cos(angle_mat[i, j]) * prod(sin(angle_mat[i, 1:(j - 1)]));
      }
    }
  }
  return inv_mat;
  }
}
data{
  int d;
  int n;
  vector[d] dat[n];
  real corpriorscale;
}
parameters{
  vector[(d * d - d) / 2] rawcor;
  vector[d] logsdvec;
  vector[d] mu;
}
transformed parameters{
  cholesky_factor_corr[d] mcorchol = angle2cor(rawcor,d);
}
model{
  rawcor ~ normal(0,corpriorscale);
  logsdvec ~ normal(0,10);
  mu ~ normal(0,10);
  dat ~ multi_normal_cholesky(mu, diag_pre_multiply(exp(logsdvec), mcorchol));
}
generated quantities{
  matrix[d,d] mcor = tcrossprod(mcorchol);
}
'


sm <- stan_model(model_code = smt)
expose_stan_functions(sm)

ndim=5
rawcor = rep(1, (ndim * ndim - ndim) / 2)


m=tcrossprod(angle2cor(angle=rawcor,N=ndim))
m

It guarantees PSD, so it will almost surely differ by column. Having data will get it close to the true values, and you shouldn’t get any non-PSD issues after the initial warmup.

When you call it in R it’s just using that “init”. I’m curious whether this works for you with actual data, running iterations of the model?

@spinkney Well, the LKJ approach and the approach I first posted (i.e. years ago now) both guarantee PSD and don’t differ by column. My ‘semi fix’ also only weakly differs by column, and allows easy hierarchy with decent sampling/optimizing behaviour. The example code I posted is not just about the inits: if you imagine, for example, a normal prior on the rawcor parameters, the fact that some columns end up lower than others for the same input implies that the prior on the bivariate correlations differs across the matrix.
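
To make that concrete, here’s a quick prior predictive check in R (an R translation of the angle2cor construction above, plus a simulation setup of my own): drawing rawcor from a standard normal and pushing it through the construction gives implied priors that differ across elements of the correlation matrix.

# R translation of the Stan functions above
angle_vec2mat <- function(angle, K) {
  mat <- matrix(0, K, K)
  count <- K ; pos <- 1
  for (k in 1:(K - 1)) {
    count <- count - 1
    mat[(k + 1):K, k] <- angle[pos:(pos + count - 1)]
    pos <- pos + count
  }
  mat
}

angle2cor <- function(angle, N) {
  angle_mat <- angle_vec2mat(angle, N)
  inv_mat <- diag(N)
  inv_mat[, 1] <- cos(angle_mat[, 1])
  for (i in 2:N) {
    inv_mat[i, i] <- prod(sin(angle_mat[i, 1:(i - 1)]))
    if (i > 2) {
      for (j in 2:(i - 1)) {
        inv_mat[i, j] <- cos(angle_mat[i, j]) * prod(sin(angle_mat[i, 1:(j - 1)]))
      }
    }
  }
  inv_mat
}

# prior predictive: rawcor ~ normal(0, 1), look at the implied correlations
set.seed(1)
ndim <- 5
draws <- replicate(2000, {
  raw <- rnorm((ndim * ndim - ndim) / 2)
  tcrossprod(angle2cor(raw, ndim))
})
round(apply(abs(draws), c(1, 2), mean), 2)  # mean |cor| differs by column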

I’ll have to give some further thought to how this can be incorporated mid-hierarchy while permitting a non-centered parameterization of the variates that the set of correlations applies to. I’d been thinking about that already and thought I had a plan, but you’ve given me pause and now I’m less confident.

What is the integration use case?

Yeah, I’m definitely worried about that too. I suspect the ultimate properly generative “solution” is to shift everything to an SEM framework with the “parameter expansion” trick to encode the correlations, but this independent-bivariates method is so fast I’m having trouble letting it go quite yet :P

Right, right, and now I get the issue more. Yeah, I mean, it guarantees PSD by this specific construction. As you said in your first post today, it may take some magic to input a normal prior along each column so that it represents the same prior weight along each dimension.

@mike-lawrence Re integration: I use subject-specific covariance matrices in a Kalman filtering setup, for one example. I also like to be able to fix certain values to zero or to equal certain others, which I think also makes the parameter expansion approach pretty tricky! Oh, and I doubt you can optimize a parameter-expanded form, and some of us like to do that ;)