I have a fairly standard hierarchical model: multiple observations made in each of multiple conditions (specified by a set of experimental manipulations) within each of multiple human participants. I explicitly model the correlation matrix describing variability in how the manipulations affect each participant, but I have a lot of manipulations, so, as explained to me in this post, the posterior distributions for these correlations all end up roughly centered very near zero. I have pretty confident expectations that some of the correlations are non-zero with decent magnitude (up to .5-ish), so I thought I’d check the sample-by-sample correlations amongst the actual values sampled for each participant. That is, for each sample from the posterior, the hierarchical model provides an estimate for each participant of the effect of a given manipulation, so for a given pair of manipulations I can compute the correlation across participants for that sample, then repeat for all samples, yielding a posterior distribution on the correlation. When I do this, I obtain posterior distributions on the correlations that very much match my expectations. The question is: is this a valid approach? Or, possibly more precisely, what inference (if any) can be drawn from these computed-from-the-posterior correlations, given that they seem to diverge from the explicit correlations with which the model was parameterized?
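
To be concrete about the computation I mean, here is a minimal numpy sketch. The array shape and the fake draws are stand-ins for real posterior draws of `Sdevs` (which in practice would come from the fitted model, e.g. via the interface's extract function):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions standing in for the real posterior:
# n_draws posterior samples of an nS-subjects x nW-predictors matrix.
n_draws, nS, nW = 1000, 40, 3

# Fake draws for illustration only; in practice these would be the
# posterior draws of the Sdevs transformed parameter.
Sdevs_draws = rng.normal(size=(n_draws, nS, nW))

def draw_correlations(draws, j, k):
    """For each posterior draw, correlate the per-subject values of
    predictor j with those of predictor k across subjects."""
    return np.array([np.corrcoef(d[:, j], d[:, k])[0, 1] for d in draws])

# One correlation per draw: a posterior distribution on that
# pairwise correlation, which can then be summarized as usual.
cors_01 = draw_correlations(Sdevs_draws, 0, 1)
```

Each pair of predictor columns yields one such distribution, so the full set of pairwise distributions can be compared directly against the posterior on the explicitly modeled correlation matrix.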

In case it helps, my model is below. I’m referring to the difference between the posterior on the `cor` parameter and the correlations I compute in the posterior amongst the columns of the `Sdevs` parameter.

```
data {
  // nY: num trials
  int<lower=1> nY ;
  // Y: outcomes (should be scaled to mean=0,sd=1 for priors to make sense!)
  vector[nY] Y ;
  // nS: num subjects
  int<lower=1> nS ;
  // S: trial-by-trial subject labels for observations
  int<lower=1,upper=nS> S[nY] ;
  // rowsW: num rows in W
  int<lower=1> rowsW ;
  // nW: num within predictors
  int<lower=1> nW ;
  // W: within predictor matrix (unique entries only)
  matrix[rowsW,nW] W ;
  // indexWSubj: index of each observation in the flattened condition-by-subject matrix
  int indexWSubj[nY] ;
  // nB: num group predictors
  int<lower=1> nB ;
  // B: between predictors
  matrix[nS,nB] B ;
}
parameters {
  // multi_normal_helper: a helper variable for the multivariate-normal trick
  matrix[nW,nS] multi_normal_helper ;
  // sds: population-level sds (on the z-scale) for each within-subject predictor
  vector<lower=0>[nW] sds ;
  // cor_helper: population-level correlations (as a Cholesky factor) amongst within-subject predictors
  cholesky_factor_corr[nW] cor_helper ;
  // coef: coefficients for between- and within-subject predictors
  matrix[nB,nW] coef ;
  // noise: measurement noise (must be positive)
  real<lower=0> noise ;
}
transformed parameters {
  // Sdevs: subject-by-subject deviations
  matrix[nS,nW] Sdevs ;
  // compute
  Sdevs = transpose(diag_pre_multiply(sds,cor_helper) * multi_normal_helper) ;
}
model {
  // multi_normal_helper must have a normal(0,1) prior for the multivariate trick
  to_vector(multi_normal_helper) ~ normal(0,1) ;
  // flat prior on correlations
  cor_helper ~ lkj_corr_cholesky(1) ;
  // normal(0,1) priors on all sds
  sds ~ normal(0,1) ;
  // normal(0,1) priors on all coefs
  to_vector(coef) ~ normal(0,1) ;
  // weibull(2,1) prior on noise
  noise ~ weibull(2,1) ;
  {
    // Svals: subject coefficient values
    matrix[nS,nW] Svals ;
    // mu: condition-by-subject values
    matrix[rowsW,nS] mu ;
    Svals = B * coef + Sdevs ;
    for(i in 1:nS){
      for(j in 1:rowsW){
        mu[j,i] = dot_product(Svals[i],W[j]) ;
      }
    }
    Y ~ normal( to_vector(mu)[indexWSubj] , noise ) ;
  }
}
generated quantities {
  // cor: population-level correlations amongst within-subject predictors
  corr_matrix[nW] cor ;
  cor = multiply_lower_tri_self_transpose(cor_helper) ;
}
```