# Multi-group Confirmatory Factor Analysis: how to handle the Cholesky factor correlations?

I started with some code from Mike Laurence and Michael de Witt that I modified to include an endogenous latent variable. I want to now make this multi-group by time so I can make the parameter factor loadings evolve over time. I am unsure how I should best incorporate multiple groups in this model. After that I think I should be able to add a dynamic component with a state-space model.

I got to the cholesky_factor_corr parameter declaration and got stumped at how to proceed vectorizing it for time/group. Any suggestions on how to perform this analysis by groups would be highly appreciated.

//confirmatory factor analysis code
data{
// n_subj: number of subjects
int n_subj ;
// n_y: number of outcomes
int n_y ;
// y: matrix of outcomes
matrix[n_subj,n_y] y ;
// n_fac: number of latent factors level one
int n_fac_one ;
// y_fac: list of which factor is associated with each outcome
int<lower=1,upper=n_fac_one> y_fac_one[n_y] ;
// n_fac: number of latent factors level two
int n_fac_two ;
// y_fac: list of which factor is associated with each outcome
int<lower=1,upper=n_fac_two> y_fac_two[n_fac_one] ;
}
transformed data{
// scaled_y: z-scaled outcomes
matrix[n_subj,n_y] scaled_y ;
for(i in 1:n_y){
scaled_y[,i] = (y[,i]-mean(y[,i])) / sd(y[,i]) ;
}

}
parameters{
// normal01: a helper variable for more efficient non-centered multivariate
//  sampling of subj_facs
matrix[n_fac_one, n_subj] normal01;
matrix[n_fac_two, n_subj] normal02;
// fac_cor_helper: correlations (on cholesky factor scale) amongst latent
//  factors
cholesky_factor_corr[n_fac_one] fac_cor_helper_one ;
cholesky_factor_corr[n_fac_two] fac_cor_helper_two ;
// scaled_y_means: means for each outcome
vector[n_y] scaled_y_means ;
// scaled_y_noise: measurement noise for each outcome
vector<lower=0>[n_y] scaled_y_noise_one ;
vector<lower=0>[n_y] scaled_y_noise_two ;
// betas: how each factor loads on to each outcome.
//  Must be postive for identifiability.
vector<lower=0>[n_y] betas_one ;
vector<lower=0>[n_y] betas_two ;
}
transformed parameters{
// subj_facs: matrix of values for each factor associated with each subject
matrix[n_subj,n_fac_one] subj_facs_one ;
matrix[n_subj,n_fac_two] subj_facs_two ;
// the following calculation implies that rows of subj_facs are sampled from
//  a multivariate normal distribution with mean of 0 and sd of 1 in all
//  dimensions and a correlation matrix of fac_cor.
subj_facs_one = transpose(fac_cor_helper_one * normal01) ;
subj_facs_two = transpose(fac_cor_helper_two * normal02) ;
}
model{
// normal01 must have normal(0,1) prior for "non-centered" parameterization
//  on the multivariate distribution of latent factors
to_vector(normal01) ~ normal(0,1) ;
to_vector(normal02) ~ normal(0,1) ;
// flat prior on correlations
fac_cor_helper_one ~ lkj_corr_cholesky(1) ;
fac_cor_helper_two ~ lkj_corr_cholesky(1) ;
// normal prior on y means
scaled_y_means ~ normal(0,1) ;
// weibull prior on measurement noise
scaled_y_noise_one ~ weibull(2,1) ;
scaled_y_noise_two ~ weibull(2,1) ;
// normal(0,1) priors on betas
betas_one ~ normal(0,1) ;
betas_two ~ normal(0,1) ;
// each outcome as normal, influenced by its associated latent factor
for(i in 1:n_y){
scaled_y[,i] ~ normal(
scaled_y_means[i] + subj_facs_one[,y_fac_one[i]] * betas_one[i]
, scaled_y_noise_one[i]
) ;
}

// each outcome as normal, influenced by its associated latent factor level two
for(i in 1:n_fac_one){
subj_facs_one[,i] ~ normal(
subj_facs_two[,y_fac_two[i]] * betas_two[i]
, scaled_y_noise_two[i]
) ;
}
}
generated quantities{
//cors: correlations amongst within-subject predictors' effects
corr_matrix[n_fac_one] fac_cor_one ;
corr_matrix[n_fac_two] fac_cor_two ;
// y_means: outcome means on the original scale of Y
vector[n_y] y_means ;
// y_noise: outcome noise on the original scale of Y
vector[n_y] y_noise ;
fac_cor_one = multiply_lower_tri_self_transpose(fac_cor_helper_one) ;
fac_cor_two = multiply_lower_tri_self_transpose(fac_cor_helper_two) ;
for(i in 1:n_y){
y_means[i] = scaled_y_means[i]*sd(y[,i]) + mean(y[,i]) ;
y_noise[i] = scaled_y_noise_one[i]*sd(y[,i]) ;
}
}


model{
vector[N] mu = alpha+beta*x;
y~normal(mu,sigma);
}

To include mathematical notation in your post put LaTeX syntax between two \$ symbols, e.g.,
p(\theta | y) \propto p(\theta) p(y | \theta).

Hi mansillo, if this is a repeated measures factor analysis where you want to allow for correlations between the latent factors over time, as well as residual correlations between the same item measured over time, Iâ€™ve got a stan model that Iâ€™ve used for that. Youâ€™ll need to adapt it for your use-case, but hopefully it shows a way to do what youâ€™re after.

Itâ€™s for a 1-factor model measured T times, like the diagram below (except with latent correlations, rather than regressions):

Code:

data{
int N;              //Number of individuals
int J;              //Number of items
int T;              //Number of timepoints
vector[J*T] y[N];
}

transformed data{
/* Cache some calculations and structures so they don't get
re-evaluated at each iteration */
int J_T = J*T;
int J_Tm1 = J*(T-1);
matrix[J_T,J_T] corr_diag = diag_matrix(rep_vector(1,J_T));
vector[J_Tm1] zero_load = rep_vector(0, J_Tm1);
}

parameters{
/* 'Raw' Variables for Non-Centered Parameterisation */
matrix[J,T] intercepts_r;
matrix[N,T] eta_r;

vector[J] intercepts_m;
vector<lower=0>[J] intercepts_sd;

/* Correlations between factors over time,
item residual variances, and residual correlations */
cholesky_factor_corr[T] e_Lcorr;
vector<lower=0>[J_T] resvar;
cholesky_factor_corr[2] y_Lcorr[J_Tm1];
}

transformed parameters{
matrix[T,J] intercepts;
cholesky_factor_corr[J_T] y_corr = corr_diag;

/* Transform 'raw' loadings to required distribution */
for(j in 1:J)
for(j in 1:J)
intercepts[,j] = (intercepts_m[j] + intercepts_r[j] * intercepts_sd[j])';

/* Pack residual correlations into single matrix */
for(c in 1:J_Tm1) {
y_corr[c+J,c] = y_Lcorr[c][2,1];
y_corr[c+J,c+J] = y_Lcorr[c][2,2];
}
}

model{
/* These are temporary variables, used for holding
functions of other parameters. More efficient on
memory to include here as their posteriors aren't saved */
matrix[J_T, T] lam;
matrix[N,T] eta_t;
matrix[N,J_T] lam_eta;
row_vector[J_T] eta_vec[N];
row_vector[J_T] inter;

/* Priors for factor and residual correlations */
e_Lcorr ~ lkj_corr_cholesky(1);
for(c in 1:J_Tm1)
y_Lcorr[c] ~ lkj_corr_cholesky(1);

/* Priors for intercept and loading hyperparameters */
intercepts_m ~ normal(0,5);
intercepts_sd ~ cauchy(0,5);

/* Priors for 'raw' parameters for non-centered parameterisation */
to_vector(intercepts_r) ~ std_normal();
to_vector(eta_r) ~ std_normal();

for(t in 2:(T-1))
lam[,t] = append_row(append_row(rep_vector(0, J*(t-1)), load[t]'),
rep_vector(0, J*(T-t)));

/* Priors for residual variances */
resvar ~ cauchy(0,5);

/* Transform 'raw' factor scores with multivariate
non-centered parameterisation */
for(n in 1:N)
eta_t[n] = (e_Lcorr*eta_r[n]')';

/* Create the E[Y] for each individual */
lam_eta = eta_t*lam';
inter = to_row_vector(intercepts);
for(n in 1:N)
eta_vec[n] = lam_eta[n] + inter;

/* Link to observed data, with a multivariate normal distribution
for each pair of correlated observations */
y ~ multi_normal_cholesky(eta_vec, diag_pre_multiply(resvar,y_corr));
}

generated quantities {
/* Transform factor correlations from Cholesky to Correlation matrix */
corr_matrix[T] corr = multiply_lower_tri_self_transpose(e_Lcorr);
vector[J_Tm1] ycorrs;

/* Put variance parameters into variance scale (from SD) */
vector[J_T] vars = square(resvar);

/* More memory efficient to store only auto-correlations in vector
- since the non-autocorrelations are zero */
for(c in 1:J_Tm1)
ycorrs[c] = multiply_lower_tri_self_transpose(y_Lcorr[c])[2,1];
}


And some toy data here: data100.R (25.9 KB)

Hope that all helps!

2 Likes

Thanks @andrjohns you certainly helped me out but this presents me with another problem which is a general confirmatory analysis question for how to parametrise second and higher-order factors. I think this revision to the code I initially posted handles multiple groups and also links each of the groups with an index for time:

//confirmatory factor analysis code
data{
// n_t_obs: number of time points observed
int<lower = 0> n_t_obs ;
// n_t_miss: number of time points missing
int<lower = 0> n_t_miss ;
// T: number of time points
int<lower = 1, upper = n_t_obs + n_t_miss> T ;
// ii_t_obs: index of time points observed
int<lower = 1> ii_t_obs[n_t_obs] ;
// ii_t_miss: index of time points missing
int<lower = 1> ii_t_miss[n_t_miss] ;
// n_subj: number of subjects
int n_subj ;
// n_y: number of outcomes
int n_y ;
// y: matrix of outcomes
matrix[n_subj,n_y] y ;
// y_t: index of time points for observed outcomes
int y_t[n_y] ;
// n_fac: number of latent factors level one
int n_fac_one ;
// y_fac: list of which factor is associated with each outcome
int<lower=1,upper=n_fac_one> y_fac_one[n_y] ;
// n_fac: number of latent factors level two
int n_fac_two ;
// y_fac: list of which factor is associated with each outcome
int<lower=1,upper=n_fac_two> y_fac_two[n_fac_one] ;
}
transformed data{
// scaled_y: z-scaled outcomes
matrix[n_subj,n_y] scaled_y ;
for(i in 1:n_y){
scaled_y[,i] = (y[,i]-mean(y[,i])) / sd(y[,i]) ;
}
}
parameters{
// normal01: a helper variable for more efficient non-centered multivariate
// sampling of subj_facs
matrix[n_fac_one, n_subj] normal01[n_t_obs];
matrix[n_fac_two, n_subj] normal02[n_t_obs];
// fac_cor_helper: correlations (on cholesky factor scale) amongst latent
// factors
cholesky_factor_corr[n_fac_one] fac_cor_helper_one[n_t_obs] ;
cholesky_factor_corr[n_fac_two] fac_cor_helper_two[n_t_obs] ;
// scaled_y_means: means for each outcome
matrix[n_y,n_t_obs] scaled_y_means ;
// scaled_y_noise: measurement noise for each outcome
matrix<lower=0>[n_y,n_t_obs] scaled_y_noise_one ;
matrix<lower=0>[n_y,n_t_obs] scaled_y_noise_two ;
// betas: how each factor loads on to each outcome.
// Must be postive for identifiability.
matrix<lower=0>[n_y,n_t_obs] betas_one ;
matrix<lower=0>[n_y,n_t_obs] betas_two ;
}
transformed parameters{
// subj_facs: matrix of values for each factor associated with each subject
matrix[n_subj,n_fac_one] subj_facs_one[n_t_obs] ;
matrix[n_subj,n_fac_two] subj_facs_two[n_t_obs] ;
matrix[n_y,T] beta_one ;
matrix[n_y,T] beta_two ;
// the following calculation implies that rows of subj_facs are sampled from
// a multivariate normal distribution with mean of 0 and sd of 1 in all
// dimensions and a correlation matrix of fac_cor.
for(j in 1:n_t_obs){
subj_facs_one[y_t[j]] = transpose(fac_cor_helper_one[j] * normal01[y_t[j]]) ;
subj_facs_two[y_t[j]] = transpose(fac_cor_helper_two[j] * normal02[y_t[j]]) ;
}
for(i in 1:n_t_obs){
beta_one[,ii_t_obs[i]] = betas_one[,i];
beta_two[,ii_t_obs[i]] = betas_two[,i];
}
}
model{
// normal01 must have normal(0,1) prior for â€śnon-centeredâ€ť parameterization
// on the multivariate distribution of latent factors
to_vector(normal01[n_t_obs]) ~ normal(0,1) ;
to_vector(normal02[n_t_obs]) ~ normal(0,1) ;
for(j in 1:n_t_obs){
// flat prior on correlations
fac_cor_helper_one[j] ~ lkj_corr_cholesky(1.0) ;
fac_cor_helper_two[j] ~ lkj_corr_cholesky(1.0) ;
}
for(i in 1:n_t_obs){
// normal prior on y means
scaled_y_means[,i] ~ normal(0,1) ;
}
for(i in 1:n_t_obs){
// weibull prior on measurement noise
scaled_y_noise_one[,i] ~ weibull(2,1) ;
scaled_y_noise_two[,i] ~ weibull(2,1) ;
}
// normal(0,1) priors on betas initialisation
betas_one[1] ~ normal(0,1) ;
betas_two[1] ~ normal(0,1) ;
// normal(0,1) priors on betas with a random-walk
for(t in 2:T){
betas_one[t] ~ normal(beta_one[t-1],1) ;
betas_two[t] ~ normal(beta_two[t-1],1) ;
}
// each outcome as normal, influenced by its associated latent factor
for(j in 1:n_t_obs){
for(i in 1:n_y){
scaled_y[y_t[j],i] ~ normal(
scaled_y_means[i,j] + subj_facs_one[y_t[j],y_fac_one[i]] * betas_one[i,j]
, scaled_y_noise_one[i,j]
) ;
}
}
// each outcome as normal, influenced by its associated latent factor level two
for(j in 1:n_t_obs){
for(i in 1:n_fac_one){
subj_facs_one[y_t[j],i] ~ normal(
subj_facs_two[y_t[j],y_fac_two[i]] * betas_two[i,j]
, scaled_y_noise_two[i,j]
) ;
}
}
}
generated quantities{
//cors: correlations amongst within-subject predictorsâ€™ effects
corr_matrix[n_fac_one] fac_cor_one[n_t_obs] ;
corr_matrix[n_fac_two] fac_cor_two[n_t_obs] ;
// y_means: outcome means on the original scale of Y
matrix[n_y,n_t_obs] y_means ;
// y_noise: outcome noise on the original scale of Y
matrix[n_y,n_t_obs] y_noise ;
fac_cor_one[n_t_obs] = multiply_lower_tri_self_transpose(fac_cor_helper_one[n_t_obs]) ;
fac_cor_two[n_t_obs] = multiply_lower_tri_self_transpose(fac_cor_helper_two[n_t_obs]) ;
for(j in 1:n_t_obs){
for(i in 1:n_y){
y_means[i,j] = scaled_y_means[i,j]*sd(y[,i]) + mean(y[,i]) ;
y_noise[i,j] = scaled_y_noise_one[i,j]*sd(y[,i]) ;
}
}
}

The confirmatory factor analysis (CFA) model with a single order is a natural extension of the exploratory factor analysis (EFA) model; defined by
x = \Lambda \xi + \epsilon,
where x, \Lambda, \xi and \epsilon are the same as the EFA model. Common factors are allowed to be correlation so \xi is normally distributed as N [0,\Phi] with \Phi a positive definite covariance matrix which assumed that \xi is independent with \epsilon, the covariance matrix of x, giving
\Sigma = \Lambda \Phi \Lambda^{T}+\Psi_{\epsilon}.

So as the CFA model is

x = \Lambda\xi+\epsilon for the first order this should be fine with indexes for columns and time points of observations:

// each outcome as normal, influenced by its associated latent factor
for(j in 1:n_t_obs){
for(i in 1:n_y){
scaled_y[y_t[j],i] ~ normal(
scaled_y_means[i,j] + subj_facs_one[y_t[j],y_fac_one[i]] * betas_one[i,j]
, scaled_y_noise_one[i,j]
) ;
}
}

However, I do not know how to make the second-order of the model:

x = B(\Lambda\xi+\epsilon)+\delta =B\Lambda\xi+B\epsilon+\delta.