Faster CFA without data augmentation

simonbrauer · October 28, 2022, 12:26pm

You might consider the SEM approach of modelling the covariance matrix rather than the individual observations. Given sample size $n$, sample covariance matrix $S$, and model-implied covariance matrix $\Sigma$, the log-likelihood is, up to an additive constant, $\log L = -\frac{n}{2} \left[ \log \lvert\Sigma\rvert + \operatorname{tr}(S \Sigma^{-1}) \right]$. See Hoyle 2012 “Handbook of Structural Equation Modeling”, page 21.
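As a rough sketch of how that expression could be coded directly, the block below adds it to `target`. It assumes the sample covariance matrix `S` is computed beforehand and passed in as data, and it uses an unstructured `Sigma` as a stand-in: in an actual CFA, `Sigma` would be built from loadings, factor covariances, and residual variances, with priors on those parameters.

```stan
data {
  int<lower=1> N;        // sample size (n in the formula)
  int<lower=1> P;        // number of indicators
  cov_matrix[P] S;       // sample covariance matrix (assumed precomputed)
}
parameters {
  cov_matrix[P] Sigma;   // stand-in for the model-implied covariance matrix
}
model {
  // log L = -n/2 * [ log|Sigma| + tr(S * Sigma^{-1}) ], up to a constant.
  // mdivide_left_spd(Sigma, S) computes Sigma^{-1} * S without forming the
  // inverse; its trace equals tr(S * Sigma^{-1}).
  target += -0.5 * N * (log_determinant(Sigma)
                        + trace(mdivide_left_spd(Sigma, S)));
}
```

The cost of each evaluation depends on P rather than N, which is where any speed-up over observation-level likelihoods would come from.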

Alternatively, you could model the scatter matrix (Y_scatter below) with the Wishart distribution, from which the expression above is derived.

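data{
    int<lower=1> N; // Number of observations
    int<lower=1> P; // Number of indicators
    matrix[N,P] Y;  // Raw data
}
transformed data{
    matrix[N,P] Y_center;   // Column-centered data
    matrix[P,P] Y_scatter;  // Scatter matrix (sums of squares and cross-products)

    for(p in 1:P){
        Y_center[,p] = Y[,p] - mean(Y[,p]);
    }

    Y_scatter = Y_center' * Y_center;
}
parameters{
    cov_matrix[P] Sigma;  // Model-implied covariance matrix
}
model{
    // Centering at the sample means uses up one degree of freedom,
    // so the scatter matrix has N - 1 degrees of freedom rather than N.
    Y_scatter ~ wishart(N - 1, Sigma);
}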
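For anyone wanting to see the link between the two approaches, here is a short sketch of the algebra. The symbols $Q$ (the scatter matrix), $\nu$ (the Wishart degrees of freedom), and $P$ (the number of indicators) are notation introduced here for illustration, with $S = Q / \nu$.

```latex
\begin{align*}
\log p(Q \mid \Sigma)
  &= \text{const}
     + \frac{\nu - P - 1}{2} \log \lvert Q \rvert
     - \frac{\nu}{2} \log \lvert \Sigma \rvert
     - \frac{1}{2} \operatorname{tr}(\Sigma^{-1} Q) \\
  &= \text{const}'
     - \frac{\nu}{2} \left[ \log \lvert \Sigma \rvert
     + \operatorname{tr}(S \Sigma^{-1}) \right]
\end{align*}
```

The $\log \lvert Q \rvert$ term does not involve $\Sigma$, so it is absorbed into the constant, and $\operatorname{tr}(\Sigma^{-1} Q) = \nu \operatorname{tr}(S \Sigma^{-1})$ by the cyclic property of the trace. What remains has the same form as the covariance-based log-likelihood, with $\nu$ in place of $n$.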

Neither approach models the mean structure, but I suspect either one would speed up estimation of the covariance structure.
