Hello Stan Community,
I model the composition of n groups as a sum-to-zero vector X (after centered-log-ratio transformation) such that it follows the multivariate normal distribution with mean 0 and a singular variance-covariance matrix parameter Sigma. I am interested in this singular variance-covariance matrix and would like to estimate it from data.
The straightforward model below does not work because Sigma should be singular and positive-semidefinite, while Stan’s cholesky_factor_corr will produce full-rank and positive-definite correlation matrices and variance-covariance matrices. I got pathological divergences when I attempted to run the above model on my example data.
data{
int n;
sum_to_zero_vector[n] X;
}
parameters{
vector[n] log_sigma;
cholesky_factor_corr[n] Sigma_cholesky;
}
model{
X ~ multinormal_cholesky(rep_vector(0.0,n),diag_pre_multiply(exp(log_sigma),Sigma_cholesky));
// Priors
log_sigma ~ std_normal();
Sigma_cholesky ~ LKJ(2);
}
An intuitive solution to this problem is to omit the Gth group and specify the G-1 x G-1 sub-matrix of Sigma as as n-1 vector and n-1 Cholesky factor:
parameters{
vector[n-1] log_sigma;
cholesky_factor_corr[n-1] Sigma_cholesky;
}
model{
X[1:(n-1)] ~ multinormal_cholesky(rep_vector(0.0,n-1),diag_pre_multiply(exp(log_sigma),Sigma_cholesky));
// Priors
log_sigma ~ std_normal();
Sigma_cholesky ~ LKJ(2);
}
I got rid of the pathological divergences when I tried to run this improved model on the exmaple data.
However, when I tried to calculate the Gth component of the variances from the estimated log_sigma and Sigma_cholesky, I observed inflated variation in the posterior that is similar to the pathology observed in a trivial trivial implementation of a sum-to-zero vector (i.e., collects uncertainty from previous G-1 components). In addition, when I changed the order of groups in the data (so a different group will be the ‘Gth’ group), I got very different posterior distributions for a group’s variance depending on whether that group is the ‘Gth’ group.
For a sum-to-zero vector, its sampling behaviour can be solved with the sum_to_zero built-in variable, with the Gram-Schmidt process. Alternatively, an explicit MVN can be used as discussed here.
Because sum_to_zero variable behaves well under compositional constraints, I believe there should be a way to tailor the behavior of variance and correlation matrix under the compositional constraints.
So I have two questions:
Question 1: Is there any known way to implement a covariance matrix that is well-behaved under compositional constraints?
Question 2: Is my attempted solution a viable path?
Alternative to sampling variances and correlation matrix separately, I can sample a full-rank (n-1 in our case) variance-covariance matrix from a Wishart distribution in Stan, (though not as preferred and efficient as LKJ). It seems that when I set the parameter S for Wishart distribution as a n-1 correlation matrix with all its off-diagonal elements are -1/(n-1), the prior likelihood remains the same regardless of my choice of the ‘Gth’ group. I have yet to implement this model and try it on my example data and check its performances.
Any help will be much appreciated!


