I would like to ask how (if it is possible) to implement the following covariance parametrization in Stan.
Let S be a 3x3 covariance matrix such that S = BKKB', where

B = [1,      0,      0;
     b[2,1], 1,      0;
     b[3,1], b[3,2], 1]

K = [k[1], 0,    0;
     0,    k[2], 0;
     0,    0,    k[3]]

with all b, k > 0.
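In Stan pseudocode, what I have in mind is something like the sketch below (priors and names are just placeholders). The key observation is that L = B*K is lower triangular with positive diagonal, and S = BKKB' = LL', so L can be passed straight to multi_normal_cholesky:

```
data {
  int<lower=1> N;
  array[N] vector[3] y;
}
parameters {
  vector[3] mu;
  vector<lower=0>[3] b;   // b[1] = B[2,1], b[2] = B[3,1], b[3] = B[3,2]
  vector<lower=0>[3] k;   // diagonal of K
}
transformed parameters {
  // L = B * K is lower triangular with positive diagonal,
  // and S = B K K B' = L L', so L is the Cholesky factor of S
  matrix[3, 3] L = diag_matrix(k);
  L[2, 1] = b[1] * k[1];
  L[3, 1] = b[2] * k[1];
  L[3, 2] = b[3] * k[2];
}
model {
  b ~ normal(0, 1);       // placeholder half-normal priors
  k ~ normal(0, 1);
  y ~ multi_normal_cholesky(mu, L);
}
```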
Yes. That is actually the preferred way of setting up covariance matrices in Stan. B is the Cholesky factor and K is the vector of standard deviations if you have a multivariate normal with covariance S = BKKB'. The Cholesky factorization comes up in a number of places in the Stan user's guide. Here is the first mention: https://mc-stan.org/docs/2_18/stan-users-guide/multivariate-hierarchical-priors-section.html
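For concreteness, the pattern from that section of the guide looks roughly like this (the LKJ shape parameter and the prior scales below are arbitrary choices, not anything specific to your problem):

```
data {
  int<lower=1> N;
  array[N] vector[3] y;
}
parameters {
  vector[3] mu;
  cholesky_factor_corr[3] L_Omega;  // Cholesky factor of the correlation matrix
  vector<lower=0>[3] sigma;         // standard deviations
}
model {
  L_Omega ~ lkj_corr_cholesky(2);   // shape parameter is arbitrary here
  sigma ~ cauchy(0, 2.5);
  // diag_pre_multiply(sigma, L_Omega) is the Cholesky factor of the covariance
  y ~ multi_normal_cholesky(mu, diag_pre_multiply(sigma, L_Omega));
}
```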
I understand that typically covariance matrices are parametrised as DL(DL)', where D is a diagonal matrix of standard deviations and L is the Cholesky factor of a correlation matrix.
I think it would be clearer if I separate my original question into two parts:

1. Is it possible to directly parametrise a covariance matrix in LDLT form, where Sigma = LDL', L is a unit lower triangular matrix and D is a diagonal matrix?
2. If 1 is possible, is there a way to restrict all lower triangular entries of L to be positive (i.e. the same definition as B in my original question)? If 1 is not possible, is there a way to sample a correlation matrix with only positive pairwise correlation coefficients?
I should have paid attention to the details! I think you will need some math to help you out with this.
By parametrise do you mean filling out the elements of B and K and maybe making them dependent on observed data or other parameters? For the standard deviations that is relatively easy: you can do something like k[1] = exp(f(data, parameters)).
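For example (the covariate x and the names alpha and beta are purely illustrative, f here is just a linear predictor on the log scale):

```
data {
  real x;                 // hypothetical covariate
}
parameters {
  real alpha;
  real beta;
}
transformed parameters {
  // k1 > 0 by construction for any real-valued f(data, parameters)
  real<lower=0> k1 = exp(alpha + beta * x);
}
model {
  alpha ~ normal(0, 1);   // placeholder priors
  beta ~ normal(0, 1);
}
```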
For B, it’s harder. The biggest problem is that the eventual covariance matrix needs to be positive definite, which leads to non-linear restrictions on b[2,1], b[3,1], b[3,2]. It might be possible to write out all the restrictions needed to ensure positive definiteness and positive correlations, but it’s going to be unwieldy for large correlation matrices.
I would try to reformulate your problem in terms of linear regressions if that is possible.
For instance, does it make sense to write the following?

x1 = e1
x2 = a21 * x1 + e2
x3 = a31 * x1 + a32 * x2 + e3,    with independent errors e_i ~ N(0, s_i^2)

With a21, a31, a32 > 0 this is equivalent to your original description. The b’s in your formulation will have the same sign as the a’s in my formulation.
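In Stan, that regression formulation could look something like this (variable names are mine; the s are the residual standard deviations, which play the role of your k’s, and the positive a’s map to positive b’s via b[2,1] = a21, b[3,2] = a32, b[3,1] = a31 + a32 * a21):

```
data {
  int<lower=1> N;
  vector[N] x1;
  vector[N] x2;
  vector[N] x3;
}
parameters {
  real<lower=0> a21;
  real<lower=0> a31;
  real<lower=0> a32;
  vector<lower=0>[3] s;   // residual standard deviations
}
model {
  a21 ~ normal(0, 1);     // placeholder half-normal priors
  a31 ~ normal(0, 1);
  a32 ~ normal(0, 1);
  s ~ normal(0, 1);
  // triangular system of regressions; jointly this implies
  // (x1, x2, x3) ~ multivariate normal with covariance B K K B'
  x1 ~ normal(0, s[1]);
  x2 ~ normal(a21 * x1, s[2]);
  x3 ~ normal(a31 * x1 + a32 * x2, s[3]);
}
```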
Or you could make use of the fact that if R is a correlation matrix, then the off-diagonal elements of R^{-1} are the partial correlations scaled by a positive factor and with the sign flipped (linear regression coefficients are also scaled partial correlations).
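Written out, the identity being used here is the standard result for Gaussian precision matrices: with $\Omega = R^{-1}$,

$$
\rho_{ij \mid \text{rest}} = -\frac{\Omega_{ij}}{\sqrt{\Omega_{ii}\,\Omega_{jj}}},
$$

so the off-diagonal elements of $R^{-1}$ carry the partial correlations up to a positive scaling and a sign flip. Constraining all partial correlations to one sign is therefore a constraint on the signs of the off-diagonal elements of the precision matrix, not of R itself.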