I am mapping a set of unconstrained values to a correlation matrix and working off 10.9 Correlation matrices | Stan Reference Manual in the manual (for which I am very grateful, as the LKJ papers are not at all easy to understand). However:
- The second formula for w_{i, j} doesn’t seem correct, as it doesn’t return a Cholesky factor whose crossproduct is a correlation matrix.
- I wasn’t sure if I had made a programming error, so I checked it against the first formula which does result in a correlation matrix.
- In my attempt to figure out why the first formula wasn’t working for me, I also wrote down the actual transform that the Stan math library implements: math/cholesky_corr_constrain.hpp at 1d0b4a08b4a800a3e8d2c1a3559fbe9390b7584e · stan-dev/math · GitHub, which also results in a correlation matrix, but seems to be using a slightly different method from the one in the docs. I can see that the Stan math one is written to avoid an extra loop. But because the LKJ transformation is a bijection I think the fact it returns a different correlation matrix implies it is using a different method?.
Here is some code to demonstrate;
library(magrittr)
library(matrixcalc)
make_upper_diag_matrix <- function(x_compact, k) {
z_mat <- matrix(data = 0, nrow = k, ncol = k)
vec_counter <- 1
for (ii in 1 : (k - 1)) {
for (jj in (ii + 1) : k) {
z_mat[ii, jj] <- x_compact[vec_counter]
vec_counter <- vec_counter + 1
}
}
return(z_mat)
}
# cholesky __upper__ triangular
# using the first formula for w_{i, j} from the manual
make_chol_from_cpc_first_formula <- function(z_cpc) {
k <- ncol(z_cpc)
w <- matrix(data = 0, nrow = k, ncol = k)
w[1, 1] <- 1
w[1, 2 : k] <- z_cpc[1, 2 : k]
for (ii in 2 : k) {
w[ii, ii] <- prod(sqrt(1 - z_cpc[1 : (ii - 1), ii]^2))
if (ii == k) { next }
for (jj in (ii + 1) : k) {
w[ii, jj] <- z_cpc[ii, jj] * prod(sqrt(1 - z_cpc[1 : (ii - 1), jj]^2))
}
}
return(crossprod(w))
}
# cholesky still __upper__ triangular
# using the second formula for w_{i, j} from the manual
make_chol_from_cpc_second_formula <- function(z_cpc) {
k <- ncol(z_cpc)
w <- matrix(data = 0, nrow = k, ncol = k)
w[1, 1] <- 1
w[1, 2 : k] <- z_cpc[1, 2 : k]
for (ii in 2 : k) {
for (jj in ii : k) {
term_one <- z_cpc[ii, jj] * w[ii - 1, jj]
term_two <- sqrt(1 - (z_cpc[ii - 1, jj])^2)
w[ii, jj] <- term_one * term_two
}
}
return(crossprod(w))
}
# this is the math/rev/fun/cholesky_corr_constrain.hpp way of doing it
# note that the cholesky factor is now __lower__ triangular !!
make_corr_from_compact_math_lib <- function(x_compact, k) {
w <- matrix(data = 0, nrow = k, ncol = k)
counter <- 1
w[1, 1] <- 1
for (ii in 2 : k) {
w[ii, 1] <- x_compact[counter]
counter <- counter + 1
temp_sum_square <- w[ii, 1]^2
for (jj in 2 : ii) {
if (jj == ii) { next }
w[ii, jj] = x_compact[counter] * sqrt(1 - temp_sum_square)
counter <- counter + 1
temp_sum_square <- temp_sum_square + w[ii, jj]^2
}
w[ii, ii] <- sqrt(1 - temp_sum_square)
}
return(tcrossprod(w))
}
k <- 5 # dim of correlation matrix (k \times k)
n_unconstrained_pars <- choose(k, 2)
x_unconstrained <- rnorm(n = n_unconstrained_pars, mean = 0, sd = 0.5)
# first formulae
test_corr_first_formula <- x_unconstrained %>%
tanh() %>%
make_upper_diag_matrix(k = k) %>%
make_chol_from_cpc_first_formula()
# second formuale
test_corr_second_formula <- x_unconstrained %>%
tanh() %>%
make_upper_diag_matrix(k = k) %>%
make_chol_from_cpc_second_formula()
test_corr_math_lib <- x_unconstrained %>%
tanh() %>%
make_corr_from_compact_math_lib(k = k)
matrixcalc::is.positive.definite(test_corr_first_formula)
#> [1] TRUE
matrixcalc::is.positive.definite(test_corr_second_formula)
#> [1] FALSE
matrixcalc::is.positive.definite(test_corr_math_lib)
#> [1] TRUE
# first formula in docs -- is a correlation matrix
test_corr_first_formula
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 1.0000000 -0.20286146 -0.57156605 0.12911546 -0.3833705
#> [2,] -0.2028615 1.00000000 0.03740415 0.09473022 -0.3934850
#> [3,] -0.5715660 0.03740415 1.00000000 -0.01102097 0.4231851
#> [4,] 0.1291155 0.09473022 -0.01102097 1.00000000 0.1251608
#> [5,] -0.3833705 -0.39348503 0.42318513 0.12516076 1.0000000
# second formula in docs -- is NOT a correlation matrix
test_corr_second_formula
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 1.0000000 -0.20286146 -0.57156605 0.12911546 -0.38337051
#> [2,] -0.2028615 0.04115277 0.11594872 -0.02619255 0.07777110
#> [3,] -0.5715660 0.11594872 0.32878966 -0.07306701 0.22758037
#> [4,] 0.1291155 -0.02619255 -0.07306701 0.01692708 -0.04649934
#> [5,] -0.3833705 0.07777110 0.22758037 -0.04649934 0.18278035
# math lib implementation -- is a correlation matrix, but not the same as first
# formula
test_corr_math_lib
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 1.0000000 -0.20286146 -0.5715660 -0.38337051 -0.5210758
#> [2,] -0.2028615 1.00000000 0.2196923 -0.01063644 0.1813084
#> [3,] -0.5715660 0.21969230 1.0000000 0.30269660 0.4837809
#> [4,] -0.3833705 -0.01063644 0.3026966 1.00000000 0.4346408
#> [5,] -0.5210758 0.18130844 0.4837809 0.43464084 1.0000000
identical(as.numeric(test_corr_first_formula), as.numeric(test_corr_math_lib))
#> [1] FALSE
Created on 2021-10-01 by the reprex package (v2.0.1)
My questions are:
- Can anyone figure out why the two formula for w_{i, j} in the docs disagree? I cannot see why/where the error creeps in.
- What transform is the math library actually implementing? The D-vine based idea of Joe in Generating random correlation matrices based on partial correlations - ScienceDirect (the predecessor of the LKJ transform)?