Incorrect formula for correlation matrix transform

hhau · October 1, 2021, 11:23am

I am mapping a set of unconstrained values to a correlation matrix and working off 10.9 Correlation matrices | Stan Reference Manual in the manual (for which I am very grateful, as the LKJ papers are not at all easy to understand). However:

The second formula for w_{i, j} doesn’t seem correct, as it doesn’t return a Cholesky factor whose crossproduct is a correlation matrix.
I wasn’t sure if I had made a programming error, so I checked it against the first formula which does result in a correlation matrix.
In my attempt to figure out why the first formula wasn’t working for me, I also wrote down the actual transform that the Stan math library implements: math/cholesky_corr_constrain.hpp at 1d0b4a08b4a800a3e8d2c1a3559fbe9390b7584e · stan-dev/math · GitHub, which also results in a correlation matrix, but seems to be using a slightly different method from the one in the docs. I can see that the Stan math one is written to avoid an extra loop. But because the LKJ transformation is a bijection I think the fact it returns a different correlation matrix implies it is using a different method?.

Here is some code to demonstrate;

library(magrittr)
library(matrixcalc)

make_upper_diag_matrix <- function(x_compact, k) {
  z_mat <- matrix(data = 0, nrow = k, ncol = k)
  vec_counter <- 1
  
  for (ii in 1 : (k - 1)) {
    for (jj in (ii + 1) : k) {
      z_mat[ii, jj] <- x_compact[vec_counter]
      vec_counter <- vec_counter + 1
    }
  }
  
  return(z_mat)
}

# cholesky __upper__ triangular
# using the first formula for w_{i, j} from the manual
make_chol_from_cpc_first_formula <- function(z_cpc) {
  k <- ncol(z_cpc)
  w <- matrix(data = 0, nrow = k, ncol = k)
  
  w[1, 1] <- 1
  w[1, 2 : k] <- z_cpc[1, 2 : k]
  
  for (ii in 2 : k) {
    w[ii, ii] <- prod(sqrt(1 - z_cpc[1 : (ii - 1), ii]^2))
    if (ii == k) { next }
    
    for (jj in (ii + 1) : k) {
      w[ii, jj] <- z_cpc[ii, jj] * prod(sqrt(1 - z_cpc[1 : (ii - 1), jj]^2))
    }
  }
  
  return(crossprod(w))
}

# cholesky still  __upper__ triangular
# using the second formula for w_{i, j} from the manual
make_chol_from_cpc_second_formula <- function(z_cpc) {
  k <- ncol(z_cpc)
  w <- matrix(data = 0, nrow = k, ncol = k)
  
  w[1, 1] <- 1
  w[1, 2 : k] <- z_cpc[1, 2 : k]
  
  for (ii in 2 : k) {
    for (jj in ii : k) {
      term_one <- z_cpc[ii, jj] * w[ii - 1, jj]
      term_two <- sqrt(1 - (z_cpc[ii - 1, jj])^2)
      w[ii, jj] <- term_one * term_two
    }
  }
  
  return(crossprod(w))
}

# this is the math/rev/fun/cholesky_corr_constrain.hpp way of doing it
# note that the cholesky factor is now __lower__ triangular !!
make_corr_from_compact_math_lib <- function(x_compact, k) {
  w <- matrix(data = 0, nrow = k, ncol = k)
  counter <- 1
  w[1, 1] <- 1
  
  for (ii in 2 : k) {
    w[ii, 1] <- x_compact[counter]
    counter <- counter + 1
    temp_sum_square <- w[ii, 1]^2
    
    for (jj in 2 : ii) {
      if (jj == ii) { next }
      
      w[ii, jj] = x_compact[counter] * sqrt(1 - temp_sum_square)
      counter <- counter + 1
      temp_sum_square <- temp_sum_square + w[ii, jj]^2
    }
    
    w[ii, ii] <- sqrt(1 - temp_sum_square)
  }
  
  return(tcrossprod(w))
}
  
k <- 5 # dim of correlation matrix (k \times k)
n_unconstrained_pars <- choose(k, 2)
x_unconstrained <- rnorm(n = n_unconstrained_pars, mean = 0, sd = 0.5)

# first formulae
test_corr_first_formula <- x_unconstrained %>% 
  tanh() %>% 
  make_upper_diag_matrix(k = k) %>% 
  make_chol_from_cpc_first_formula()

# second formuale
test_corr_second_formula <-  x_unconstrained %>% 
  tanh() %>% 
  make_upper_diag_matrix(k = k) %>% 
  make_chol_from_cpc_second_formula()

test_corr_math_lib <- x_unconstrained %>% 
  tanh() %>% 
  make_corr_from_compact_math_lib(k = k)

matrixcalc::is.positive.definite(test_corr_first_formula)
#> [1] TRUE
matrixcalc::is.positive.definite(test_corr_second_formula) 
#> [1] FALSE
matrixcalc::is.positive.definite(test_corr_math_lib)
#> [1] TRUE

# first formula in docs -- is a correlation matrix
test_corr_first_formula
#>            [,1]        [,2]        [,3]        [,4]       [,5]
#> [1,]  1.0000000 -0.20286146 -0.57156605  0.12911546 -0.3833705
#> [2,] -0.2028615  1.00000000  0.03740415  0.09473022 -0.3934850
#> [3,] -0.5715660  0.03740415  1.00000000 -0.01102097  0.4231851
#> [4,]  0.1291155  0.09473022 -0.01102097  1.00000000  0.1251608
#> [5,] -0.3833705 -0.39348503  0.42318513  0.12516076  1.0000000

# second formula in docs -- is NOT a correlation matrix
test_corr_second_formula
#>            [,1]        [,2]        [,3]        [,4]        [,5]
#> [1,]  1.0000000 -0.20286146 -0.57156605  0.12911546 -0.38337051
#> [2,] -0.2028615  0.04115277  0.11594872 -0.02619255  0.07777110
#> [3,] -0.5715660  0.11594872  0.32878966 -0.07306701  0.22758037
#> [4,]  0.1291155 -0.02619255 -0.07306701  0.01692708 -0.04649934
#> [5,] -0.3833705  0.07777110  0.22758037 -0.04649934  0.18278035

# math lib implementation -- is a correlation matrix, but not the same as first
# formula
test_corr_math_lib
#>            [,1]        [,2]       [,3]        [,4]       [,5]
#> [1,]  1.0000000 -0.20286146 -0.5715660 -0.38337051 -0.5210758
#> [2,] -0.2028615  1.00000000  0.2196923 -0.01063644  0.1813084
#> [3,] -0.5715660  0.21969230  1.0000000  0.30269660  0.4837809
#> [4,] -0.3833705 -0.01063644  0.3026966  1.00000000  0.4346408
#> [5,] -0.5210758  0.18130844  0.4837809  0.43464084  1.0000000

identical(as.numeric(test_corr_first_formula), as.numeric(test_corr_math_lib))
#> [1] FALSE

^{Created on 2021-10-01 by the reprex package (v2.0.1)}

My questions are:

Can anyone figure out why the two formula for w_{i, j} in the docs disagree? I cannot see why/where the error creeps in.
What transform is the math library actually implementing? The D-vine based idea of Joe in Generating random correlation matrices based on partial correlations - ScienceDirect (the predecessor of the LKJ transform)?

spinkney · October 1, 2021, 12:54pm

I don’t have time to dive into this today but I remember that there was a question like this…

Does this thread answer your question?

hhau · October 1, 2021, 1:12pm

Indeed that thread is the same question as I have, and I even remember reading it now you’ve linked it. That will teach me to search better. Happy to have this merged in under that thread.

The second last post gets at the difference between the math library and the first formula. I’ll see what I can figure out.

spinkney · October 1, 2021, 1:13pm

Looks like the docs never got fixed. Do you want to take a crack at fixing it?

hhau · October 4, 2021, 8:19am

Hoping to get some more time on this topic this week – If I figure out what the doc should be / what is actually going on with the code & math then I’ll send a docs PR.

CerulloE · March 2, 2023, 6:50pm

Hey @hhau sorry to bump an old topic but i’m wondering do you know where the code for the Jacobian adjustment for the math/rev/fun/cholesky_corr_constrain.hpp is? (or know what it is?)

hhau · March 3, 2023, 7:23am

It is the final formula of https://mc-stan.org/docs/reference-manual/cholesky-factors-of-correlation-matrices-1.html.

In the code the jacobian terms directly modify the lp variable. This happens in two places for this transformation, the first being: https://github.com/stan-dev/math/blob/develop/stan/math/rev/fun/cholesky_corr_constrain.hpp#L116 and line 124, which implements the second half of the formula in the docs. The second is https://github.com/stan-dev/math/blob/develop/stan/math/prim/fun/corr_constrain.hpp#L48, which happens when corr_constrain(y, lp) is called (which is tanh in disguise), and implements the first term in the docs.

Note that the docs use \frac{\text{d}}{\text{d}y} \text{tanh}(y) = \frac{1}{(\text{cosh}(y))^2} but the code uses \frac{\text{d}}{\text{d}y} \text{tanh}(y) = 1- (\text{tanh}(y))^2, but fortunately wikipedia tells us that these are the same: https://en.wikipedia.org/wiki/Hyperbolic_functions#Derivatives

Topic		Replies	Views
LKJ inverse transform for correlation matrices (incorrect documentation?) Developers	15	1165	November 23, 2020
Possible Error in the Correlation Matrices (10.9) Section of the Stan Manual General	5	647	December 6, 2022
How does Stan sample correlation matrices? General	7	659	March 2, 2023
LKJ in stan function reference General	4	417	May 4, 2020
Question on cholesky_factor_corr and the resulting correlation matrix Modeling	4	1098	January 8, 2020

Incorrect formula for correlation matrix transform

Related topics