Ldlt factorizations

maintenance
math

#1

@bgoodri @syclik

I’ve been digging around in the LDLT decompositions in stan/math for this (https://github.com/stan-dev/math/issues/565) issue (pull req is sitting here: https://github.com/stan-dev/math/pull/927).

It seems like they’re used in a bunch of places where matrices should be positive definite. In that case, why aren’t we using LLTs? Going by the Eigen tables, this is desirable: https://eigen.tuxfamily.org/dox/group__TutorialLinearAlgebra.html

I guess the LDLT is going to keep us from having to add small values to the diagonals of all the matrices it processes, but the LLTs are faster.

What do you think about switching to LLT and getting rid of the LDLT stuff?

I’m just checking because I’m going to be doing a little gymnastics to get the LDLT variables allocated properly anyway to solve the original issue at the top of this thread. LLT would be easier to work around because it doesn’t need to save any row transpositions (which is what makes it less stable and faster)

Here’s a summary of where I think the LDLT is being used:

grep -r "_ldlt" ../math-565/stan/math/prim/mat/prob
../math-565/stan/math/prim/mat/prob/matrix_normal_prec_lpdf.hpp:#include <stan/math/prim/mat/err/check_ldlt_factor.hpp>
../math-565/stan/math/prim/mat/prob/matrix_normal_prec_lpdf.hpp:#include <stan/math/prim/mat/fun/log_determinant_ldlt.hpp>
../math-565/stan/math/prim/mat/prob/matrix_normal_prec_lpdf.hpp:  check_ldlt_factor(function, "LDLT_Factor of Sigma", ldlt_Sigma);
../math-565/stan/math/prim/mat/prob/matrix_normal_prec_lpdf.hpp:  check_ldlt_factor(function, "LDLT_Factor of D", ldlt_D);
../math-565/stan/math/prim/mat/prob/matrix_normal_prec_lpdf.hpp:    lp += log_determinant_ldlt(ldlt_Sigma) * (0.5 * y.rows());
../math-565/stan/math/prim/mat/prob/matrix_normal_prec_lpdf.hpp:    lp += log_determinant_ldlt(ldlt_D) * (0.5 * y.cols());
../math-565/stan/math/prim/mat/prob/multi_student_t_lpdf.hpp:#include <stan/math/prim/mat/err/check_ldlt_factor.hpp>
../math-565/stan/math/prim/mat/prob/multi_student_t_lpdf.hpp:  check_ldlt_factor(function, "LDLT_Factor of scale parameter", ldlt_Sigma);
../math-565/stan/math/prim/mat/prob/multi_student_t_lpdf.hpp:    lp -= 0.5 * log_determinant_ldlt(ldlt_Sigma) * size_vec;
../math-565/stan/math/prim/mat/prob/multi_student_t_lpdf.hpp:          += log1p(trace_inv_quad_form_ldlt(ldlt_Sigma, y_minus_mu) / nu);
../math-565/stan/math/prim/mat/prob/wishart_lpdf.hpp:#include <stan/math/prim/mat/err/check_ldlt_factor.hpp>
../math-565/stan/math/prim/mat/prob/wishart_lpdf.hpp:#include <stan/math/prim/mat/fun/log_determinant_ldlt.hpp>
../math-565/stan/math/prim/mat/prob/wishart_lpdf.hpp:#include <stan/math/prim/mat/fun/mdivide_left_ldlt.hpp>
../math-565/stan/math/prim/mat/prob/wishart_lpdf.hpp:  check_ldlt_factor(function, "LDLT_Factor of random variable", ldlt_W);
../math-565/stan/math/prim/mat/prob/wishart_lpdf.hpp:  check_ldlt_factor(function, "LDLT_Factor of scale parameter", ldlt_S);
../math-565/stan/math/prim/mat/prob/wishart_lpdf.hpp:    lp -= 0.5 * nu * log_determinant_ldlt(ldlt_S);
../math-565/stan/math/prim/mat/prob/wishart_lpdf.hpp:        mdivide_left_ldlt(ldlt_S, static_cast<Matrix<T_y, Dynamic, Dynamic> >(
../math-565/stan/math/prim/mat/prob/wishart_lpdf.hpp:    lp += 0.5 * (nu - k - 1.0) * log_determinant_ldlt(ldlt_W);
../math-565/stan/math/prim/mat/prob/multi_gp_lpdf.hpp:#include <stan/math/prim/mat/err/check_ldlt_factor.hpp>
../math-565/stan/math/prim/mat/prob/multi_gp_lpdf.hpp:#include <stan/math/prim/mat/fun/log_determinant_ldlt.hpp>
../math-565/stan/math/prim/mat/prob/multi_gp_lpdf.hpp:#include <stan/math/prim/mat/fun/trace_gen_inv_quad_form_ldlt.hpp>
../math-565/stan/math/prim/mat/prob/multi_gp_lpdf.hpp:  check_ldlt_factor(function, "LDLT_Factor of Sigma", ldlt_Sigma);
../math-565/stan/math/prim/mat/prob/multi_gp_lpdf.hpp:    lp -= 0.5 * log_determinant_ldlt(ldlt_Sigma) * y.rows();
../math-565/stan/math/prim/mat/prob/multi_gp_lpdf.hpp:    lp -= 0.5 * trace_gen_inv_quad_form_ldlt(w_mat, ldlt_Sigma, yT);
../math-565/stan/math/prim/mat/prob/inv_wishart_lpdf.hpp:#include <stan/math/prim/mat/err/check_ldlt_factor.hpp>
../math-565/stan/math/prim/mat/prob/inv_wishart_lpdf.hpp:#include <stan/math/prim/mat/fun/log_determinant_ldlt.hpp>
../math-565/stan/math/prim/mat/prob/inv_wishart_lpdf.hpp:#include <stan/math/prim/mat/fun/mdivide_left_ldlt.hpp>
../math-565/stan/math/prim/mat/prob/inv_wishart_lpdf.hpp:  check_ldlt_factor(function, "LDLT_Factor of random variable", ldlt_W);
../math-565/stan/math/prim/mat/prob/inv_wishart_lpdf.hpp:  check_ldlt_factor(function, "LDLT_Factor of scale parameter", ldlt_S);
../math-565/stan/math/prim/mat/prob/inv_wishart_lpdf.hpp:    lp += 0.5 * nu * log_determinant_ldlt(ldlt_S);
../math-565/stan/math/prim/mat/prob/inv_wishart_lpdf.hpp:    lp -= 0.5 * (nu + k + 1.0) * log_determinant_ldlt(ldlt_W);
../math-565/stan/math/prim/mat/prob/inv_wishart_lpdf.hpp:        Winv_S(mdivide_left_ldlt(
../math-565/stan/math/prim/mat/prob/multi_normal_prec_lpdf.hpp:#include <stan/math/prim/mat/err/check_ldlt_factor.hpp>
../math-565/stan/math/prim/mat/prob/multi_normal_prec_lpdf.hpp:#include <stan/math/prim/mat/fun/log_determinant_ldlt.hpp>
../math-565/stan/math/prim/mat/prob/multi_normal_prec_lpdf.hpp:  check_ldlt_factor(function, "LDLT_Factor of precision parameter", ldlt_Sigma);
../math-565/stan/math/prim/mat/prob/multi_normal_prec_lpdf.hpp:    lp += 0.5 * log_determinant_ldlt(ldlt_Sigma) * size_vec;
../math-565/stan/math/prim/mat/prob/multi_normal_rng.hpp:#include <stan/math/prim/mat/fun/trace_inv_quad_form_ldlt.hpp>
../math-565/stan/math/prim/mat/prob/multi_normal_rng.hpp:#include <stan/math/prim/mat/fun/log_determinant_ldlt.hpp>
../math-565/stan/math/prim/mat/prob/multi_student_t_rng.hpp:#include <stan/math/prim/mat/err/check_ldlt_factor.hpp>
../math-565/stan/math/prim/mat/prob/multi_normal_lpdf.hpp:#include <stan/math/prim/mat/err/check_ldlt_factor.hpp>
../math-565/stan/math/prim/mat/prob/multi_normal_lpdf.hpp:#include <stan/math/prim/mat/fun/trace_inv_quad_form_ldlt.hpp>
../math-565/stan/math/prim/mat/prob/multi_normal_lpdf.hpp:#include <stan/math/prim/mat/fun/log_determinant_ldlt.hpp>
../math-565/stan/math/prim/mat/prob/multi_normal_lpdf.hpp:  check_ldlt_factor(function, "LDLT_Factor of covariance parameter",
../math-565/stan/math/prim/mat/prob/multi_normal_lpdf.hpp:    lp -= 0.5 * log_determinant_ldlt(ldlt_Sigma) * size_vec;
../math-565/stan/math/prim/mat/prob/multi_normal_lpdf.hpp:      sum_lp_vec += trace_inv_quad_form_ldlt(ldlt_Sigma, y_minus_mu);

#2

Go for it! The reason is that it’s work. It takes work to implement. More work to test. Even more to maintain. But… it’s just work. If you have time, go and do it.

That said, I’d say time it to show that it is faster before doing a lot of work.


#3

This would include getting rid of a few of _ldlt functions, is that fine as long as they’re only used internally in the Math library?

The list is something like

mdivide_left_ldlt
trace_inv_quad_form_ldlt
trace_gen_inv_quad_form_ldlt
log_determinant_ldlt

#4

I think by “large”, Eigen means matrices that we would soon hand off to the GPU. The LDLt decomposition is more numerically accurate.


#5

Eh, it’s all the way up if these tables are to be trusted: https://eigen.tuxfamily.org/dox/group__DenseDecompositionBenchmark.html

The speed gains are pretty marginal at the bottom though, I agree. I’m more interested in avoiding the memory gymnastics to avoid using the C++ heap for LDLT.


#6

@syclik I went and talked to @bgoodri about this. He recommended keeping ldlt for the extra robustness, so I’ll try to work around my issue.


#7

Ok. Thanks for keeping us posted.

You’re more than welcome to add a different implementation if you think it’ll work.