When will the schur decomposition be available? Also, I’m interested in the real case (as I saw the complex number post on Gelman’s blog at what can we do with complex numbers in stan.

I’m interested in writing the “univariate approach to multivariate case” Kalman filter (reference https://personal.vu.nl/s.j.koopman/old/publications/DKFastFiltering.pdf). In that paper they say that if the variance matrix H is not diagonal then one can use the Schur decomposition to get a new diagonal matrix and weight the observations with the Schur orthogonal matrices (see pages 5-6 of the pdf).

Related to Schur decomposition

Not for the near future. But that sort of thing is about the easiest case for defining your own C++ implementation of a Stan function you declare in the functions block.

I don’t know C++ but I spent an hour looking at how to do this. I first just attempted to get the schur decomposition working using RcppEigen, and I did! Then I looked at the stan/math lib to see how this may work in Stan. My attempt at using the templates is under the Rcpp block.

When I read the wiki for defining c++ stuff and calling it in the functions block this doesn’t look anything like it. As it appears the template is made automatically? Anyway, wondering if you could help to get it in the format to call.

Rcpp version

#include <Rcpp.h>
#include <RcppEigen.h>
using namespace Rcpp;
using Eigen::RealSchur;
using Eigen::MatrixXd;

//[[Rcpp::depends(RcppEigen)]]
// [[Rcpp::export]]
Rcpp::List compute_schur(MatrixXd x) {
RealSchur<MatrixXd> schur(x);
return Rcpp::List::create(_["U"] = schur.matrixU(),
_["T"] = schur.matrixT());
}
//run after the compilation.
//
/*** R
A = matrix(c(13, 8, 8,
-1, 7, -2,
-1, -2, 7), nrow = 3, byrow = T)
compute_schur(A)
*/


The template to define the return of U and T from A = UTU^T. I based it off the singular_values template in Stan math:

/**
* Return the unitary matrix of the Schur
* decomposition on a real matrix. The return value
* \f$U\f$ is such that the original matrix \f$A\f$
* is given by <p>\f$A = U \times T \times U^EigMat\f$.
*
* @tparam T type of elements in the matrix
* @param m Specified matrix.
* @return U matrix.
*/
template <typename T>
Eigen::Matrix<T, Eigen::Dynamic, 1> schur_U(
const Eigen::Matrix<T, Eigen::Dynamic, Eigen::Dynamic>& m) {
if (m.size() == 0) {
return {};
}
return Eigen::RealSchur<Eigen::Matrix<T, Eigen::Dynamic, Eigen::Dynamic> >(m)
.matrixU();
}

/**
* Return the upper triangular matrix of the Schur
* decomposition on a real matrix. The return value
* \f$T\f$ is such that the original matrix \f$A\f$
* is given by <p>\f$A = U \times T \times U^EigMat\f$.
*
* @tparam T type of elements in the matrix
* @param m Specified matrix.
* @return T matrix.
*/
template <typename T>
Eigen::Matrix<T, Eigen::Dynamic, 1> schur_T(
const Eigen::Matrix<T, Eigen::Dynamic, Eigen::Dynamic>& m) {
if (m.size() == 0) {
return {};
}
return Eigen::RealSchur<Eigen::Matrix<T, Eigen::Dynamic, Eigen::Dynamic> >(m)
.matrixT();
}

2 Likes

In case it’s at all helpful, I tried (and got stuck) implementing a complex schur decomposition for a Lyapunov solver here: Solve a Lyapunov / Sylvester equation: include custom c++ function using Eigen library -- possible?

I also tried the univariate approach to multivariate filtering before, I may have just gotten something wrong / misunderstood something, but it didn’t work equivalent to the multivariate form, which I concluded was because the likelihood was different – because in the univariate form the likelihood for indicator 3 at time 4 is conditional on the predicted state at time 4, and the observations of indicators 1 and 2 at time 4. Perhaps there is some other way to formulate it, I didn’t push too hard…

ctsem uses an extended kalman filter implemented in stan, a few examples here: https://cdriver.netlify.app/
would be happy to chat about any of this, combine some ideas / work perhaps…

1 Like

The second version from you post (as in Stan math) looks at first glance that it should almost work. With

functions {
matrix shur_T(matrix m);
}


you AFAIK only need to add , std::ostream *msgs as an additional argument (which you can ignore), so having

template <typename T>
Eigen::Matrix<T, Eigen::Dynamic, 1> schur_T(
const Eigen::Matrix<T, Eigen::Dynamic, Eigen::Dynamic>& m, std::ostream *msgs) {
....
}


should IMHO work. What is the problem you are running into?

2 Likes

If I update the file (called ‘external_test.hpp’) to

template <typename T>
Eigen::Matrix<T, Eigen::Dynamic, 1> schur_U(
const Eigen::Matrix<T, Eigen::Dynamic, Eigen::Dynamic>& m, std::ostream *msgs) {
if (m.size() == 0) {
return {};
}
return Eigen::RealSchur<Eigen::Matrix<T, Eigen::Dynamic, Eigen::Dynamic> >(m)
.matrixU();
}


And then I run the following in R

library(rstan)

stancode <- 'functions {
// matrix schur_T(matrix A);
matrix schur_U(matrix A);
}
data {
matrix[3, 3] A;
}
parameters {
real b;
}
transformed parameters {
//  matrix T = schur_T(A);
matrix[3, 3] U = schur_U(A);
}
model {
b ~ std_normal();
}
'

mod <- stan_model(model_code = stancode,
allow_undefined = TRUE,
includes = paste0('\n#include "',
file.path(getwd(), 'external_test.hpp'), '"\n')
)

try(readLines(stanc(model_code = stancode, allow_undefined = TRUE)$cppcode))  I get the error messages from stan_model(...) Error in compileCode(f, code, language = language, verbose = verbose) : Compilation ERROR, function(s)/method(s) not created! In file included from <built-in>:1: In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp:4: In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Dense:1: In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Core:535: /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas] #pragma clang diagnostic pop ^ In file included from <built-in>:1: In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp:4: In file included from /Library/Frameworks/R.framework/Versions/4. In addition: Warning message: In system(cmd, intern = !verbose) : running command '/Library/Frameworks/R.framework/Resources/bin/R CMD SHLIB file808778475ee3.cpp 2> file808778475ee3.cpp.err.txt' had status 1 Error in sink(type = "output") : invalid connection  and try(readLines(stanc(model_code = stancode, allow_undefined = TRUE)$cppcode))

Error in file(con, "r") : cannot open the connection
In file(con, "r") :
cannot open file '// Code generated by Stan version 2.19.1

namespace model808715a6c6c6_stancode_namespace {

using std::istream;
using std::string;
using std::stringstream;
using std::vector;
using stan::io::dump;
using stan::math::lgamma;
using namespace stan::math;

static int current_statement_begin__;

}

template <typename T0__>
Eigen::Matrix<typename boost::math::tools::promote_args<T0__>::type, Eigen::Dynamic, Eigen::Dynamic>
schur_U(const Eigen::Matrix<T0__, Eigen::Dynamic, Eigen::Dynamic>& A, std::ostream* pstream__);

class model808715a6c6c6_stancode : public prob_grad {
private:
matrix_d A;
public:
model808715a6c6c6_stancode(stan::io::var_context& context__,
[... truncated]


Thanks for the links! I’ve tried some things that were mentioned in them but to no avail.

As for the univariate approach…in the KFAS R package (@helske ) paper (https://www.jstatsoft.org/article/view/v078i10) on page 3 he lists the assumptions, that a_{1,1} \sim N(a_1, P_1) and that H_t is diagonal for all t. Because of that assumption on H I’m attempting this schur decomp. But @helske also says that we can use the LDL decomposition. So I guess I can use the cholesky implementation and build up L^\text{LDL} from L^\text{Cholesky}.

As for possible collaboration, let me explain a bit about my model. I’m just trying to build up some multivariate time varying mean with trend and seasonal component models. But with the requirement that they be fast (:D) which why I’m attempting the kalman filter. I’ve played around with partial pooling of the trend and seasonal components as my data have J countries, S sites in each country, P platforms and D demographic groups, so it gets quite large and unwieldy.

This appears to be primarily an issue with your installation. On my system, I can get the code to compile (after a few adjustments I missed originally), here’s what my .hpp looks like:


template <typename T>
Eigen::Matrix<typename boost::math::tools::promote_args<T>::type, Eigen::Dynamic, Eigen::Dynamic>
schur_U(const Eigen::Matrix<T, Eigen::Dynamic, Eigen::Dynamic>& m, std::ostream* pstream__)
{
if (m.size() == 0) {
return {};
}
return Eigen::RealSchur<Eigen::Matrix<T, Eigen::Dynamic, Eigen::Dynamic> >(m)
.matrixU();
}


Although the function signature might look like magic, I’ve just taken the signature generated by Stan for the function and copied it, so no advanced knowledge here :-)

I am not very good at debugging toolchain issues (and I am on Windows), so I can’t help you much (maybe start a different topic, if you can’t figure it out)

2 Likes

I was testing on a mac (personal computer I use when I have my morning coffee). I get a bucket load of warnings even when I compile in Rcpp but the function works. In Stan it just errors out.

I’ll test on my windows machine through work shortly and report back. If that fails I’ll start a new post.

Thanks for testing, at least now I know it should work :)

It compiles and runs, yah (on R 3.6.3, had to switch from 4.0 as I’m getting the cannot allocate vector of size huge_number gb error)

> fit <- sampling(mod, data = list(A = matrix(c(13, 8, 8,
+                                               -1, 7, -2,
+                                               -1, -2, 7), nrow = 3, byrow = T) ))

> fit
Inference for Stan model: d655d4f6507e798ab50bec62e7aa52e0.
4 chains, each with iter=2000; warmup=1000; thin=1;
post-warmup draws per chain=1000, total post-warmup draws=4000.

mean se_mean   sd  2.5%   25%   50%   75% 97.5% n_eff Rhat
b      -0.02    0.02 0.97 -1.89 -0.69 -0.03  0.63  1.90  1709 1.00
U[1,1]  0.94    0.00 0.00  0.94  0.94  0.94  0.94  0.94     2 1.00
U[1,2] -0.33    0.00 0.00 -0.33 -0.33 -0.33 -0.33 -0.33     2 1.00
U[1,3]  0.00     NaN 0.00  0.00  0.00  0.00  0.00  0.00   NaN  NaN
U[2,1] -0.24    0.00 0.00 -0.24 -0.24 -0.24 -0.24 -0.24     2 1.00
U[2,2] -0.67    0.00 0.00 -0.67 -0.67 -0.67 -0.67 -0.67     2 1.00
U[2,3] -0.71    0.00 0.00 -0.71 -0.71 -0.71 -0.71 -0.71     2 1.00
U[3,1] -0.24    0.00 0.00 -0.24 -0.24 -0.24 -0.24 -0.24     2 1.00
U[3,2] -0.67    0.00 0.00 -0.67 -0.67 -0.67 -0.67 -0.67     2 1.00
U[3,3]  0.71    0.00 0.00  0.71  0.71  0.71  0.71  0.71     2 1.00
lp__   -0.47    0.02 0.67 -2.34 -0.62 -0.22 -0.05  0.00  1583 1.01

Samples were drawn using NUTS(diag_e) at Tue May 05 12:18:25 2020.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).

1 Like

Yes you can just use LDL or Cholesky decomposition instead of Schur, even D&K don’t mention Schur outside that one paper (their notation etc is a bit tricky sometimes as it changes between papers and book).

But in order to really speed up the computations, it is likely more efficient to directly parameterize your covariance matrices via Cholesky/LDL, and then you just need to do the necessary transformations of the univariate approach instead of re-decomposing your covariance matrix at each sampling step.

2 Likes

Thanks for chiming in, that does seem like a better approach! Also, in your paper you say “If standard multivariate matrices F_t and K_t are needed for inference, they can be computed later from the results of the univariate ﬁlter” (page 4). Would you happen to have an example calculation?

Posting here in case anyone else finds it helpful, if you want both the matrices out for the Schur decomposition here’s the code.

functions {
matrix[ ] schur(matrix A);
}
...
// where A is defined as N x N then
// schur(A) returns a 2d array of N x N matrices
matrix[N, N] A;
matrix[N, N] sUT[2] = schur(A);


and have a .hpp file saved, I’ve named it external.hpp with the following inside

template <typename T>
std::vector<Eigen::Matrix<typename boost::math::tools::promote_args<T>::type, Eigen::Dynamic, Eigen::Dynamic>>
schur(const Eigen::Matrix<T, Eigen::Dynamic, Eigen::Dynamic>& m, std::ostream* pstream__)
{
if (m.size() == 0) {
return {};
}

Eigen::RealSchur<Eigen::Matrix<T, Eigen::Dynamic, Eigen::Dynamic>> schur(m);

return {schur.matrixU(), schur.matrixT()};
}


then in R you can compile with

mod <- stan_model(model_code = stancode,
allow_undefined = TRUE,
includes = paste0('\n#include "',
file.path('~/external.hpp'), '"\n')
)

3 Likes

I can’t remember anymore if there is a more clever way, but probably the most straightforward way to get the multivariate versions of F_t and K_t is by using the Kalman filtering recursions i.e.
F_t = Z_t P_t Z_t' + H_t and K_t = P_t Z_t, as the P_{p,t} (where p is the number of series) from univariate version matches the multivariate P_t.

It’ll probalby be a few releases (as in six months to a year) before we can add to the language sensibly. Also, for the Schur in particular, we haven’t worked out the autodiff for complex matrix arguments—there’s a problemw ith overpromotion of intermediate types that we need to trace through Eigen to solve.