Thanks Ben and Bob for the guidance.

Following Ben’s advice to link to Eigen, I searched a bit more and found this very helpful tutorial by @bbbales2 and colleagues. In case it’s useful to anyone else in the future, below is the C++ function I came up with and a small example Stan model. I still need to benchmark how the sparse vs. dense LU decomposition scales for large matrices – I’ll update the post with that when I have a chance.

One question I had was about best practice for error handling. On a couple of test examples my R session was occasionally crashing during the call to `sampling()` (after the model compiled successfully). It was hard to trace why, but I suspected an uncaught failure of the LU decomposition. I added the following check to the function, but I’m not sure this is the best practice.

```
if (lu.info() != Eigen::Success)
  throw std::runtime_error("Sparse LU decomposition error");
```
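One thing that might help, in case the crashes come from a numerically singular matrix encountered mid-sampling: Stan treats a thrown `std::domain_error` as a recoverable error and rejects the current proposal rather than terminating, so throwing that type instead may be gentler. A sketch (untested here):

```
// Sketch: std::domain_error signals a recoverable error to Stan, so the
// current proposal is rejected instead of sampling (or the session) dying.
if (lu.info() != Eigen::Success)
  throw std::domain_error("Sparse LU decomposition failed");
```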

Thanks,

Jeff

## C++ function

The C++ function below accepts a triplet list encoding a square sparse matrix as arguments `i`, `j`, and `x`, with argument `dim` specifying the number of rows and columns of the matrix. The indices `i` and `j` are 1-based (Stan convention) and are converted to 0-based indices for Eigen. The function constructs an `Eigen::SparseMatrix` object, then computes the LU decomposition and returns the log absolute determinant.

```
cppfun <- '
template <typename T3__>
typename boost::math::tools::promote_args<T3__>::type
sparse_log_determinant(const int& dim,
                       const std::vector<int>& i,
                       const std::vector<int>& j,
                       const Eigen::Matrix<T3__, Eigen::Dynamic, 1>& x,
                       std::ostream* pstream__) {
  // build triplet list, converting 1-based Stan indices to 0-based Eigen indices
  typedef Eigen::Triplet<T3__> T;
  std::vector<T> tripletList;
  tripletList.reserve(i.size());
  for (size_t idx = 0; idx < i.size(); idx++)
    tripletList.push_back(T(i[idx] - 1, j[idx] - 1, x[idx]));
  // assemble the sparse matrix and factorize
  Eigen::SparseMatrix<T3__> Js(dim, dim);
  Js.setFromTriplets(tripletList.begin(), tripletList.end());
  const Eigen::SparseLU<Eigen::SparseMatrix<T3__>> lu(Js);
  if (lu.info() != Eigen::Success)
    throw std::runtime_error("Sparse LU decomposition error");
  return lu.logAbsDeterminant();
}
'
```
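As a quick sanity check of the triplet convention, the same encoding can be verified against R’s sparse determinant. A sketch, assuming the Matrix package is available (this doesn’t call the Stan function itself):

```
## Sketch: verify the 1-based triplet encoding against R's sparse determinant
## (illustrative only; assumes the Matrix package)
library(Matrix)
i <- c(1L, 2L); j <- c(1L, 2L); x <- c(2, 3)  # the 2x2 matrix [2 0; 0 3]
A <- sparseMatrix(i = i, j = j, x = x, dims = c(2, 2))
determinant(A, logarithm = TRUE)$modulus  # log(6), what sparse_log_determinant(2, i, j, x) should return
```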

## Calling from Stan

Save the C++ code to a file.

```
cppfile <- tempfile()
writeLines(cppfun, cppfile)
```

In the Stan model code, declare a user-defined function with its declaration (signature) only, omitting the function body.

```
stancode <- "
functions {
real sparse_log_determinant(int dim, int[] i, int[] j, vector x);
}
"
```

Compile the Stan model with the option `allow_undefined = TRUE`, which permits declaring the function without a body, and provide the C++ code file via the `includes` argument.

```
sm <- stan_model(model_code = stancode,
                 allow_undefined = TRUE,
                 includes = paste0('\n#include "', cppfile, '"\n'))
```

A couple of helpful notes:

- Bales *et al.* and the rstan vignette describe how to ascertain the templated C++ signature expected by Stan; see the snippet after this list.
- The `#include` statement in `stan_model()` requires the *full path* to the C++ file.
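For the first point, one way to see the signature rstan expects is to inspect the generated C++. A sketch using `rstan::stanc()`, whose return value includes the translated code in its `cppcode` element:

```
# Print the C++ translation of the Stan program; the forward declaration of
# sparse_log_determinant near the top shows the templated signature to match.
cat(stanc(model_code = stancode, allow_undefined = TRUE)$cppcode)
```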

## Testing the code: spherical coordinates transformation

To test that the code is doing the right thing, the following example transforms spherical to Cartesian coordinates and samples each coordinate from an independent normal distribution:

$$
\begin{aligned}
x &\sim \mathrm{Normal}(0, 1) \\
y &\sim \mathrm{Normal}(-1, 1.5) \\
z &\sim \mathrm{Normal}(3, 0.5)
\end{aligned}
$$

The transformed parameters $x$, $y$, and $z$ are functions of the spherical parameters $r > 0$, $\theta \in [0, \pi]$, and $\phi \in [0, 2\pi)$:

$$
\begin{aligned}
x &= r\sin\theta\cos\phi \\
y &= r\sin\theta\sin\phi \\
z &= r\cos\theta
\end{aligned}
$$

The Jacobian of the transformation is:

$$
J = \begin{bmatrix}
\sin\theta\cos\phi & r\cos\theta\cos\phi & -r\sin\theta\sin\phi \\
\sin\theta\sin\phi & r\cos\theta\sin\phi & r\sin\theta\cos\phi \\
\cos\theta & -r\sin\theta & 0
\end{bmatrix}
$$

For comparison, the analytical log absolute determinant of $J$ is $\log |\det J| = 2\log(r) + \log(\sin\theta)$.
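This follows from expanding $\det J$ along the third row, which recovers the familiar spherical volume element:

$$
\det J = \cos\theta \, (r^2 \sin\theta\cos\theta) + r\sin\theta \, (r \sin^2\theta) = r^2 \sin\theta
$$

and $\sin\theta \geq 0$ on $[0, \pi]$, so $\log |\det J| = 2\log(r) + \log(\sin\theta)$.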

### Stan model

The Stan code below implements this model.

```
stancode <- "
functions {
  real sparse_log_determinant(int dim, int[] i, int[] j, vector x);
}
transformed data {
  // indices for non-zero elements of Jacobian in triplet form
  int i[8] = {1, 2, 3, 1, 2, 3, 1, 2};
  int j[8] = {1, 1, 1, 2, 2, 2, 3, 3};
}
parameters {
  real<lower=0> r;
  real<lower=0, upper=pi()> theta;
  real<lower=0, upper=2*pi()> phi;
}
transformed parameters {
  // spherical -> cartesian coordinates
  real x = r * sin(theta) * cos(phi);
  real y = r * sin(theta) * sin(phi);
  real z = r * cos(theta);
  //
  // non-zero Jacobian values in triplet form
  vector[8] Js;
  Js[1] = sin(theta) * cos(phi);
  Js[2] = sin(theta) * sin(phi);
  Js[3] = cos(theta);
  Js[4] = r * cos(theta) * cos(phi);
  Js[5] = r * cos(theta) * sin(phi);
  Js[6] = -r * sin(theta);
  Js[7] = -r * sin(theta) * sin(phi);
  Js[8] = r * sin(theta) * cos(phi);
}
model {
  target += sparse_log_determinant(3, i, j, Js);
  x ~ normal(0, 1);
  y ~ normal(-1, 1.5);
  z ~ normal(3, 0.5);
}
generated quantities {
  real logdet_sparse = sparse_log_determinant(3, i, j, Js);
  real logdet_analytical = 2 * log(r) + log(sin(theta));
}
"
```

Compile and sample from the model:

```
sm <- stan_model(model_code = stancode,
                 allow_undefined = TRUE,
                 includes = paste0('\n#include "', cppfile, '"\n'))
fit <- sampling(sm, seed = 17)
```

Confirm that `x`, `y`, and `z` have the desired means and standard deviations.

```
print(fit, c("x", "y", "z"));
```

```
## Inference for Stan model: 420e358b1dd4b1c9a78473067fd0b1da.
## 4 chains, each with iter=2000; warmup=1000; thin=1;
## post-warmup draws per chain=1000, total post-warmup draws=4000.
##
## mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
## x -0.01 0.02 0.99 -1.94 -0.71 0.00 0.67 1.89 2539 1
## y -1.02 0.06 1.54 -3.96 -2.09 -1.02 0.03 2.07 601 1
## z 3.00 0.01 0.51 2.00 2.66 3.01 3.34 4.00 2844 1
##
## Samples were drawn using NUTS(diag_e) at Sun Jun 9 19:37:01 2019.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at
## convergence, Rhat=1).
```

Confirm that the log determinant calculated by `Eigen::SparseLU` matches the analytical log determinant.

```
print(fit, c("logdet_sparse", "logdet_analytical"));
```

```
## Inference for Stan model: 420e358b1dd4b1c9a78473067fd0b1da.
## 4 chains, each with iter=2000; warmup=1000; thin=1;
## post-warmup draws per chain=1000, total post-warmup draws=4000.
##
## mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
## logdet_sparse 1.71 0.03 0.8 -0.02 1.22 1.77 2.26 3.05 676 1
## logdet_analytical 1.71 0.03 0.8 -0.02 1.22 1.77 2.26 3.05 676 1
##
## Samples were drawn using NUTS(diag_e) at Sun Jun 9 19:37:01 2019.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at
## convergence, Rhat=1).
```

## `sessionInfo()`

```
sessionInfo()
```

```
## R version 3.5.2 (2018-12-20)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS Mojave 10.14.4
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] rstan_2.18.2 StanHeaders_2.18.1 ggplot2_3.1.0.9000
##
## loaded via a namespace (and not attached):
## [1] Rcpp_1.0.1 pillar_1.4.0 compiler_3.5.2
## [4] plyr_1.8.4 highr_0.7 prettyunits_1.0.2
## [7] tools_3.5.2 digest_0.6.19 pkgbuild_1.0.3
## [10] evaluate_0.13 tibble_2.1.1 gtable_0.3.0
## [13] pkgconfig_2.0.2 rlang_0.3.4 cli_1.1.0
## [16] parallel_3.5.2 yaml_2.2.0 xfun_0.5
## [19] loo_2.1.0 gridExtra_2.3 withr_2.1.2
## [22] dplyr_0.8.1 stringr_1.4.0 knitr_1.22
## [25] stats4_3.5.2 grid_3.5.2 tidyselect_0.2.5
## [28] glue_1.3.1 inline_0.3.15 R6_2.4.0
## [31] processx_3.3.1 rmarkdown_1.11 purrr_0.3.2
## [34] callr_3.2.0 magrittr_1.5 codetools_0.2-16
## [37] matrixStats_0.54.0 ps_1.3.0 scales_1.0.0
## [40] htmltools_0.3.6 assertthat_0.2.1 colorspace_1.4-1
## [43] stringi_1.4.3 lazyeval_0.2.2 munsell_0.5.0
## [46] crayon_1.3.4
```