Hi Alejandro!
There’s code I wrote for the embedded Laplace approximation that does exactly what you’re looking for; in particular see this file, lines 168 - 196 and 252. Note that this code lives on an experimental branch of Stan, so you won’t find it in any released version (yet).
You’re right about using Jacobian()
. Unfortunately, this (higher-order) function places constraints on the signature of f, the function which gets differentiated. Specifically, f needs to take in and return a vector, and Jacobian()
then computes the full Jacobian \partial f_i / \partial x_j. In C++, f
is a structure with an operator (y)
that returns f(y), where y are the elements for which you require sensitivities / derivatives.
So you need to coerce the kernel function, K: x, y \rightarrow K(x, y) into a functor f_x: y^* \rightarrow K^*(x, y), where y^* stores the elements of y as an Eigen::Vector
, and K^* those of K.
In my code lines 168 - 196 build a structure around K:
template <typename K>
struct covariance_sensitivities {
/* input data for the covariance function. */
std::vector<Eigen::VectorXd> x_;
/* number of latent variables. */
int theta_size_;
/* structure to compute the covariance function. */
K covariance_function_;
covariance_sensitivities (const std::vector<Eigen::VectorXd>& x,
int theta_size,
const K& covariance_function) :
x_(x), theta_size_(theta_size),
covariance_function_(covariance_function) { }
template <typename T>
Eigen::Matrix<T, Eigen::Dynamic, 1>
operator() (const Eigen::Matrix<T, Eigen::Dynamic, 1>& phi) const {
return to_vector(covariance_function_(phi, x_, theta_size_));
}
};
and line 252 calls Jacobian
:
jacobian_fwd(f, value_of(phi), covariance_vector, diff_cov);
Note that Jacobian
exists both under the rev
and fwd
directories, respectively to do reverse and forward mode derivatives. Which mode to use depends on your map f: \mathbb R^n \rightarrow \mathbb R^m, specifically the dimensions m and n. Comparing both is an easy and enlightening exercise. In this particular example, there is one autodiff method we don’t use in Stan which could be quite useful, which is taping, something worth thinking about.
To complement @Bob_Carpenter’s reference on autodiff, see my review on autodiff.
EDIT: the code is a prototype, there are a lot of things that can be improved, and it’d be worth discussing how.