I have a C++ function (using Rcpp) which calculates a vector (specifically the gradient vector of a log-posterior function) and wish to compute the Hessian matrix. Before doing this manually I want to do it with autodiff.
To compute the gradients using Stan math autodiff, the code (at the end of the function) is:
///// compute gradients using Stan Math autodiff
std::vector<stan::math::var> theta_grad;
for (int i = 0; i < n_params; i++) {
  // theta_var_std is the parameter vector - a std::vector of stan::math::var
  theta_grad.push_back(theta_var_std[i]);
}
std::vector<double> gradient_std_vec;  // store gradient here
log_prob_out.grad(theta_grad, gradient_std_vec);
stan::math::recover_memory();
Could somebody please tell me how to extend the code above to a vector-valued (rather than scalar-valued) function, so that it outputs a Jacobian matrix (in my case a Hessian) rather than a gradient vector?
Note that, even though I'm calculating a Hessian, only first-order autodiff is needed, since I've also hand-coded the vector-valued function that computes the gradients. So I assume the code I need shouldn't be too different from the above? I'm just not sure how to extend it for a matrix (Jacobian) output.
Also - I forgot to say that currently (as a first step) I'm only interested in calculating the *diagonal* of the Jacobian - not sure if this makes things any easier (or less computationally intensive) from an autodiff standpoint?
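For context, here is a minimal sketch of what I'm after, using Stan Math's `stan::math::jacobian` functional. The lambda here is a toy stand-in for my hand-coded gradient function (the gradient of log p(theta) = -0.5 * theta' * theta is -theta, so the Hessian should come out as -I); `grad_fn`, `x`, `fx`, and `H` are names I've made up for illustration:

```cpp
#include <stan/math.hpp>
#include <Eigen/Dense>
#include <iostream>

int main() {
  // Toy stand-in for the hand-coded gradient function:
  // grad of log p(theta) = -0.5 * theta' * theta is -theta, so the
  // Jacobian of the gradient (i.e. the Hessian) is -I.
  auto grad_fn = [](const auto& theta) { return (-theta).eval(); };

  Eigen::VectorXd x(3);
  x << 0.5, -1.0, 2.0;   // evaluation point
  Eigen::VectorXd fx;    // gradient evaluated at x
  Eigen::MatrixXd H;     // Jacobian of the gradient = Hessian

  // One reverse-mode sweep per output component of grad_fn.
  stan::math::jacobian(grad_fn, x, fx, H);

  std::cout << H << "\n";  // for this toy function: -I (3 x 3)
}
```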
OK, so I have since tried the following to get the first-order derivatives (only the diagonal of the Hessian is needed, which, as I mentioned, still only requires first-order reverse-mode autodiff since the gradient function is hand-coded):
for (int i = 0; i < n_params; i++) {
  std::vector<stan::math::var> gradient_coeffs_var_temp;
  gradient_coeffs_var_temp.push_back(theta_var_std[i]);
  std::vector<double> gradient_coeffs_std_temp;
  beta_grad_vec(i).grad(gradient_coeffs_var_temp, gradient_coeffs_std_temp);
  stan::math::recover_memory();
  out_mat(i, 2) = gradient_coeffs_std_temp[0];  // store output here
}
However, only the first call (i.e. i = 0) is correct - the rest are way off. It seems like "grad" doesn't like being called more than once: if I start with i = 1 then only that one is correct, and so on. Is there a way to call "grad" repeatedly like this for a vector-valued function?
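(For reference, I suspect the problem is twofold: `recover_memory()` inside the loop frees the autodiff tape that the remaining `grad` calls still need, and successive `grad` calls accumulate adjoints from earlier sweeps unless they are zeroed. A sketch of how the loop might be restructured, assuming the same `theta_var_std`, `beta_grad_vec` and `out_mat` variables as above:)

```cpp
// Sketch: keep the tape alive across grad() calls, zero adjoints
// between sweeps, and recover memory only once at the very end.
for (int i = 0; i < n_params; i++) {
  if (i > 0)
    stan::math::set_zero_all_adjoints();  // clear adjoints left by the previous sweep
  std::vector<stan::math::var> x_i;
  x_i.push_back(theta_var_std[i]);
  std::vector<double> g_i;
  beta_grad_vec(i).grad(x_i, g_i);        // d(gradient_i) / d(theta_i)
  out_mat(i, 2) = g_i[0];
}
stan::math::recover_memory();             // free the tape once, after all sweeps
```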
Thanks @andrjohns @spinkney, I found the page on Jacobians via the link @andrjohns sent me (here), which is more relevant since I've got the gradient function manually coded. I'll try to follow it and see if I can get it to work!
By the way - would it save any computation if I (somehow) told AD to work out only the diagonal of the Jacobian rather than the entire matrix? When manually coding Jacobians (which I'll do soon; I just wanted to test things with AD first), computing only the diagonal can often save a fair bit of computation - does it make any difference with AD?