Derivative of vector-valued function (i.e., Jacobian) in Stan math C++ library using reverse autodiff?


I have a C++ function (using Rcpp) which calculates a vector (specifically the gradient vector of a log-posterior function) and wish to compute the Hessian matrix. Before doing this manually I want to do it with autodiff.

To compute the gradients using Stan math autodiff, the code (at the end of the function) is:

  ///// compute gradients using Stan math library
  std::vector<stan::math::var> theta_grad;
  for (int i = 0; i < n_params; i++) {
  // Note: theta_var_std is the parameter vector - a std::vector of stan::math::var's
  std::vector<double> gradient_std_vec; // store gradient here
  log_prob_out.grad(theta_grad, gradient_std_vec);

Could somebody please tell me how I can extend the code above but for a vector-valued (rather than scalar-valued) function, and to output a Jacobian matrix (in my case a Hessian) matrix rather than a vector output?

Note that, eventhough i’m calculating a Hessian, it is still only first-order autodiff needed as I have the vector-valued function which computes the gradients manually coded too. So I assume the code I need shouldnt be too different than the above? I’m just not sure how to extend it for a matrix (Jacobian) output.


@charlesm93 @Bob_Carpenter @rok_cesnovar

Anyone have any ideas?

Also - I forgot to say that currently (as a first step) i’m only interested in calculating the * diagonal * of the Jacobian - not sure if this makes things any easier (or less computationally intense) from an autodiff standpoint?

OK so I have since tried to get the following to get the first-order derivatives (only diagonal of Hessian needed, which as I mentioned is still using first-order reverse-mode autodiff as I have the gradient function hard-coded):

          for (int i = 0; i < n_params; i++) {

            std::vector<stan::math::var>  gradient_coeffs_var_temp;

            std::vector<double>  gradient_coeffs_std_temp;
            beta_grad_vec(i).grad(gradient_coeffs_var_temp, gradient_coeffs_std_temp);
            out_mat(i, 2)  =   gradient_coeffs_std_temp[0]; // store output here


However - only the first one (i.e. i = 0) is correct - the rest are way off. It seems like “grad” doesnt like to be called more than once / recursively - as if I start with i = 1 then only that one is correct, etc - is there a way to call “grad” recursvely like this for a vector function?

@syclik @Bob_Carpenter @WardBrian @wds15 @roualdes

@stevebronder knows the most about AD in Stan. It’s probably something to do with freeing the AD tape.

1 Like

If you’re already using external c++, then you can just use the existing Math library functions:

1 Like

Thanks @andrjohns @spinkney, I found the page on Jacobians with the link @andrjohns sent me (here) which is more relvant since i’ve got the gradient function manually coded. Will try and follow this and see if I can get it to work!

By the way - would it save any computation if I (somehow) only tell AD to work out the diagonal of the Jacobian as opposed to the entire matrix? if manually coding Jacobians (which I will do soon, just wanted to test things using AD first) then only computing the diagonal (often) can save a fair bit of computation - does it make any difference with AD?

you are missing to reset the adjoints to zero between grad calls. set_adjoints_zero

maybe have a look into the coupled_ode_functor file (filename is “approximate” here)

and also do not call the recover memory function inside the loop!

1 Like