I have a function for which I have computed the first-order derivatives manually. I am looking at using autodiff to compute the second and third order derivatives using autodiff. As a first step, I am trying to work out how to do the second-order ones using autodiff, using the outputs from the first-order derivatives which were computed manually.
The code snippet below shows this for one column (the last column) of the Hessian matrix:
/////////////// second-order deriv's of log-density of likelihood (using autodiff)
/// differentiating gradient of last log diff w.r.t all log-diff's
std::vector<stan::math::var> theta_second_order;
for (int i = 0; i < n_log_diffs ; ++i) {
theta_second_order.push_back(logdiffs_var_vec[i]); // diff w.r.t all log-diffs
}
// gradient of deriv of last log-diff (last column of Hessian)
std::vector<double> second_order_deriv_of_last_log_diff;
stan::math::set_zero_all_adjoints();
deriv(K-2,0).grad(theta_second_order, second_order_deriv_of_last_log_diff); // differentiating deriv(K-2,0) (i.e. differentiating the deriv of last log-diff w.r.t all log-diffs)
stan::math::recover_memory();
Eigen::Matrix<double, -1, -1> hessian_vec_of_last_log_diff = mat_out_test(n_params,1);
for (int i = 0 ; i < n_log_diffs ; ++i) {
hessian_vec_of_last_log_diff(i,0) = second_order_deriv_of_last_log_diff[i];
}
// out
for (int k = 0; k < K-1 ; ++k) {
out_mat(k,6) = hessian_vec_of_last_log_diff(k,0);
}
This works. However, when I also try and compute the second-to-last column of the Hessian, it doesn’t work - it gives the same answers as the last column (which is incorrect):
/////////////// second-order deriv's of log-density of likelihood (using autodiff)
/// differentiating gradient of last log diff w.r.t all log-diff's
std::vector<stan::math::var> theta_second_order;
for (int i = 0; i < n_log_diffs ; ++i) {
theta_second_order.push_back(logdiffs_var_vec[i]); // diff w.r.t all log-diffs
}
// gradient of deriv of last log-diff (last column of Hessian)
std::vector<double> second_order_deriv_of_last_log_diff;
stan::math::set_zero_all_adjoints();
deriv(K-2,0).grad(theta_second_order, second_order_deriv_of_last_log_diff); // differentiating deriv(K-2,0) (i.e. differentiating the deriv of last log-diff w.r.t all log-diffs)
stan::math::recover_memory();
Eigen::Matrix<double, -1, -1> hessian_vec_of_last_log_diff = mat_out_test(n_params,1);
for (int i = 0 ; i < n_log_diffs ; ++i) {
hessian_vec_of_last_log_diff(i,0) = second_order_deriv_of_last_log_diff[i];
}
// out
for (int k = 0; k < K-1 ; ++k) {
out_mat(k,6) = hessian_vec_of_last_log_diff(k,0);
}
// gradient of deriv of second-to-last log-diff (second-to-last column of Hessian)
std::vector<double> second_order_deriv_of_second_to_last_log_diff;
stan::math::set_zero_all_adjoints();
deriv(K-3,0).grad(theta_second_order, second_order_deriv_of_second_to_last_log_diff); // differentiating deriv(K-2,0) (i.e. differentiating the deriv of last log-diff w.r.t all log-diffs)
stan::math::recover_memory();
Eigen::Matrix<double, -1, -1> hessian_vec_of_second_to_last_log_diff = mat_out_test(n_params,1);
for (int i = 0 ; i < n_log_diffs ; ++i) {
hessian_vec_of_second_to_last_log_diff(i,0) = second_order_deriv_of_second_to_last_log_diff[i];
}
// out
for (int k = 0; k < K-1 ; ++k) {
out_mat(k,7) = hessian_vec_of_second_to_last_log_diff(k,0);
}
However - if I comment out the code which computes the last column, then it gives the correct output for the second-to-last column
So it seems that the code which computes the first column is interfering with the code which computes the second - what’s the correct way to do this?
Also, once I have computed all of the second-order derivatives, will the third-order ones need to be computed using forward-mode autodiff rather than reverse-mode?