Our forward-mode autodiff is very well tested at the C++ math library level, mostly through our non-matrix functions. So I’m still not sure what this gains us other than having it be someone else’s library rather than ours. We’re still going to want to test all of our functions.

It depends on the function and the order. With something like `fvar<var>`, you want to make sure to call the `var` version of `log_sum_exp` internally. Then you get the usual efficiency (in both time and space) from implementing analytic gradients. For example,

$$\frac{\partial}{\partial x} \log(\exp(x) + \exp(y)) = \frac{\exp(x)}{\exp(x) + \exp(y)}.$$
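The partial in $y$ has the same form with the same denominator, which is what makes sharing evaluations between the two partials pay off:

$$\frac{\partial}{\partial y} \log(\exp(x) + \exp(y)) = \frac{\exp(y)}{\exp(x) + \exp(y)}.$$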

Writing a custom gradient function for $\frac{\exp(x)}{\exp(x) + \exp(y)}$ would save a lot of evals, especially if we share evals for $\frac{\partial}{\partial x}$ and $\frac{\partial}{\partial y}$.

Every time we save memory, we also save time in propagating gradients.