And here’s a follow-up post with a more interesting adj_jac_apply performance example.
In this case I compared a prim version of simplex_constrain (https://github.com/stan-dev/math/blob/develop/stan/math/prim/mat/fun/simplex_constrain.hpp#L28) to one implemented with adj_jac_apply (https://github.com/stan-dev/math/blob/develop/stan/math/rev/mat/fun/simplex_constrain.hpp#L15).
The benchmark code is here: https://gist.github.com/bbbales2/a1689764f0fda6df561e858026f4e8d9#file-adj_jac_apply_simplex_benchmark_test-cpp
This transform has a triangular Jacobian, but the adjoint-Jacobian product you need for reverse mode can be computed with a simple O(K) recursion, so again you’re better off not building the Jacobian or even multiplying by it implicitly.
It looks like a healthy 4x speedup, which is nice. By visual inspection both of those lines are straight (though it’s hard to tell for the teal one).
The good news is that, again, the regular autodiff is scaling really nicely. The bad news is that this means we’re gonna need to be pretty careful with any custom derivatives stuff to not end up slower or scaling worse :P.