Adding gradients - operands and partials vs. fwd/rev

andrjohns · November 23, 2018, 1:15am

When adding gradients for a distribution/function, is there any preference between the use of operands_and_partials or separate definitions for fwd and rev?

It seems like operands_and_partials would lead to less code overall, but is there a performance cost?

bbbales2 · November 23, 2018, 5:51pm

You don’t even have to mess with autodiff types if you use adj_jac_apply; see the thread “Adj_jac_apply – Yaaaay!”

Another example: Adj_jac_apply

It only works for reverse mode though.

andrjohns · November 25, 2018, 9:24am

I’ve been having a look at that, nice bit of coding!

If we start using adj_jac_apply to add gradients, how is it going to be affected when RHMC and forward mode are needed? Will there need to be a separate definition of the gradients for use with forward mode?

bbbales2 · November 26, 2018, 12:30am

This formulation is specifically for efficient reverse mode autodiff. In its current state, it won’t help forward/mixed mode autodiff :(.
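To show why this is reverse-mode only, here is a rough self-contained sketch of the idea. It is not the Stan Math interface: `SoftmaxOp` and its `forward` member are made up for illustration, and `multiply_adjoint_jacobian` is just a descriptive name here. The forward pass uses plain doubles and saves what it needs. The reverse pass takes the output adjoints and returns the adjoint-Jacobian product directly, without ever building the Jacobian.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical illustration only; this is not the Stan Math API.
struct SoftmaxOp {
  std::vector<double> y_;  // softmax output saved during the forward pass

  // Forward pass: y = softmax(x), computed with plain doubles.
  std::vector<double> forward(const std::vector<double>& x) {
    double max_x = x[0];
    for (double v : x) {
      if (v > max_x) max_x = v;
    }
    y_.resize(x.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
      y_[i] = std::exp(x[i] - max_x);  // shift by max for numerical stability
      sum += y_[i];
    }
    for (double& v : y_) v /= sum;
    return y_;
  }

  // Reverse pass: returns adj^T * J, where J_ij = y_i * (delta_ij - y_j).
  // This simplifies to y_j * (adj_j - dot(adj, y)), so J is never formed.
  std::vector<double> multiply_adjoint_jacobian(
      const std::vector<double>& adj) const {
    double dot = 0.0;
    for (std::size_t i = 0; i < adj.size(); ++i) dot += adj[i] * y_[i];
    std::vector<double> x_adj(adj.size());
    for (std::size_t j = 0; j < adj.size(); ++j) {
      x_adj[j] = y_[j] * (adj[j] - dot);
    }
    return x_adj;
  }
};
```

Nothing in this struct knows how to push a tangent forward (a Jacobian-times-vector product), which is what forward mode would need. That is the sense in which the formulation does not help forward or mixed mode.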

Bob_Carpenter · November 29, 2018, 2:50am

Yes, there’s a bit of a performance cost for operands_and_partials. The important thing is to get things working first, then we can optimize later.

But I’ll second @bbbales2’s comment about adjoint-Jacobian-apply. That’ll make the reverse mode efficient.

In most cases, we just use the templated definition in prim for forward-mode. It’s not the most efficient approach, but it works. When we actually start using forward mode in practice, we’ll want to start improving forward mode efficiency. We haven’t spent much time there at all.
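As a rough illustration of what “just use the templated definition” means, here is a self-contained sketch. The `Dual` type is a hypothetical stand-in for a forward-mode scalar (a value plus a tangent), and `inv_logit_sketch` is a made-up name; neither is Stan Math code. The function is written once against a template parameter, and forward-mode derivatives fall out of the overloaded arithmetic.

```cpp
#include <cmath>

// Hypothetical forward-mode scalar: a value and its tangent (derivative).
struct Dual {
  double val;
  double tan;
};

// Only the overloads needed by the function below.
inline Dual operator-(const Dual& a) { return {-a.val, -a.tan}; }
inline Dual operator+(double c, const Dual& a) { return {c + a.val, a.tan}; }
inline Dual operator/(double c, const Dual& a) {
  // d/da (c / a) = -c / a^2
  return {c / a.val, -c * a.tan / (a.val * a.val)};
}
inline Dual exp(const Dual& a) {
  double e = std::exp(a.val);
  return {e, e * a.tan};
}

// One templated definition: works for double and for Dual.
template <typename T>
T inv_logit_sketch(const T& x) {
  using std::exp;  // std::exp for double; exp(Dual) is found by ADL
  return 1.0 / (1.0 + exp(-x));
}
```

Calling `inv_logit_sketch(Dual{0.0, 1.0})` propagates a tangent through every intermediate operation. That works without any extra gradient code, but each step carries derivative bookkeeping, which is why it is not the most efficient approach compared with a hand-written forward-mode specialization.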
