When adding gradients for a distribution/function, is there any preference between the use of operands_and_partials or separate definitions for fwd and rev? It seems like operands_and_partials would lead to less code overall, but is there a performance cost?

I’ve been having a look at that, nice bit of coding! If we start using adj_jac_apply to add gradients, how will that be affected when RHMC and forward mode are needed? Will there need to be a separate definition of the gradients for use with forward mode?

andrjohns: "Will there need to be a separate definition of the gradients for use with forward mode?"

This formulation is specifically for efficient reverse-mode autodiff. In its current state, it won’t help forward/mixed-mode autodiff :(.

andrjohns: "Is there any preference between the use of operands_and_partials or separate definitions for fwd and rev?"

Yes, there’s a bit of a performance cost for operands_and_partials. The important thing is to get things working first; we can optimize later. But I’ll second @…
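To make the "less code overall" point concrete, here is a rough, self-contained sketch of the general idea: write the value and the partial once on plain doubles, then let a small helper assemble the result for whichever scalar type was passed in. This is a toy and not Stan Math's operands_and_partials. `Dual`, `value_of`, `build`, and `softplus` are all names made up for illustration, and only double and a toy forward-mode type are covered.

```cpp
#include <cmath>

// Toy forward-mode scalar (hypothetical; not Stan's fvar).
struct Dual {
  double val;  // function value
  double tan;  // tangent carried alongside the value
};

// Strip any supported scalar down to its double value.
inline double value_of(double x) { return x; }
inline double value_of(const Dual& x) { return x.val; }

// Assemble the result from the value and the partial df/dx.
// For a plain double there is nothing to propagate.
inline double build(double /*x*/, double val, double /*dx*/) { return val; }
// For the toy forward-mode type, apply the chain rule to the tangent.
inline Dual build(const Dual& x, double val, double dx) {
  return Dual{val, dx * x.tan};
}

// One definition of the function and its partial, shared by every scalar type.
template <typename T>
T softplus(const T& x) {
  const double xv = value_of(x);
  const double val = std::log1p(std::exp(xv));     // f(x) = log(1 + e^x)
  const double dx = 1.0 / (1.0 + std::exp(-xv));   // f'(x) = 1 / (1 + e^-x)
  return build(x, val, dx);
}
```

Calling `softplus(Dual{0.0, 1.0})` seeds the tangent with 1, so the returned `tan` is the derivative at 0. The alternative discussed in this thread would be two hand-written overloads, one per mode, each repeating the derivative.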

Adding gradients - operands and partials vs. fwd/rev

Developers

bbbales2 November 23, 2018, 5:51pm 2

Don’t even have to mess with autodiff types with adj_jac_apply: Adj_jac_apply – Yaaaay!

Another example: Adj_jac_apply

It only works for reverse mode though.
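For anyone who hasn't looked at it yet, the pattern is roughly: a forward pass that only touches doubles, plus a reverse pass that multiplies the output adjoints by the Jacobian without ever forming it. Below is a minimal toy sketch of that pattern, using softmax as the example. It is not the Stan Math interface: the `SoftmaxOp` struct and its method signatures are assumptions for illustration, and there is no autodiff tape here.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical functor in the spirit of adj_jac_apply (not the real interface).
struct SoftmaxOp {
  std::vector<double> y_;  // forward-pass output, saved for the reverse pass

  // Forward pass: plain doubles only, no autodiff types involved.
  std::vector<double> forward(const std::vector<double>& x) {
    double max_x = x[0];
    for (double xi : x) max_x = std::max(max_x, xi);  // for numerical stability
    double sum = 0.0;
    y_.resize(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
      y_[i] = std::exp(x[i] - max_x);
      sum += y_[i];
    }
    for (double& yi : y_) yi /= sum;
    return y_;
  }

  // Reverse pass: returns adj^T * J, where J_ij = y_i * (delta_ij - y_j),
  // computed in O(N) without building the N x N Jacobian.
  std::vector<double> multiply_adjoint_jacobian(
      const std::vector<double>& adj) const {
    double dot = 0.0;
    for (std::size_t i = 0; i < y_.size(); ++i) dot += adj[i] * y_[i];
    std::vector<double> x_adj(y_.size());
    for (std::size_t i = 0; i < y_.size(); ++i) {
      x_adj[i] = y_[i] * (adj[i] - dot);
    }
    return x_adj;
  }
};
```

Because the reverse pass only ever sees output adjoints, nothing in this structure tells a forward-mode sweep how to push tangents through, which is why a separate definition would be needed there.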
