Algebraic Solver Differentiation Speedup

Hi all,

I’m a first-year Ph.D. student in Sharad Goel’s lab at Stanford. @charlesm93 has generously been explaining an idea of his for speeding up differentiation in the algebraic solver (i.e., the topic of this thread). Over the next few weeks, I plan to implement this alternative method of propagating derivatives through implicit solutions to algebraic equations.

The basic idea is to use the implicit function theorem to get an “almost adjoint” method. Charles has a nice writeup here. At a high level, given a target scalar j(u, v), with u \in \mathbb{R}^p, v \in \mathbb{R}^n, and v depending implicitly on u via f : \mathbb{R}^{p + n} \to \mathbb{R}^n through the requirement that f(u, v) = 0, we can calculate

\frac {dj} {du} = \frac {\partial j} {\partial u} + \frac {\partial j} {\partial v} \frac {dv} {du}

and further simplify, using the implicit function theorem,

\frac {dv} {du} = - \left[ \frac {\partial f} {\partial v} \right]^{-1} \frac {\partial f} {\partial u}.
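To make the implicit function theorem step concrete, here is a small numerical sketch on a toy system of my own (not from Charles’s writeup), chosen with n = p = 2 and a closed-form solution so dv/du can be checked analytically:

```python
import numpy as np

# Hypothetical toy system: f(u, v) = 0 with
#   f1 = v0^3 - u0
#   f2 = v0 + v1 - u1
# whose solution is v0 = u0^(1/3), v1 = u1 - u0^(1/3).
u = np.array([8.0, 3.0])
v = np.array([u[0] ** (1 / 3), u[1] - u[0] ** (1 / 3)])  # approx. [2, 1]

# Partial Jacobians of f evaluated at the solution.
df_dv = np.array([[3.0 * v[0] ** 2, 0.0],
                  [1.0,             1.0]])
df_du = np.array([[-1.0,  0.0],
                  [ 0.0, -1.0]])

# Implicit function theorem: dv/du = -[df/dv]^{-1} df/du.
dv_du = -np.linalg.solve(df_dv, df_du)

# Analytic check from v0 = u0^(1/3), v1 = u1 - u0^(1/3):
#   dv0/du0 = (1/3) u0^{-2/3} = 1/12,  dv0/du1 = 0,
#   dv1/du0 = -1/12,                   dv1/du1 = 1.
expected = np.array([[ 1.0 / 12.0, 0.0],
                     [-1.0 / 12.0, 1.0]])
assert np.allclose(dv_du, expected)
```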

The key insight is that if we can calculate (with a matrix solve)

\eta = - \left[ \frac {\partial f} {\partial v} \right]^{-T} \left[ \frac {\partial j} {\partial v} \right]^T,

then only one additional reverse-mode sweep is required to calculate

\frac {\partial j} {\partial v} \frac {dv} {du} = \eta^T \frac {\partial f} {\partial u}.
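Here is a self-contained numerical sketch of the whole pipeline on a toy problem of my own (f and j below are hypothetical examples, not from the writeup): \eta comes from a single transposed linear solve, and the total gradient then needs only the vector–Jacobian product \eta^T \, \partial f / \partial u, which is exactly what one reverse-mode sweep provides:

```python
import numpy as np

# Hypothetical toy problem: f(u, v) = [v0^3 - u0, v0 + v1 - u1] = 0,
# with closed-form solution v0 = u0^(1/3), v1 = u1 - u0^(1/3),
# and scalar objective j(u, v) = u0 * v0 + v1^2.
u = np.array([8.0, 3.0])
v = np.array([u[0] ** (1 / 3), u[1] - u[0] ** (1 / 3)])  # approx. [2, 1]

dj_du = np.array([v[0], 0.0])          # partial j / partial u
dj_dv = np.array([u[0], 2.0 * v[1]])   # partial j / partial v
df_dv = np.array([[3.0 * v[0] ** 2, 0.0],
                  [1.0,             1.0]])
df_du = -np.eye(2)                     # partial f / partial u

# One linear solve: eta = -[df/dv]^{-T} [dj/dv]^T.
eta = -np.linalg.solve(df_dv.T, dj_dv)

# One vector-Jacobian product (in Stan, one reverse-mode sweep):
grad = dj_du + eta @ df_du

# Check against the gradient of j(u) = u0^{4/3} + (u1 - u0^{1/3})^2,
# which at u = (8, 3) is exactly (5/2, 2).
assert np.allclose(grad, [2.5, 2.0])
```

Note that dv/du is never formed explicitly; the n-by-p Jacobian is bypassed entirely, which is where the savings come from.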

The hard part is calculating the Jacobian \left[ \frac {\partial f} {\partial v} \right]. This can either be done with n forward-mode sweeps, or possibly obtained for free from the solver itself (a Newton solver, for instance, already evaluates this Jacobian during its iterations).
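For intuition on the “n forward-mode sweeps” option, here is a sketch (on a toy f of my own) of building \left[ \partial f / \partial v \right] one column at a time, seeding one direction per sweep. A complex-step derivative stands in for a forward-mode dual-number sweep, which has the same one-directional structure:

```python
import numpy as np

# Toy system (hypothetical example): f(u, v) = [v0^3 - u0, v0 + v1 - u1].
def f(u, v):
    return np.array([v[0] ** 3 - u[0], v[0] + v[1] - u[1]])

def jacobian_fv(u, v, h=1e-20):
    """Build [df/dv] column by column: n sweeps, one per component of v.

    Each sweep seeds a single direction e_k, mimicking one forward-mode
    pass; the complex step recovers the directional derivative.
    """
    n = len(v)
    J = np.zeros((n, n))
    for k in range(n):
        vk = v.astype(complex)
        vk[k] += 1j * h               # seed direction e_k
        J[:, k] = f(u, vk).imag / h   # Im f(v + i h e_k) / h -> J e_k
    return J

u = np.array([8.0, 3.0])
v = np.array([2.0, 1.0])              # solution of f(u, v) = 0
J = jacobian_fv(u, v)
assert np.allclose(J, [[12.0, 0.0], [1.0, 1.0]])
```

If the solver (e.g., Newton) already factors this Jacobian at the solution, even these n sweeps could be skipped.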

Looking forward to working with all of you! Hopefully this will result in some speedup.


@jgaeb I think everyone on the Stan dev team agrees this is a feature we should try out. The next step is to post an issue on

and then work out the details. I propose we start with the Newton solver, although both solvers should admit the same differentiation.