Algebraic Solver Differentiation Speedup

Hi all,

I’m a first-year Ph.D. student in Sharad Goel’s lab at Stanford. @charlesm93 has generously been explaining an idea of his for speeding up differentiation in the algebraic solver (i.e., the topic of this thread). Over the next few weeks, I plan to implement this alternative method of propagating derivatives through implicit solutions to algebraic equations.

The basic idea is to use the implicit function theorem to get an “almost adjoint” method. Charles has a nice writeup here. At a high level, given a target scalar j(u, v), with u \in \mathbb{R}^p, v \in \mathbb{R}^n, and v depending implicitly on u via f : \mathbb{R}^{p + n} \to \mathbb{R}^n through the requirement that f(u, v) = 0, we can calculate

\frac {dj} {du} = \frac {\partial j} {\partial u} + \frac {\partial j} {\partial v} \frac {dv} {du}

and further simplify, using the implicit function theorem,

\frac {dv} {du} = - \left[ \frac {\partial f} {\partial v} \right]^{-1} \frac {\partial f} {\partial u}.
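To make the implicit function theorem step concrete, here is a small numerical sketch on a toy system of my own (not from Charles’s writeup), chosen with n = p = 2 and a closed-form solution so dv/du can be checked analytically:

```python
import numpy as np

# Hypothetical toy system: f(u, v) = 0 with
#   f1 = v0^3 - u0
#   f2 = v0 + v1 - u1
# whose solution is v0 = u0^(1/3), v1 = u1 - u0^(1/3).
u = np.array([8.0, 3.0])
v = np.array([u[0] ** (1 / 3), u[1] - u[0] ** (1 / 3)])  # approx. [2, 1]

# Partial Jacobians of f evaluated at the solution.
df_dv = np.array([[3.0 * v[0] ** 2, 0.0],
                  [1.0,             1.0]])
df_du = np.array([[-1.0,  0.0],
                  [ 0.0, -1.0]])

# Implicit function theorem: dv/du = -[df/dv]^{-1} df/du.
dv_du = -np.linalg.solve(df_dv, df_du)

# Analytic check from v0 = u0^(1/3), v1 = u1 - u0^(1/3):
#   dv0/du0 = (1/3) u0^{-2/3} = 1/12,  dv0/du1 = 0,
#   dv1/du0 = -1/12,                   dv1/du1 = 1.
expected = np.array([[ 1.0 / 12.0, 0.0],
                     [-1.0 / 12.0, 1.0]])
assert np.allclose(dv_du, expected)
```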

The key insight is that if we can calculate (with a matrix solve)

\eta = - \left[ \frac {\partial f} {\partial v} \right]^{-T} \left[ \frac {\partial j} {\partial v} \right]^T,

then only one additional reverse-mode sweep is required to calculate

\frac {\partial j} {\partial v} \frac {dv} {du} = \eta^T \frac {\partial f} {\partial u}.
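Here is a self-contained numerical sketch of the whole pipeline on a toy problem of my own (f and j below are hypothetical examples, not from the writeup): \eta comes from a single transposed linear solve, and the total gradient then needs only the vector–Jacobian product \eta^T \, \partial f / \partial u, which is exactly what one reverse-mode sweep provides:

```python
import numpy as np

# Hypothetical toy problem: f(u, v) = [v0^3 - u0, v0 + v1 - u1] = 0,
# with closed-form solution v0 = u0^(1/3), v1 = u1 - u0^(1/3),
# and scalar objective j(u, v) = u0 * v0 + v1^2.
u = np.array([8.0, 3.0])
v = np.array([u[0] ** (1 / 3), u[1] - u[0] ** (1 / 3)])  # approx. [2, 1]

dj_du = np.array([v[0], 0.0])          # partial j / partial u
dj_dv = np.array([u[0], 2.0 * v[1]])   # partial j / partial v
df_dv = np.array([[3.0 * v[0] ** 2, 0.0],
                  [1.0,             1.0]])
df_du = -np.eye(2)                     # partial f / partial u

# One linear solve: eta = -[df/dv]^{-T} [dj/dv]^T.
eta = -np.linalg.solve(df_dv.T, dj_dv)

# One vector-Jacobian product (in Stan, one reverse-mode sweep):
grad = dj_du + eta @ df_du

# Check against the gradient of j(u) = u0^{4/3} + (u1 - u0^{1/3})^2,
# which at u = (8, 3) is exactly (5/2, 2).
assert np.allclose(grad, [2.5, 2.0])
```

Note that dv/du is never formed explicitly; the n-by-p Jacobian is bypassed entirely, which is where the savings come from.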

The hard part is calculating the Jacobian \left[ \frac {\partial f} {\partial v} \right]. This can either be done with n forward-mode sweeps, or possibly obtained for free from the solver itself (a Newton solver, for instance, already evaluates this Jacobian during its iterations).
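For intuition on the “n forward-mode sweeps” option, here is a sketch (on a toy f of my own) of building \left[ \partial f / \partial v \right] one column at a time, seeding one direction per sweep. A complex-step derivative stands in for a forward-mode dual-number sweep, which has the same one-directional structure:

```python
import numpy as np

# Toy system (hypothetical example): f(u, v) = [v0^3 - u0, v0 + v1 - u1].
def f(u, v):
    return np.array([v[0] ** 3 - u[0], v[0] + v[1] - u[1]])

def jacobian_fv(u, v, h=1e-20):
    """Build [df/dv] column by column: n sweeps, one per component of v.

    Each sweep seeds a single direction e_k, mimicking one forward-mode
    pass; the complex step recovers the directional derivative.
    """
    n = len(v)
    J = np.zeros((n, n))
    for k in range(n):
        vk = v.astype(complex)
        vk[k] += 1j * h               # seed direction e_k
        J[:, k] = f(u, vk).imag / h   # Im f(v + i h e_k) / h -> J e_k
    return J

u = np.array([8.0, 3.0])
v = np.array([2.0, 1.0])              # solution of f(u, v) = 0
J = jacobian_fv(u, v)
assert np.allclose(J, [[12.0, 0.0], [1.0, 1.0]])
```

If the solver (e.g., Newton) already factors this Jacobian at the solution, even these n sweeps could be skipped.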

Looking forward to working with all of you! Hopefully this will result in some speedup.


@jgaeb I think everyone on the Stan dev team agrees this is a feature we should try out. The next step is to post an issue on

and then work out the details. I propose we start with the Newton solver, although both solvers should admit the same differentiation.