First of all, what makes you think they’re more efficient for the kinds of computations we do in Stan? For example, Stan is much faster than Greta, even though Greta uses TensorFlow for autodiff.
Second, I’m not sure what you mean by “dedicated framework for deep learning”. Are you talking about something like Keras, which specifies deep neural networks, or something more flexible like Edward (general static autodiff via TensorFlow), or something even more flexible like Pyro (general dynamic autodiff via PyTorch)?
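To make the static/dynamic distinction concrete, here’s a toy PyTorch sketch (purely illustrative, not from anyone’s benchmark): because the graph is rebuilt on every call, ordinary Python control flow participates in the gradient, whereas a static-graph system in the old TensorFlow/Edward style would need dedicated graph ops for the same loop.

```python
import torch

# Dynamic autodiff: the graph is traced fresh on each call, so plain
# Python loops and data-dependent branches just work.
x = torch.tensor(2.0, requires_grad=True)
y = x
for _ in range(3):       # runtime-dependent structure, no graph ops needed
    if y < 10:
        y = y * x
y.backward()
print(x.grad)            # gradient of whatever graph was actually built
```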
It’s almost always easier to optimize for a specific application. If all you need is backpropagation for logistic regression, the derivatives are simple arithmetic, and it’d be silly to use autodiff to compute them.
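For instance, here’s a rough sketch in plain NumPy (names and data made up for illustration) of the analytic gradient of the logistic regression negative log likelihood; it’s a single matrix expression, so running it through a general autodiff engine buys you nothing:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_nll_grad(beta, X, y):
    """Gradient of the logistic regression negative log likelihood.

    The derivative works out to simple arithmetic:
    grad = X^T (p - y), where p = sigmoid(X @ beta).
    """
    p = sigmoid(X @ beta)
    return X.T @ (p - y)

# Tiny usage example on simulated data
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
beta_true = np.array([1.0, -2.0, 0.5])
y = (rng.uniform(size=100) < sigmoid(X @ beta_true)).astype(float)
grad = logistic_nll_grad(np.zeros(3), X, y)
```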
P.S. If you were worried about Edward’s reported speed advantage over Stan, that was all for a very simple logistic regression involving an easily parallelizable matrix multiply. As soon as Stan’s GPU and MPI code lands, we’ll be competitive and probably faster than Edward, because we’ve optimized so much of our underlying code for statistics. Of course, they’re not standing still, and they have all of our work to build on. So I expect this to be fun going forward. I think most of the people building these systems are on good terms with one another.