@betanalpha I wanted to show you the errors in the gradient I mentioned at the meeting. Look at the first plot here:
That’s log10 of absolute error of Stan gradient (as currently on
develop unless somebody else jumped in there to fix it) as a function of the two parameters (a on y-axis, z on x-axis) versus gradients from Mathematica. The later plots are actually dated as the better implementations don’t have the failure region near small
This looks pretty good until both arguments are approaching 20, which covers most applications (especially since the gamma can be rescaled). Not that it can’t be made better, just that this doesn’t look all that bad. Although relative error would be a better quantification of the practical implications.
I should’ve posted this link instead: https://github.com/stan-dev/math/files/955767/d_gamma_p_da.pdf
I’m not sure how this error is ok, either absolute or relative given we’re testing functions to 1e-8 or so for most functions and the absolute error is around .01 or .001 well before parameter values. Also I have models that fail without the improved calculation. If we can’t figure it out here I’ll just put a topic on for the meeting so we don’t have to keep going back and forth.