I’ll give it a shot. I just took the easy way out because I was wrestling with so many type errors on the way to getting everything to compile.
The errors are relative unless one of the values is very small, at which point the test kicks over to absolute error. The tolerance is set around 1e-8 for gradients, but by the time you get to Hessians with finite differences (and they’re based on three finite diffs, not just one in a band, which is partly why they’re so slow), things are much less stable arithmetically. The tolerance there is 1e-3 and often has to be loosened to 5e-3 or 1e-2. Longer term, I think the right thing to do is to compute Hessians as finite differences of gradients instead, since the gradients are being independently tested. Same for third-order derivatives, but those are so often zero that it’s not nearly as much of a problem for testing as Hessians.
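To make the two ideas concrete, here is a rough Python sketch, not the actual test framework code. The names `rel_or_abs_error` and `hessian_from_gradient`, the `small` cutoff, the averaged denominator, and the step size `eps` are all assumptions picked for illustration. It shows an error measure that is relative unless a value is tiny, and a Hessian built from central finite differences of an already-tested gradient.

```python
import math


def rel_or_abs_error(x, y, small=1e-8):
    """Relative error, switching to absolute error when either value is tiny.

    The cutoff `small` and the averaged denominator are illustrative choices.
    """
    if abs(x) < small or abs(y) < small:
        return abs(x - y)
    return abs(x - y) / (0.5 * (abs(x) + abs(y)))


def hessian_from_gradient(grad, x, eps=1e-5):
    """Approximate the Hessian by central finite differences of `grad`.

    `grad` maps a list of floats to the gradient as a list of floats.
    Column j is (grad(x + eps * e_j) - grad(x - eps * e_j)) / (2 * eps).
    """
    n = len(x)
    hess = [[0.0] * n for _ in range(n)]
    for j in range(n):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        g_plus = grad(x_plus)
        g_minus = grad(x_minus)
        for i in range(n):
            hess[i][j] = (g_plus[i] - g_minus[i]) / (2.0 * eps)
    # Average the off-diagonal pairs so the result is exactly symmetric.
    for i in range(n):
        for j in range(i + 1, n):
            avg = 0.5 * (hess[i][j] + hess[j][i])
            hess[i][j] = avg
            hess[j][i] = avg
    return hess


# Toy function f(x) = x0^2 * x1 + sin(x1), with hand-written derivatives.
def grad_f(x):
    return [2.0 * x[0] * x[1], x[0] ** 2 + math.cos(x[1])]


def exact_hessian_f(x):
    return [[2.0 * x[1], 2.0 * x[0]], [2.0 * x[0], -math.sin(x[1])]]


point = [1.5, 0.5]
approx = hessian_from_gradient(grad_f, point)
exact = exact_hessian_f(point)
max_err = max(
    rel_or_abs_error(approx[i][j], exact[i][j])
    for i in range(2)
    for j in range(2)
)
print("max Hessian error:", max_err)
```

Each Hessian entry here costs one difference of gradients rather than nested differences of function values, which is why I’d expect it to tolerate a tighter tolerance, though that would need to be checked on the real functions.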
We could also use more of the algebraic tests if people are concerned about precision. I just figured if Hessians were right up to 1e-2 they were probably actually right. That’s obviously not a watertight line of reasoning.
I don’t like that they can fail randomly and I don’t like that the coverage is random. I’d rather have someone think out the edge cases and test those.
I’m one of those people. They’re unruly in that they can fail randomly, especially with finite diffs. Isn’t it also hard to actually hit edge cases like highly ill-conditioned matrices randomly?
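As a sketch of what I mean by thinking out the edge cases, here is one way to construct ill-conditioned inputs deterministically instead of hoping a random draw lands on one. The helper `spd_with_condition`, the rotation angle, and the list of condition numbers are made up for illustration and aren’t from any existing test suite.

```python
import math


def spd_with_condition(cond, theta=0.3):
    """Build a 2x2 symmetric positive-definite matrix with condition number `cond`.

    The matrix is Q * diag(1, 1/cond) * Q^T, where Q is a rotation by `theta`,
    so its eigenvalues are 1 and 1/cond by construction.
    """
    c = math.cos(theta)
    s = math.sin(theta)
    lam_big = 1.0
    lam_small = 1.0 / cond
    a = lam_big * c * c + lam_small * s * s
    b = (lam_big - lam_small) * c * s
    d = lam_big * s * s + lam_small * c * c
    return [[a, b], [b, d]]


# Hand-picked cases: well conditioned, moderately bad, and close to singular.
edge_case_conditions = [1.0, 1e6, 1e12]
edge_cases = [spd_with_condition(k) for k in edge_case_conditions]

for k, m in zip(edge_case_conditions, edge_cases):
    print("condition number", k, "matrix", m)
```

Each of these can then be fed to the function under test on every run, so the coverage is the same each time and a failure is reproducible.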