Inspired by @Bob_Carpenter I started using expect_near_rel
instead of various manual solutions to test “approximate equivalence” in my tests for Stan math. However, I have recently become somewhat suspicious of its behavior, and I want to discuss whether its implementation should change (especially if we want to encourage its use across the codebase).
The relevant part of the code is:
template <typename T1, typename T2>
void expect_near_rel_finite(const std::string& msg, const T1& x1, const T2& x2,
                            double tol = 1e-8) {
  using stan::math::fabs;
  // if either arg near zero, must use absolute tolerance as rel tol -> 2
  if (fabs(x1) < tol || fabs(x2) < tol) {
    EXPECT_NEAR(x1, x2, tol);
    return;
  }
  auto avg = 0.5 * (fabs(x1) + fabs(x2));
  auto relative_diff = (x1 - x2) / avg;
  EXPECT_NEAR(0, relative_diff, tol);
}
This, however, behaves strangely for values close to the tolerance (1e-8 by default), where the required precision changes abruptly. For example, I now have a failing test with:
actual: 1.2019658157669255e-008,
expected: 1.2019540397111151e-008,
relative tolerance = 1e-008; relative diff = 9.7973780478180374e-006
It is OK for this to fail, but if I instead had actual = 0.9e-008, the test would suddenly pass, even though the difference from the expected value is actually larger.
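To make the discontinuity concrete, here is a small standalone sketch. near_rel_passes is just a hypothetical helper of mine that mirrors the branching above (it is not the actual test code), with the numbers from the failing test plugged in:

#include <cmath>
#include <iostream>

// Hypothetical helper mirroring the branching in expect_near_rel_finite.
bool near_rel_passes(double x1, double x2, double tol = 1e-8) {
  // absolute branch: either value has magnitude below tol
  if (std::fabs(x1) < tol || std::fabs(x2) < tol)
    return std::fabs(x1 - x2) <= tol;
  // relative branch: difference scaled by the average magnitude
  double avg = 0.5 * (std::fabs(x1) + std::fabs(x2));
  return std::fabs((x1 - x2) / avg) <= tol;
}

int main() {
  double expected = 1.2019540397111151e-8;
  // both values exceed tol -> relative branch, relative diff ~1e-5 -> fails
  std::cout << near_rel_passes(1.2019658157669255e-8, expected) << "\n";  // prints 0
  // 0.9e-8 is below tol -> absolute branch, |diff| ~3e-9 <= 1e-8 -> passes,
  // even though this value is further from the expected one
  std::cout << near_rel_passes(0.9e-8, expected) << "\n";  // prints 1
}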
Some discussion on the topic is in Relative delta in finite difference approximations of derivatives?, where @anon75146577 recommends using max(epsilon, epsilon * |x|) as the tolerance. That doesn't have this discontinuity problem, but it might be unnecessarily lax when we care about small numbers. I've previously used something like max(1e-14, 1e-8 * |x|).
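For comparison, here is a minimal sketch of the max-style check. The name near_rel_passes_alt, the defaults rel_tol = 1e-8 and abs_tol = 1e-14 (mirroring the values above), and the choice to scale by the larger of the two magnitudes are all my assumptions, not existing Stan math code:

#include <algorithm>
#include <cmath>

// Hypothetical alternative: tolerance = max(abs_tol, rel_tol * max(|x1|, |x2|)),
// so the tolerance varies smoothly with the magnitude of the arguments.
bool near_rel_passes_alt(double x1, double x2, double rel_tol = 1e-8,
                         double abs_tol = 1e-14) {
  double scale = std::max(std::fabs(x1), std::fabs(x2));
  return std::fabs(x1 - x2) <= std::max(abs_tol, rel_tol * scale);
}

With this version both of the actual values above would fail, because for numbers of magnitude ~1e-8 the tolerance stays at abs_tol = 1e-14, so there is no sudden jump at the threshold.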
So I am asking whether there is some good reason expect_near_rel is implemented as it is, and whether it might be sensible to implement it differently. expect_near_rel is currently used in 21 tests, so it should not be a major change. It is also possible that expect_near_rel is tailored to the autodiff case and shouldn't be used elsewhere…
Thanks for any input!