Is this an ok solution to make this test pass?

I have a long-suffering PR that seems to be failing just because our testing framework has problems with NaN. Here’s a quick fix to the issue that will still fail the test if the expected gradient is NaN but the calculated gradient is not NaN. Is this an acceptable test for making the PR go through? This is mostly a question for @syclik and @Bob_Carpenter, who’ve reviewed the PR so far, but if somebody else sees a better place to quickly patch the test framework, I’d take pointers.

for (size_t i = 0; i < expected_gradients.size(); i++) {
-      EXPECT_FLOAT_EQ(expected_gradients[i], gradients[i])
-          << "Comparison of expected gradient to calculated gradient failed";
+      if (!stan::math::is_nan(expected_gradients[i])) {
+        EXPECT_FLOAT_EQ(expected_gradients[i], gradients[i])
+            << "Comparison of expected gradient to calculated gradient failed";
+      } else {
+        EXPECT_TRUE(!(expected_gradients[i] == expected_gradients[i])
+                    && !(gradients[i] == gradients[i]))
+            << "Expected gradient is NaN but gradient is not NaN.";
+      }
     }
   }

Would this do the same thing (edit: in the else clause)?

EXPECT_TRUE(stan::math::is_nan(gradients[i]))
    << "Expected gradient is NaN but gradient is not NaN.";

because if it is NaN, then life is good (the expected and actual are the same thing)? And we know expected_gradients[i] is NaN if we get there? I don’t know the actual test framework well enough to be sure.
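To be concrete, here’s a minimal sketch of what the whole loop would look like with that simplified else branch, wrapped in a stand-alone test (the test name and the hard-coded vectors are hypothetical stand-ins for the fixture’s expected_gradients and gradients):

#include <stan/math/mix/scal.hpp>
#include <gtest/gtest.h>
#include <limits>
#include <vector>

// Sketch only: hypothetical vectors standing in for the fixture's members.
TEST(ExampleOnly, nanAwareGradientComparison) {
  double nan = std::numeric_limits<double>::quiet_NaN();
  std::vector<double> expected_gradients = {1.5, nan};
  std::vector<double> gradients = {1.5, nan};

  for (size_t i = 0; i < expected_gradients.size(); i++) {
    if (!stan::math::is_nan(expected_gradients[i])) {
      EXPECT_FLOAT_EQ(expected_gradients[i], gradients[i])
          << "Comparison of expected gradient to calculated gradient failed";
    } else {
      // expected is NaN, so the calculated gradient must be NaN too
      EXPECT_TRUE(stan::math::is_nan(gradients[i]))
          << "Expected gradient is NaN but gradient is not NaN.";
    }
  }
}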

The PR in question is this one: https://github.com/stan-dev/math/pull/533

I don’t think we need to change the framework to make this work. I say that because the framework also has explicit tests for NaN (both from the same file).

But… if I understand correctly, the value is not NaN, but the gradients are?

Do you know where this fails (without your patch)? It looked like it was some distribution test, but I couldn’t find exactly which one.

Yeah, that would be cleaner; I think what I wrote is missing a case.

With this framework I’m not sure how to check!

It’s the inverse chi-square lccdf, which uses grad_reg_inc_gamma and gamma_p, and therefore their gradients, which I probably touched to get tests to pass.

This “one”:

 ./runTests.py -j4 test/prob/inv_chi_square/inv_chi_square_ccdf_log_00000_generated_ffv_test

The problem I’ve run into is that I made these changes because Bob’s testing framework requires parallel behavior w.r.t. NaN for functions and their gradients (our math functions don’t always throw, so the gradients can’t throw either). For example, in grad_reg_inc_gamma the checks are:

+  if (is_nan(a))
+    return std::numeric_limits<T1>::quiet_NaN();
+  if (is_nan(z))
+    return std::numeric_limits<T2>::quiet_NaN();
+  if (is_nan(g))
+    return std::numeric_limits<T1>::quiet_NaN();
+  if (is_nan(dig))
+    return std::numeric_limits<T1>::quiet_NaN();

For example, our gamma_p uses Boost’s gamma_p without modifying the error policy, so it returns NaN on a <= 0 or z < 0 … but it could all just throw if we used a consistent policy.
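To illustrate the “consistent policy” idea, here’s a minimal sketch of calling Boost’s gamma_p with an explicit throw-on-error domain policy; the policy typedef and the example arguments are assumptions for illustration, not what stan-math currently does:

#include <boost/math/special_functions/gamma.hpp>
#include <boost/math/policies/policy.hpp>
#include <iostream>
#include <stdexcept>

// Hypothetical policy that turns domain errors into exceptions instead of
// letting the call return a quiet NaN; illustration only.
typedef boost::math::policies::policy<
    boost::math::policies::domain_error<boost::math::policies::throw_on_error> >
    throw_domain_policy;

int main() {
  try {
    // a <= 0 is outside the domain of the regularized incomplete gamma P(a, z)
    double p = boost::math::gamma_p(-1.0, 2.0, throw_domain_policy());
    std::cout << "gamma_p = " << p << std::endl;
  } catch (const std::domain_error& e) {
    std::cout << "domain error: " << e.what() << std::endl;
  }
  return 0;
}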

Thanks for giving me that example. I’m trying to figure out how it triggers those NaNs. Here’s what I’ve done so far. I put this into a test called test/inv_chi_square_test.cpp and am running it with ./runTests.py test/inv_chi_square_test.cpp. I don’t see it failing (yet), so I need to dig into how the failing distribution test is instantiating it.

#include <stan/math/mix/scal.hpp>
#include <gtest/gtest.h>
#include <cmath>
#include <vector>


double ccdf_log = std::log(1.0 - 0.317528038297796704186230);
double d_dy =  -0.0728606;
double d_dnu = -0.83451658;
double d_dy2 = 0.021002101;
double d_dnu2 = -0.25922951;


TEST(inv_chi_square_ccdf, prim) {
  double y = 3.0;
  double nu = 0.5;
  double result = stan::math::inv_chi_square_ccdf_log(y, nu);
  EXPECT_FLOAT_EQ(ccdf_log, result);
}

TEST(inv_chi_square_ccdf, rev1) {
  stan::math::var y = 3.0;
  double nu = 0.5;
  stan::math::var result = stan::math::inv_chi_square_ccdf_log(y, nu);

  std::vector<stan::math::var> vars;
  std::vector<double> grad;
  vars.push_back(y);
  result.grad(vars, grad);


  EXPECT_FLOAT_EQ(ccdf_log, result.val());
  EXPECT_FLOAT_EQ(d_dy, grad[0]);

  stan::math::recover_memory();
}

TEST(inv_chi_square_ccdf, rev2) {
  double y = 3.0;
  stan::math::var nu = 0.5;
  stan::math::var result = stan::math::inv_chi_square_ccdf_log(y, nu);

  std::vector<stan::math::var> vars;
  std::vector<double> grad;
  vars.push_back(nu);
  result.grad(vars, grad);

  EXPECT_FLOAT_EQ(ccdf_log, result.val());
  EXPECT_FLOAT_EQ(d_dnu, grad[0]);

  stan::math::recover_memory();
}

TEST(inv_chi_square_ccdf, rev3) {
  stan::math::var y = 3.0;
  stan::math::var nu = 0.5;
  stan::math::var result = stan::math::inv_chi_square_ccdf_log(y, nu);

  std::vector<stan::math::var> vars;
  std::vector<double> grad;
  vars.push_back(y);
  vars.push_back(nu);
  result.grad(vars, grad);

  EXPECT_FLOAT_EQ(ccdf_log, result.val());
  EXPECT_FLOAT_EQ(d_dy, grad[0]);
  EXPECT_FLOAT_EQ(d_dnu, grad[1]);

  stan::math::recover_memory();
}


TEST(inv_chi_square_ccdf, fwd1) {
  stan::math::fvar<double> y(3.0, 1.0);
  double nu = 0.5;
  stan::math::fvar<double> result = stan::math::inv_chi_square_ccdf_log(y, nu);

  EXPECT_FLOAT_EQ(ccdf_log, result.val_);
  EXPECT_FLOAT_EQ(d_dy, result.d_);
}

TEST(inv_chi_square_ccdf, fwd2) {
  double y = 3.0;
  stan::math::fvar<double> nu(0.5, 1.0);
  stan::math::fvar<double> result = stan::math::inv_chi_square_ccdf_log(y, nu);

  EXPECT_FLOAT_EQ(ccdf_log, result.val_);
  EXPECT_FLOAT_EQ(d_dnu, result.d_);
}


TEST(inv_chi_square_ccdf, mix_1stDeriv) {
  stan::math::fvar<stan::math::var> y(3.0, 1.0);
  double nu = 0.5;
  stan::math::fvar<stan::math::var> result = stan::math::inv_chi_square_ccdf_log(y, nu);

  EXPECT_FLOAT_EQ(ccdf_log, result.val_.val());
  EXPECT_FLOAT_EQ(d_dy, result.d_.val());

  std::vector<stan::math::var> vars;
  std::vector<double> grad;
  vars.push_back(y.val_);
  result.val_.grad(vars, grad);

  EXPECT_FLOAT_EQ(d_dy, grad[0]);
  stan::math::recover_memory();
  // for (auto g : grad) {
  //   std::cout << "gradient = " << g << std::endl;
  // }
}

TEST(inv_chi_square_ccdf, mix_2ndDeriv1) {
  stan::math::fvar<stan::math::var> y(3.0, 1.0);
  double nu = 0.5;
  stan::math::fvar<stan::math::var> result = stan::math::inv_chi_square_ccdf_log(y, nu);

  EXPECT_FLOAT_EQ(ccdf_log, result.val_.val());
  EXPECT_FLOAT_EQ(d_dy, result.d_.val());

  std::vector<stan::math::var> vars;
  std::vector<double> grad;
  vars.push_back(y.val_);
  result.d_.grad(vars, grad);

  EXPECT_FLOAT_EQ(d_dy2, grad[0]);
  stan::math::recover_memory();
}

TEST(inv_chi_square_ccdf, mix_2ndDeriv2) {
  double y = 3.0;
  stan::math::fvar<stan::math::var> nu(0.5, 1.0);
  stan::math::fvar<stan::math::var> result = stan::math::inv_chi_square_ccdf_log(y, nu);

  EXPECT_FLOAT_EQ(ccdf_log, result.val_.val());
  EXPECT_FLOAT_EQ(d_dnu, result.d_.val());

  std::vector<stan::math::var> vars;
  std::vector<double> grad;
  vars.push_back(nu.val_);
  result.d_.grad(vars, grad);

  EXPECT_FLOAT_EQ(d_dnu2, grad[0]);
  stan::math::recover_memory();
  // for (auto g : grad) {
  //   std::cout << "gradient = " << g << std::endl;
  // }
}

TEST(inv_chi_square_ccdf, mix_3rdDeriv_1) {
  stan::math::fvar<stan::math::fvar<stan::math::var>> y;
  y.val_.val_ = 3.0;
  y.val_.d_ = 1.0;
  y.d_.val_ = 1.0;
  double nu = 0.5;
  stan::math::fvar<stan::math::fvar<stan::math::var>> result = stan::math::inv_chi_square_ccdf_log(y, nu);

  EXPECT_FLOAT_EQ(ccdf_log, result.val_.val_.val());
  EXPECT_FLOAT_EQ(d_dy, result.d_.val_.val());

  std::vector<stan::math::var> vars;
  std::vector<double> grad;
  vars.push_back(y.val_.val_);
  result.val_.val_.grad(vars, grad);

  // EXPECT_FLOAT_EQ(d_dnu2, grad[0]);
  stan::math::recover_memory();
  for (auto g : grad) {
    std::cout << "gradient = " << g << std::endl;
  }
}

TEST(inv_chi_square_ccdf, mix_3rdDeriv_2) {
  stan::math::fvar<stan::math::fvar<stan::math::var>> y;
  y.val_.val_ = 3.0;
  y.val_.d_ = 1.0;
  y.d_.val_ = 1.0;
  double nu = 0.5;
  stan::math::fvar<stan::math::fvar<stan::math::var>> result = stan::math::inv_chi_square_ccdf_log(y, nu);

  EXPECT_FLOAT_EQ(ccdf_log, result.val_.val_.val());
  EXPECT_FLOAT_EQ(d_dy, result.d_.val_.val());

  std::vector<stan::math::var> vars;
  std::vector<double> grad;
  vars.push_back(y.val_.val_);
  result.d_.val_.grad(vars, grad);

  // EXPECT_FLOAT_EQ(d_dnu2, grad[0]);
  stan::math::recover_memory();
  for (auto g : grad) {
    std::cout << "gradient = " << g << std::endl;
  }
}

TEST(inv_chi_square_ccdf, mix_3rdDeriv_3) {
  stan::math::fvar<stan::math::fvar<stan::math::var>> y;
  y.val_.val_ = 3.0;
  y.val_.d_ = 1.0;
  y.d_.val_ = 1.0;
  double nu = 0.5;
  stan::math::fvar<stan::math::fvar<stan::math::var>> result = stan::math::inv_chi_square_ccdf_log(y, nu);

  EXPECT_FLOAT_EQ(ccdf_log, result.val_.val_.val());
  EXPECT_FLOAT_EQ(d_dy, result.d_.val_.val());

  std::vector<stan::math::var> vars;
  std::vector<double> grad;
  vars.push_back(y.val_.val_);
  result.d_.d_.grad(vars, grad);

  // EXPECT_FLOAT_EQ(d_dnu2, grad[0]);
  stan::math::recover_memory();
  for (auto g : grad) {
    std::cout << "gradient = " << g << std::endl;
  }
}

@sakrejda, please hold off on trying to find this. I think there are memory issues in the tests themselves. I don’t see how this has worked so far.

I’m instrumenting the test fixture and finding things like it using the wrong underlying vi_ to take the derivative with respect to. Some debug output:

var to take deriv of = 3:0 -> 0x7ffee2e8e4f0 
...
x[0].vi_ = 0x7f974f703fa0

I expected those to match, but clearly they don’t.


This seems like important work! Which tests are these?

This one:

./runTests.py -j4 test/prob/inv_chi_square/inv_chi_square_ccdf_log_00000_generated_ffv_test

Whoops, I meant: in which category of tests does @syclik suspect the memory error exists? All distribution tests? Just the one you pasted? Or all the ones using some other function, etc.?

I think it’s in the mixed distribution tests.

Inspecting the code, it looks like we construct a set of stan::math::vars to be used in two different function calls, but after the first we call stan::math::recover_memory(), which invalidates all of the stan::math::vars for the second call. I think that’s legitimately a problem with our tests; it probably doesn’t show up because we don’t manipulate the stack between calls to .grad(...), but it’s still an error.
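If I’m reading the generated tests right, the problematic pattern is roughly this (a sketch of the pattern being described, not the actual fixture code):

#include <stan/math/mix/scal.hpp>
#include <vector>

// Sketch of the pattern described above: the same vars are reused for a
// second .grad() call after recover_memory() has freed the autodiff stack
// they live on.
void sketch_of_the_problem() {
  stan::math::var y = 3.0;
  std::vector<stan::math::var> vars;
  vars.push_back(y);

  stan::math::var f1 = y * y;
  std::vector<double> grad1;
  f1.grad(vars, grad1);            // first call: fine

  stan::math::recover_memory();    // frees the stack that y.vi_ lives on

  stan::math::var f2 = y * y * y;  // y is no longer valid here
  std::vector<double> grad2;
  f2.grad(vars, grad2);            // second call: reads invalidated memory
}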

That said, that’s not the problem with this particular pull request. There’s something really funky going on: changing this particular derivative is the only change that triggers this NaN issue. Before moving forward, I’d really want to know why it’s a problem in the tests (and why it only happens with this custom derivative!), or what’s different about this particular code that triggers it.

That’s a reasonable request. I probably need to break up this PR, start with the new tests against develop, and do this incrementally.

I’m not expecting you to do this! I’ll try to hunt it down.

Of course, I’m not going to stop you! But I think your code looks right. We just know that it triggers something weird and it’s deeper than what it looks like on the surface.

Thanks for dealing with this, @syclik. Let me know if something needs to change in the test framework that I can help with. I’m trying to think about building it out further for the rest of our functions, since testing them is a serious bottleneck.

You can’t stop me! Blah, that previous message was crap, forgot parens :/


So this NaN shows up when stan::length(nu) == 1 and is_constant_struct<T_dof>::value == 1, only in:

AgradCcdfLogInvChiSquare_ffv_9/AgradCcdfLogTestFixture/0.Function

So the type is: <double, empty, empty, empty, empty>

Doesn’t happen in ffv_10, where the type is: <std::vector, empty, empty, empty, empty>

So I’m going to bet there’s a meta-program not handling doubles right somewhere.

… but everything I can find treats doubles fine, so it’s still a mystery for now.
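For what it’s worth, a quick probe of those meta-programs on plain doubles (a sketch, assuming the same headers as the test file above; it just confirms the scalar cases behave as documented):

#include <stan/math/mix/scal.hpp>

// A plain double should be "constant" (not autodiffed) and have length 1;
// a var should not be constant.
static_assert(stan::is_constant_struct<double>::value,
              "double should be a constant struct");
static_assert(!stan::is_constant_struct<stan::math::var>::value,
              "var should not be a constant struct");

int main() {
  double nu = 0.5;
  return stan::length(nu) == 1 ? 0 : 1;  // length of a scalar is 1
}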