Is this an ok solution to make this test pass?

#21

I went through my code to see if there was a way to break things up, but I can’t add tests first: the current code fails under Bob’s auto-diff test framework, and if I introduce any of my code the old tests fail (I suspect the reference values in the old tests are not accurate). Is there anything you could do to narrow down the memory issue you pointed out? This one:

I could take a look at it if you give me a place to start, but there’s not enough info in your message to figure out what you found.


#22

Sorry about that – I didn’t actually commit any of it yet. This is the first thing I’m doing on Stan this week.

Btw, mind recreating a PR for that branch?


#23

I’ve just updated the branch:

  • reverted the revert commit (using git revert)
  • reverted the change to the test (as a separate commit)

#24

For anyone following along, this builds and runs just the failing test, which gets you results much faster:

make test/prob/inv_chi_square/inv_chi_square_ccdf_log_00000_generated_ffv_test
./test/prob/inv_chi_square/inv_chi_square_ccdf_log_00000_generated_ffv_test --gtest_filter=AgradCcdfLogInvChiSquare_ffv_9/AgradCcdfLogTestFixture/0.Function

#25

@sakrejda, sorry, but there are legitimate issues in the gamma_p implementation. I’m coming up with a minimal example. Stay tuned.


#26

Np, if there’s a real problem to look at I’m sure I can fix it.


#27

Just pop this into a test somewhere:

#include <stan/math/mix/mat.hpp>
#include <gtest/gtest.h>

TEST(test, works) {
  using stan::math::fvar;
  using stan::math::var;
  using std::vector;

  fvar<fvar<var>> y = 0.17;
  fvar<fvar<var>> fy = gamma_p(0.25, y);

  vector<double> grad;
  vector<var> x;
  x.push_back(y.val_.val_);
  fy.val_.val_.grad(x, grad);
  stan::math::recover_memory();

  ASSERT_EQ(1, grad.size());
  EXPECT_FALSE(stan::math::is_nan(grad[0]))
      << "  - grad[0] = " << grad[0];
  EXPECT_FLOAT_EQ(0.878926, grad[0]);
}

TEST(test, fails) {
  using stan::math::fvar;
  using stan::math::var;
  using std::vector;

  fvar<fvar<var>> y = 0.17;
  fvar<fvar<var>> fy = 1.0 - gamma_p(0.25, y); // this fails

  vector<double> grad;
  vector<var> x;
  x.push_back(y.val_.val_);
  fy.val_.val_.grad(x, grad);
  stan::math::recover_memory();

  ASSERT_EQ(1, grad.size());
  EXPECT_FALSE(stan::math::is_nan(grad[0]))
      << "  - grad[0] = " << grad[0];
  //EXPECT_FLOAT_EQ(, grad[0]);
}

#28

I haven’t figured out exactly what’s going on to create that NaN, but it’s clearly not right.


#29

Ooh, I have some guesses but no computer at the moment. If you don’t see an obvious solution let me take a look before you sink time into it.


#30

I don’t see anything obvious. If it were obvious, I’d have fixed it! =)

Damn, I’m glad we have these tests.


#31

I’m guessing it’s either the more limited type promotion or the boundaries between implementations but I’ll check.


#32

Found it! It wasn’t subtle :(


#33

Turns out adj_ can be negative… which… yeah… makes sense… so you shouldn’t log it… all the direct-input parameters were positive so I got lulled into a false sense of security about logs…


#34

Wow. How’d you wind up taking the log of a negative? I’d have thought that wouldn’t come up with an ordinary function’s gradients.


#35

Why? Are you saying adj_ should never be negative?


#36

No. I’d have thought that if you were calculating gradients using the chain rule that you wouldn’t wind up taking the log of a negative number. Are you writing the actual derivative code yourself?


#37

Yeah, this is in rev. There was something like adj_ * exp(a) / tgamma(b), and I replaced it with exp(log(adj_) + a - lgamma(b)) to get a few more digits of accuracy and make tests pass, but it should’ve just been adj_ * exp(a - lgamma(b)), since log(adj_) is NaN whenever adj_ is negative. Having chased down the rest of the bugs and introduced automatic testing, I’d have to go back and double-check that the rewrite is even necessary.

IIRC, I preferred lgamma over tgamma in this code because tgamma sometimes overflows to inf when lgamma doesn’t. It’s often just the denominator that has tgamma, so it’s better to let the final expression underflow to zero than to poison the whole calculation with an inf.


#38

I’d think that’d be much more stable. tgamma has a pretty limited domain.