Running tests with sanitizers

seantalts · February 9, 2018, 4:07am

We talked a lot in today’s meeting about, essentially, the need for more automation and test coverage. I started looking into a few things we can do to reduce the number of bugs. Pretty high on my list is clang-tidy (which I am extremely interested in using to modernize our source code, maybe even automating that, but only after the MPI and GPU stuff lands). The second thing is running our unit tests with Clang’s sanitizers. There are several, and I started with just the Address sanitizer and got this error message:

This seems maybe bad? @Bob_Carpenter any thoughts? It’s during a test that is supposed to be recovering memory, but it seems to be reading at an inappropriate offset into an allocated block…
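
For anyone who wants to try the same kind of run, here is a minimal sketch of how a build can check whether AddressSanitizer instrumentation is actually switched on. The compile line in the comment is only an example invocation, not the project's real build setup, and `built_with_asan` and `sanitizer_banner` are hypothetical helper names, not anything in Stan.

```cpp
// Example build line (an assumption, adjust to your own setup):
//   clang++ -std=c++11 -g -fno-omit-frame-pointer -fsanitize=address file.cpp
#include <string>

// Clang exposes __has_feature(address_sanitizer);
// GCC defines __SANITIZE_ADDRESS__ when -fsanitize=address is used.
#if defined(__has_feature)
#  if __has_feature(address_sanitizer)
#    define BUILT_WITH_ASAN 1
#  endif
#endif
#if !defined(BUILT_WITH_ASAN) && defined(__SANITIZE_ADDRESS__)
#  define BUILT_WITH_ASAN 1
#endif
#ifndef BUILT_WITH_ASAN
#  define BUILT_WITH_ASAN 0
#endif

// Hypothetical helper: true if this translation unit was compiled with ASan.
inline bool built_with_asan() { return BUILT_WITH_ASAN != 0; }

// Hypothetical helper: a one-line banner a test runner could print at startup.
inline std::string sanitizer_banner() {
  return built_with_asan() ? "AddressSanitizer: on" : "AddressSanitizer: off";
}
```

Printing `sanitizer_banner()` at the start of a test binary is a cheap way to confirm that the sanitizer flags really reached the compiler.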

yizhang · February 9, 2018, 4:56am

I noticed something similar the other day while debugging with valgrind memcheck, but didn’t have time to follow up.

bgoodri · February 9, 2018, 5:31am

In the past, whenever we ran things under sanitizers there were bugs, so we stopped running them.

seantalts · February 9, 2018, 2:51pm

That doesn’t seem great…

Bob_Carpenter · February 10, 2018, 8:00pm

You mean false positives in the sanitizers?

One problem we have is that the autodiff stack intentionally “leaks” memory, in the sense that nothing is freed until there’s an explicit call to clean it up.

Do things like this show up outside of that ODE solver? If not, then I’d be very worried about allocation inside the ODE solver.
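
The intentional “leak” described above can be sketched with a toy arena. This is not Stan’s actual autodiff stack; `ToyArena`, `alloc`, `live_blocks`, and `recover_all` are hypothetical names used only to illustrate memory that stays allocated until someone explicitly asks for it back.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Toy arena, NOT Stan's real implementation: every block handed out by
// alloc() stays alive until recover_all() is called explicitly.
class ToyArena {
 public:
  // Returns a pointer to n zero-initialized doubles owned by the arena.
  double* alloc(std::size_t n) {
    assert(n > 0);
    blocks_.emplace_back(new double[n]());
    return blocks_.back().get();
  }

  // Number of blocks currently held, i.e. not yet recovered.
  std::size_t live_blocks() const { return blocks_.size(); }

  // Explicit cleanup: frees every block at once.
  void recover_all() { blocks_.clear(); }

 private:
  std::vector<std::unique_ptr<double[]>> blocks_;
};
```

Between an `alloc` call and the matching `recover_all`, the memory is still held on purpose, so a report that only says “allocated and not freed” needs to be read with that design in mind. A read at a bad offset into one of these blocks is a different kind of problem and would not be explained by this pattern.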

Bob_Carpenter · February 10, 2018, 8:01pm

I already don’t recognize our source code! Seriously, though, I’d be all for modernizing it as long as we stay within the bounds of our supported compilers.

bgoodri · February 10, 2018, 8:24pm

https://clang.llvm.org/docs/SanitizerSpecialCaseList.html

seantalts · February 12, 2018, 2:32pm

Out of all of test/unit/math/rev, the only place the sanitizer finds a problem is in this StanAgradRevOde.memory_recovery_dv test. Full log attached.
sanitized.txt (319.0 KB)

seantalts · February 12, 2018, 3:45pm

I ran the full unit tests with the sanitizer and there are a couple of other errors, but they all happen after teardown, which makes me suspect they are more likely to be spurious than the ODE one, which happens during a test run. Full log: sanitized.txt (1.5 MB)

Bob_Carpenter · February 13, 2018, 9:31pm

Want to sit down together and try to track down the leak in the ODE solver?

Bob_Carpenter · February 13, 2018, 9:32pm

I just realized this code’s being refactored—maybe @wds15’s refactor fixes it.

wds15 · February 14, 2018, 9:41am

Which test exactly is showing problems?

The refactor is not yet in stan-math… someone needs to review it.

If it’s quick for @seantalts to do, a rerun of the sanitizer against the refactored branch would be helpful.

However, it could easily be that our tests leak memory while the code itself is OK (which still means we should fix the tests, of course).
