We talked a lot in the meeting today about, essentially, the need for further automation and test coverage. I started looking into a few things we can do to reduce the number of bugs. Pretty high on my list are clang-tidy (which I am extremely interested in using to modernize our source code, maybe even automating that, but after MPI and GPU stuff lands). The second thing is running our unit tests with Clang’s sanitizers. There are two, and I started running them with just the Address sanitizer first and got this error message:
This seems maybe bad? @Bob_Carpenter any thoughts? It’s during a test that is supposed to be recovering memory, but it seems to be reading at an inappropriate offset into an allocated block…
One problem we have is that the autodiff stacks intentionally “leaks” memory in some way in that it’s never collected until there’s an explicit call to clean it up.
Do things like this show up outside of that ODE solver? If not, then I’d be very worried about allocation inside the ODE solver.
I already don’t recognize our source code! Seriously, though, I’d be all for modernizing it as long as we stay within the bounds of our supported compilers.
Out of all of test/unit/math/rev, the only place the sanitizer finds a problem is in this StanAgradRevOde.memory_recovery_dv test. Full log attached. sanitized.txt (319.0 KB)
I ran the full unit tests with the sanitizer and there are a couple of other errors but they all happen after tear down, which makes me suspect there’s a higher chance of them being spurious than the ODE one, which happens during a test run. full log: sanitized.txt (1.5 MB)