Cholesky decompose test failing?


#1

Has anyone seen this before? In AgradRevMatrix.mat_cholesky_1st_deriv_large_gradients:
C++ exception with description "cholesky_decompose: Matrix m is not positive definite" thrown in the test body.

From http://d1m1s1b1.stat.columbia.edu:8080/job/Math%20Pull%20Request%20-%20Tests%20-%20Unit/814/testReport/junit/(root)/AgradRevMatrix/mat_cholesky_1st_deriv_large_gradients/

And this from the console output:

[ RUN      ] AgradRevMatrix.mat_cholesky_1st_deriv_large_gradients
unknown file: Failure
C++ exception with description "cholesky_decompose: Matrix m is not positive definite" thrown in the test body.
[  FAILED  ] AgradRevMatrix.mat_cholesky_1st_deriv_large_gradients (39440 ms)

Apparently this test normally takes ~80 seconds to run so the time is not that weird…

It’s in a pull request where I changed only things relating to the distribution tests, so I find this regression pretty weird. It seems like it could be a spurious error, which I’d like to eliminate if anyone has any ideas.


#2

Just wanted to tell you that I don’t know what this could be off the top of my head. If it keeps popping up sporadically, I’d check for memory issues.

Does it pass locally?


#3

Passes locally and when run again on Jenkins :/


#4

I haven’t seen this before, but I’m fairly certain this is a test I wrote. How often have you seen the error?


#5

Just once that I’m certain of.


#6

And you haven’t been able to reproduce the error?


#7

Not yet.


#8

I’ve seen similar sporadic failures. I’d always assumed they were due to edge cases in the numerics that were getting hit by random testing.
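To make the "edge cases in the numerics" idea concrete, here is a minimal sketch of how a Cholesky factorization can reject a matrix that is positive definite on paper. It is a textbook algorithm in plain C++, not Stan Math's `cholesky_decompose`, and the names `cholesky_lower` and `is_positive_definite` are made up for illustration. The assumption is that a randomly generated input could land close enough to singular for a pivot to round to zero.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Textbook Cholesky-Banachiewicz factorization (hypothetical helper, not the
// Stan Math implementation). Returns lower-triangular L with A = L * L^T.
// Throws if any pivot is not strictly positive.
Matrix cholesky_lower(const Matrix& a) {
  const std::size_t n = a.size();
  Matrix l(n, std::vector<double>(n, 0.0));
  for (std::size_t i = 0; i < n; ++i) {
    assert(a[i].size() == n);  // input must be square
    for (std::size_t j = 0; j <= i; ++j) {
      double sum = a[i][j];
      for (std::size_t k = 0; k < j; ++k) {
        sum -= l[i][k] * l[j][k];
      }
      if (i == j) {
        // Rounding can push a tiny positive pivot to zero (or below).
        // The negated comparison also rejects NaN.
        if (!(sum > 0.0)) {
          throw std::domain_error("matrix is not positive definite");
        }
        l[i][i] = std::sqrt(sum);
      } else {
        l[i][j] = sum / l[j][j];
      }
    }
  }
  return l;
}

// Convenience wrapper: true if the factorization succeeds.
bool is_positive_definite(const Matrix& a) {
  try {
    cholesky_lower(a);
    return true;
  } catch (const std::domain_error&) {
    return false;
  }
}
```

For example, `{{1, 1}, {1, 1 + 1e-20}}` is positive definite in exact arithmetic, but `1 + 1e-20` rounds to `1` in double precision, so the second pivot comes out as exactly zero and the check fails. If the test inputs really are random, logging the seed on failure would make a case like this reproducible.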


#9

Are those tests random? If they aren’t, I would assume unsafe memory usage.

If this is happening only on certain configurations, we saw something similar with Eigen and its initialization. If I remember correctly, on Mac the default behavior was that Eigen set everything to 0, but on Linux it didn’t. That led to failures when elements weren’t explicitly initialized to a value. (We fixed that with an Eigen typedef, but this is just what I tend to think about when there are random failures.)
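One way to turn this kind of platform-dependent initialization bug into a deterministic failure is to fill new storage with NaN instead of relying on whatever the allocator hands back. The sketch below uses plain `std::vector` rather than Eigen, and the helpers `make_poisoned`, `fully_assigned`, and `fill_lower_only` are hypothetical names for illustration. Eigen documents a preprocessor macro, `EIGEN_INITIALIZE_MATRICES_BY_NAN`, that should have a similar effect for its own matrices, but check the Eigen docs for your version before relying on it.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper: allocate an n-by-n row-major buffer filled with NaN,
// so any element that is never written stays recognizable.
std::vector<double> make_poisoned(std::size_t n) {
  return std::vector<double>(n * n, std::numeric_limits<double>::quiet_NaN());
}

// True only if no element is NaN, i.e. every element was overwritten
// with a real value.
bool fully_assigned(const std::vector<double>& m) {
  return std::none_of(m.begin(), m.end(),
                      [](double x) { return std::isnan(x); });
}

// Example of the bug class: writes the lower triangle (including the
// diagonal) and silently leaves the strict upper triangle untouched.
void fill_lower_only(std::vector<double>& m, std::size_t n) {
  assert(m.size() == n * n);
  for (std::size_t i = 0; i < n; ++i) {
    for (std::size_t j = 0; j <= i; ++j) {
      m[i * n + j] = 1.0;
    }
  }
}
```

With zero-filled memory, the untouched upper triangle in `fill_lower_only` would look fine; with arbitrary memory, it would fail only some of the time. With NaN poisoning, `fully_assigned` reports the problem on every run and every platform.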