I just got the performance test down from 1.67s to 1.11s. Profiling showed that the hotspot was actually in
matrix_mat_vari::chain, and the fix was changing an explicit Eigen allocation to a C++11
auto. This is why profiling is awesome!
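For anyone wondering why swapping an explicit type for auto can save an allocation, here is a rough sketch of the general idea. It does not use Eigen and it is not the actual code from matrix_mat_vari::chain. Vec, SumExpr, total, g_allocations and the two sum_with_* functions are all hypothetical names made up for illustration. The assumption is that the library returns a lightweight expression object from an operation (as Eigen does for many operations), so naming the concrete type forces evaluation into a new buffer, while auto keeps the lazy expression.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical counter: how many Vec buffers have been created.
static int g_allocations = 0;

// Hypothetical lazy expression: holds references to its operands and
// computes each element on demand, without owning any storage.
template <typename L, typename R>
struct SumExpr {
  const L& lhs;
  const R& rhs;
  double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
  std::size_t size() const { return lhs.size(); }
};

// Hypothetical stand-in for a dense vector type (not Eigen).
struct Vec {
  std::vector<double> data;

  explicit Vec(std::size_t n, double v = 0.0) : data(n, v) { ++g_allocations; }

  // Converting an expression to a Vec allocates a buffer and fills it.
  template <typename L, typename R>
  Vec(const SumExpr<L, R>& e) : data(e.size()) {
    ++g_allocations;
    for (std::size_t i = 0; i < e.size(); ++i) data[i] = e[i];
  }

  double operator[](std::size_t i) const { return data[i]; }
  std::size_t size() const { return data.size(); }
};

// Adding two Vecs returns a lazy expression, not a new Vec.
inline SumExpr<Vec, Vec> operator+(const Vec& a, const Vec& b) { return {a, b}; }

// Sums the elements of anything that has size() and operator[].
template <typename E>
double total(const E& e) {
  double s = 0.0;
  for (std::size_t i = 0; i < e.size(); ++i) s += e[i];
  return s;
}

double sum_with_explicit_type(const Vec& a, const Vec& b) {
  Vec tmp = a + b;  // explicit type: evaluates into a newly allocated Vec
  return total(tmp);
}

double sum_with_auto(const Vec& a, const Vec& b) {
  auto tmp = a + b;  // auto: tmp is a SumExpr, so no buffer is allocated
  return total(tmp);
}
```

One caveat with this pattern: the auto variable only holds references to its operands, so it must not outlive them. The Eigen documentation warns about this kind of pitfall with auto, so each such change is worth checking with tests.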
I will submit a PR with this to make sure it’s not crazy. Also I haven’t had any issues - I just put this in my
CXXFLAGS += -g
Then I recompiled everything and, on a Mac, ran
instruments -t "Time Profile" ./test/performance/logistic
(9 there because this is my 9th invocation of the instruments profiler)
So excited - I think we can probably make this same kind of improvement in other places across the codebase. With more performance tests, we can find more of these hotspots.