Profiling the execution of a model

Is there any option in CmdStan to profile the execution of a model? Is there an easy way to see which parts of the AD (automatic differentiation) are the bottlenecks?

If not, does anyone have ideas on whether this is possible at all, and how?

So far the best solution I have is brute force: adding time measurements and prints to all the functions that I presume could be bottlenecks. I also measure the time of one transition (here), and if the execution times of the functions add up to the execution time of the transition, or something close to it, I know I have found all of them.
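For anyone wanting to try the same brute-force approach, here is a minimal sketch of how it could look. `ScopedTimer`, `suspect_a`, `suspect_b`, `transition`, and `coverage` are all hypothetical names made up for illustration; none of them come from Stan or CmdStan, and the two "suspect" functions are just busy loops standing in for the real code you would instrument.

```cpp
#include <cassert>
#include <chrono>
#include <iostream>
#include <map>
#include <string>
#include <utility>

// Hypothetical helper (not part of Stan or CmdStan): adds the wall-clock
// time spent in the enclosing scope to a running total kept per label.
class ScopedTimer {
  using Clock = std::chrono::steady_clock;

 public:
  explicit ScopedTimer(std::string label)
      : label_(std::move(label)), start_(Clock::now()) {}

  ~ScopedTimer() {
    std::chrono::duration<double> elapsed = Clock::now() - start_;
    totals()[label_] += elapsed.count();  // seconds
  }

  // Accumulated seconds per label, shared by all timers.
  static std::map<std::string, double>& totals() {
    static std::map<std::string, double> t;
    return t;
  }

 private:
  std::string label_;
  Clock::time_point start_;
};

// Stand-ins for the functions suspected of being bottlenecks.
double suspect_a(int n) {
  ScopedTimer t("suspect_a");
  volatile double s = 0.0;  // volatile keeps the loop from being optimized away
  for (int i = 0; i < n; ++i) s = s + i * 0.5;
  return s;
}

double suspect_b(int n) {
  ScopedTimer t("suspect_b");
  volatile double s = 1.0;
  for (int i = 1; i <= n; ++i) s = s + 1.0 / i;
  return s;
}

// Stand-in for one transition that calls the suspected functions.
double transition(int n) {
  ScopedTimer t("transition");
  return suspect_a(n) + suspect_b(n);
}

// Fraction of the transition time accounted for by the instrumented functions.
double coverage() {
  const auto& t = ScopedTimer::totals();
  assert(t.count("transition") == 1);
  const double total = t.at("transition");
  if (total <= 0.0) return 0.0;
  double parts = 0.0;
  for (const auto& kv : t)
    if (kv.first != "transition") parts += kv.second;
  return parts / total;
}

// Prints every accumulated total and the resulting coverage.
void report() {
  for (const auto& kv : ScopedTimer::totals())
    std::cout << kv.first << ": " << kv.second << " s\n";
  std::cout << "coverage: " << coverage() << "\n";
}
```

Calling `transition(2000000)` followed by `report()` prints the per-label totals. A coverage close to 1 means the instrumented functions account for nearly all of the transition time; a much smaller value means there is a bottleneck that has not been instrumented yet.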

I’ve done some stuff with google-perftools before: Adjoint sensitivities

https://github.com/gperftools/gperftools and for Ubuntu https://github.com/ahorn/benchmarks/wiki/Profiling-with-google-perftools

You gotta beware of inlining and optimization making the stack traces look misleading, but I thought it worked pretty well. You can use google-perftools with just about anything, so a Stan binary should be fine.

This thread has some generic suggestions, but I know you were seeing all of the virtual chain() calls being identified as the same call, and I'm not sure how that works. When I compiled with -g (in this post in that thread), the profiler was at least identifying which class's chain methods were being called most.

Thanks to both of you. I will report back.