I’ve done some profiling with google-perftools before: Adjoint sensitivities
The project is at https://github.com/gperftools/gperftools and there's an Ubuntu walkthrough at https://github.com/ahorn/benchmarks/wiki/Profiling-with-google-perftools
You gotta beware of inlining and optimization making the stack traces look misleading, but I thought it worked pretty well. You can use google-perftools with just about any binary though, so a Stan binary should be fine.
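
To make the inlining caveat concrete, here's a rough sketch (not anything from Stan itself). The function names `sum_of_squares` and `repeated_work` and the `PROFILE_NOINLINE` macro are made up for illustration, and the `noinline` attribute assumes GCC or Clang. The idea is that a small hot function that gets inlined tends to disappear from the stack traces, with its time showing up under the caller, so marking it noinline while profiling keeps it as its own frame.

```cpp
#include <cstddef>
#include <vector>

// Assumption: GCC or Clang. The noinline attribute is compiler-specific,
// so on other compilers this hypothetical macro expands to nothing.
#if defined(__GNUC__) || defined(__clang__)
#define PROFILE_NOINLINE __attribute__((noinline))
#else
#define PROFILE_NOINLINE
#endif

// Hypothetical hot function. Without noinline, an optimizing build may
// fold it into its caller, and the profile then charges the caller instead.
PROFILE_NOINLINE double sum_of_squares(const std::vector<double>& x) {
  double total = 0.0;
  for (double v : x) {
    total += v * v;
  }
  return total;
}

// Hypothetical driver that calls the hot function many times, so the
// sampling profiler has something to see.
PROFILE_NOINLINE double repeated_work(std::size_t n, int reps) {
  std::vector<double> x(n, 2.0);
  double acc = 0.0;
  for (int r = 0; r < reps; ++r) {
    acc += sum_of_squares(x);
  }
  return acc;
}
```

If I remember the usual workflow right, you call something like `repeated_work` from `main`, link with `-lprofiler`, run with the `CPUPROFILE` environment variable set to an output file, and then read that file with `pprof`. Keep in mind that forcing noinline changes the generated code a bit, so use it to work out where the time goes rather than for final timings.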