This would be super cool!
As a start, we should output the size of the AD tree somewhere, which as far as I recall we do not do right now. The size of the AD tree is a good measure of how expensive a Stan program is. That could be added to the cmdstan diagnose output very quickly.
A few thoughts on what I would do differently from what you proposed:
- Don’t require start / stop. Instead, just let users declare a named profiling object, and let the scope of that object determine the range of things being captured. Basically RAII-style behavior. Maybe have the object go out of scope at the end of each block, should that be possible.
- Either use the names of the objects as suggested, or use the Stan model line number as the ID of the profiling block. I would prefer automatic names, at least as a default.
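
To make the RAII idea concrete, here is a rough sketch in plain C++ of what I have in mind. Nothing here is existing Stan or Stan Math API: `ScopedProfile`, `profile_registry`, and `auto_profile_name` are names I made up for illustration, and it only records wall time and call counts, not anything about the AD tree.

```cpp
#include <chrono>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical per-name record: accumulated wall time and number of visits.
struct ProfileEntry {
  std::chrono::steady_clock::duration total{};
  std::size_t calls = 0;
};

// Hypothetical global registry keyed by profile name.
inline std::map<std::string, ProfileEntry>& profile_registry() {
  static std::map<std::string, ProfileEntry> registry;
  return registry;
}

// Hypothetical RAII guard: timing starts in the constructor and is
// recorded in the destructor, so no explicit start / stop is needed.
class ScopedProfile {
 public:
  explicit ScopedProfile(std::string name)
      : name_(std::move(name)), start_(std::chrono::steady_clock::now()) {}

  ~ScopedProfile() {
    ProfileEntry& entry = profile_registry()[name_];
    entry.total += std::chrono::steady_clock::now() - start_;
    ++entry.calls;
  }

  // A guard owns exactly one timing interval, so copying makes no sense.
  ScopedProfile(const ScopedProfile&) = delete;
  ScopedProfile& operator=(const ScopedProfile&) = delete;

 private:
  std::string name_;
  std::chrono::steady_clock::time_point start_;
};

// Hypothetical default naming scheme based on the model line number.
inline std::string auto_profile_name(int model_line) {
  return "line_" + std::to_string(model_line);
}

// Stand-in for generated model code: the guard covers one loop body
// and is destroyed at the closing brace of each iteration.
inline double example_log_density(int n) {
  double lp = 0.0;
  for (int i = 0; i < n; ++i) {
    ScopedProfile guard(auto_profile_name(42));
    lp += -0.5 * i * i;
  }
  return lp;
}
```

After calling `example_log_density(3)`, the registry holds one entry named `line_42` with `calls == 3`. A user-chosen name would simply be passed to the constructor in place of `auto_profile_name(...)`.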
I really think we need this to help users navigate their performance issues.