Why do you think they’re necessary? So there aren’t multiple function calls?

This is a line I’d really rather not cross. It completely upends the semantics of Stan as it stands now.

I’d rather wait until we have a proper tuple/struct implementation (maybe a year out).

Not really. We took a heavily safety-first approach. But I think we’re now a lot better at C++ than we used to be, so it’s wroth re-examining. In-place operations and noalias can be tricky to work together. It’s precisely that x *= y case that’s so tricky when x is part of y.

This needs to go along with auto intermediates (though those are also tricky in Eigen) to prevent eager copies of expression templates and also more generic arguments and better code generation at the C++ level for Stan models.

Well, I have written the prototype with maximal speed in mind. When now moving to a design with by value return of the Jacobian we will see how much this is of a penalty.

The noalias statement made the code noticeably faster. We should really then check the codebase for opportunities to given Eigen this hint.

Is it possible to use it somehow? Soon I should have another large ODE problem (this time with proteins) so would love to try the speedup out. I was just thinking that it could be approached similar to MPI - one has to compile cmdstan to enable MPI using MPI switch in make. Here we could use ODE switch.

I think we already agreed on another function in stan like

integrate_ode_rk45_jacobian

or however we want to call it… we could even just add another overload to stan-math like this

integrate_ode_rk45 simple with fixed default tolerances

integrate_ode_rk45 with optional tolerances + max_steps

integrate_ode_rk45 with optional tolerances + max_steps + Jacobians

How about that?

@linas : To use my code… just grab that branch from the rstan repo. It should work with the current 2.18 cmdstan and rstan 2.18.2. The speedups are really cool I have to say.

(the biggest issue with the new functions is my own time resource, but may @syclik can give a hand?)

Perhaps I am not doing the right way: I have one 2.18 cmdstan in folder cmdstan compiled with default make and the same version of cmdstan in folder cmdstanMPI compiled with MPI. I thought about having folder cmdstanMPIODE compiled with MPI and analytic Jacobian. From R it is very easy to switch to required mode (normal, MPI, MPI with ODE) as well as substituting appropriate make/local file. While in theory testing has to be done for all combinations I think that not all combinations are used - bit of engineering approach…

Yes, a function like that makes sense. Just bear in mind that we need two jacobian functions, I think. One for the Jacobian wrt to the states and one wrt to the parameters.

By now I have compared a version of my prototype using passing things by reference vs passing by value. The result is 74s (by ref) vs 123s (by value)… which I find quite a lot of performance to pay for copying stuff around. To me this raises some concerns as to how useful such an extra function in Stan really is given that you can get so much more performance with a bit of C++. Is by reference passing at some point in time realistic in the Stan language?

Also, if we would add such functions, then we should probably think a moment what to refactor first. For example, the state decoupling can be pulled out of coupled_ode_system (and maybe the decoupling could be done during integration and also use en-block memory allocation?).

The comparison is between these codes for the by ref and by value thing:

I am quite amazed how much faster by reference here is!

I have to solve 50 something ODE. Could you point me to the version that has the code for analytic jacobian as well as an example how to supply it to stan - via C++ or via stan language. I would like to use MPI if possible. Can you please help me?

Thanks a lot

Actually since it is one patient system I really don’t know how if it is possible to split data into shards. The objective - we are working on Bayesian inference using optimization - and the approach is quite different from all approaches based on optimization (expectation propagation, variational Bayes, etc.). - so hope is to arrive to some alternative to sampling based approaches. We will treat stan results as a gold standard and see if optimization based technique arrives to similar joint posterior. Any advice on comparing different techniques is also appreciated.