ODE refactor now has a path forward

Hi all,

In Paris last week, Sebastian and I had a chance to sit down and hash out a lot of details. We didn’t get into implementation details, but we were able to talk through enough of the current uses and enough of the future use cases to have a plan going forward. (The discussion wasn’t meant to be off-list; we just had some time and were able to get on the same page in terms of use and design.)

Here’s the current doc:

I haven’t updated it yet, but will do so today. It’ll have more of a design (a C++ API design). After the design, we can implement it pretty quickly, I think. (I’m not sure I’ll have time to do it immediately, but with enough design, someone else can pick it up and run with it.)

I’ve pushed a doc for the ODE system refactor. I don’t have nearly enough detail in there yet, so someone please add more.

Next step is to create an issue (or issues) on the math repo to address the changes. We might as well put together a project for this too (let’s see how the new GitHub features work out).

I’m happy if we start building it up by having separate issues for:

  1. implement Jy and Jtheta (or a separate issue for each)
  2. reimplement coupled_ode_system in terms of Jy and Jtheta and rename it
  3. reimplement cvodes_ode_system in terms of Jy and Jtheta

I think that’s roughly it. Any thoughts?

Hi!

Now that I finally have posting permission: how about we go with the names

Jy -> jacobian_ode_states
Jtheta -> jacobian_ode_params

I think that will make it even clearer what is meant.

A further object I was thinking of creating is

ode_rhs_sensitivity

This object would compute all the needed derivatives of the sensitivity states, using as input the sensitivity states and the jacobian_ode* objects. It would also contain the decoupling operation. For the boost odeint integrator and the CVODES integrator we would then create very light wrappers around ode_rhs_sensitivity. The odeint adapter essentially glues together ode_rhs and ode_rhs_sensitivity, while for CVODES the ode_rhs output is fed to a different function than the ode_rhs_sensitivity output.
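
To make this concrete, here is a rough sketch; the name ode_rhs_sensitivity is as above, but the signatures are only illustrative and the bodies are omitted:

    #include <vector>

    // Illustrative sketch only: computes the time derivatives of the
    // sensitivity states from the current sensitivities and the Jacobians.
    struct ode_rhs_sensitivity {
      // dS/dt = Jy * S + Jtheta, evaluated at (y, theta, t)
      void operator()(const std::vector<double>& y,
                      const std::vector<double>& sens,
                      const std::vector<double>& theta,
                      double t,
                      std::vector<double>& dsens_dt) const;

      // the decoupling operation would also live here: mapping the
      // integrated states plus sensitivities back to the autodiff types
      // the caller expects (return type omitted in this sketch)
    };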

One question I still have is how to pass things around. The greatest flexibility would come from using MatrixBase from Eigen, as it allows passing views into sub-arrays by reference from one function to the next. Are there other ways of achieving this? Should something like this be avoided anyway, or is there another way to do this?

To develop the concept further: is writing dummy code a good approach? By that I mean writing functions and objects without their bodies, specifying only their signatures.

Best,
Sebastian

wds15 Developer
September 28
Hi!

Now that I finally have posting permission: how about we go with the names

Jy → jacobian_ode_states
Jtheta → jacobian_ode_params

I think that will make it even clearer what is meant.

I’d go with Jy and Jtheta in code—it follows
the standard convention of using J for Jacobians and our
convention of using y for states and theta for parameters.
But the longer form’s OK, too, if you find it clearer.

A further object I was thinking of creating is

ode_rhs_sensitivity

This object would compute all the needed derivatives of the sensitivity states, using as input the sensitivity states and the jacobian_ode* objects. It would also contain the decoupling operation. For the boost odeint integrator and the CVODES integrator we would then create very light wrappers around ode_rhs_sensitivity. The odeint adapter essentially glues together ode_rhs and ode_rhs_sensitivity, while for CVODES the ode_rhs output is fed to a different function than the ode_rhs_sensitivity output.

I think it’ll be more efficient if you do this with templates
and concepts (in the C++ sense of “concept”). Look at it from
the perspective of which services the two (or more?) integrators
need.
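
For example, just a sketch of the kind of thing I mean (the adapter name here is made up):

    #include <vector>

    // The integrator is templated on the ODE system "concept": any type
    // supplying exactly the operations the integrator actually calls.
    template <class F>
    struct odeint_adapter {
      const F& rhs;  // the base ODE right-hand side
      // odeint requires this call signature; internally it can delegate
      // to Jy and Jtheta for the sensitivity part
      void operator()(const std::vector<double>& z,
                      std::vector<double>& dz_dt,
                      double t) const;
    };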

If the decoupling operations line up one-to-one with the Jacobian
operations, then you can put them together. But it’s often easier
to keep things minimal as it makes testing and composition into
bigger objects much easier.

What I want to avoid is having one big object that has functions
that may or may not work in various contexts (they’ll have to be
defined, but then set to throw exceptions or something when not
used).

One question I still have is how to pass things around. The greatest flexibility would come from using MatrixBase from Eigen, as it allows passing views into sub-arrays by reference from one function to the next. Are there other ways of achieving this? Should something like this be avoided anyway, or is there another way to do this?

If it’s matrices, that’ll be more efficient than forcing copies
by using Eigen::Matrix. It’s also harder to code and test because
of the possible instantiations and the fact that Eigen isn’t perfect
at implementing everything for every expression template.
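
For instance (illustrative only):

    #include <Eigen/Dense>

    // Templating on MatrixBase binds a block view directly; taking a
    // concrete Matrix forces a copy at the call site.
    template <typename Derived>
    double first_entry(const Eigen::MatrixBase<Derived>& m) {
      return m(0, 0);  // no temporary; works on blocks and expressions
    }

    double first_entry_copy(const Eigen::MatrixXd& m) {
      return m(0, 0);  // a block argument gets copied into a temporary
    }

    // usage:
    //   Eigen::MatrixXd J = Eigen::MatrixXd::Zero(4, 4);
    //   first_entry(J.block(0, 0, 2, 2));       // view, no copy
    //   first_entry_copy(J.block(0, 0, 2, 2));  // copies the block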

To develop the concept further: is writing dummy code a good approach? By that I mean writing functions and objects without their bodies, specifying only their signatures.

Yes. Outline the classes and signatures and any
inheritance/extension relations. If you draw that
out graphically, you get UML diagrams. But I don’t find
UML diagrams any more enlightening than graphical diagrams
of graphical models.

  • Bob

Hi!

Ok, I tried to write up dummy code which deviates in some minor points from what we had before. I have described right next to the code what’s different. The code is on a conceptual level, but should address all steps required to get things going (at least I hope so).

I have not yet put up dummy test code along with it, as I guess we first need a common understanding of what the code is supposed to do; then we can specify tests, which should be straightforward, I believe.

Some more questions:

  • Should I avoid using Eigen to calculate things in order to remove this dependency? I don’t expect a performance gain from using Eigen, but the code should get easier to read when using matrix algebra.

  • Since we are married to the vector stuff, I started to use iterators to gain some flexibility in how data is moved around (i.e., writing calculation results to memory chunks not owned by the function). I hope that is ok.

  • Introducing the iterators, I also went ahead and now write out things like Jy into a 1D array with column-major storage (roughly as in the sketch below). This is not nice, but all of this is a compromise between being married to std::vector and wanting to avoid heavily templated MatrixBase approaches.
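
Roughly like this hypothetical sketch (the function name and placeholder values are made up; it only shows the storage convention):

    #include <cstddef>
    #include <vector>

    // Write the N-by-N Jacobian Jy in column-major order through an
    // output iterator into caller-owned storage.
    template <class OutIter>
    void write_jacobian_ode_states(std::size_t N, OutIter out) {
      for (std::size_t j = 0; j < N; ++j)    // column index (state y_j)
        for (std::size_t i = 0; i < N; ++i)  // row index (equation f_i)
          *out++ = 0.0;  // placeholder for d f_i / d y_j
    }

    // usage:
    //   std::vector<double> Jy(N * N);
    //   write_jacobian_ode_states(N, Jy.begin());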

Best,
Sebastian

wds15 Developer
October 3
Hi!

Ok, I tried to write up dummy code which deviates in some minor points from what we had before. I have described right next to the code what’s different. The code is on a conceptual level, but should address all steps required to get things going (at least I hope so).

Home · stan-dev/math Wiki · GitHub

I have not yet put up dummy test code along with it, as I guess we first need a common understanding of what the code is supposed to do; then we can specify tests, which should be straightforward, I believe.

Some more questions:

• Should I avoid using Eigen to calculate things in order to remove this dependency? I don’t expect a performance gain from using Eigen, but the code should get easier to read when using matrix algebra.

No. By all means use Eigen where necessary. I think
it should then go in the mat rather than the arr or scal directories.
Daniel?

• Since we are married to the vector stuff,

std::vector?

I started to use iterators to gain some flexibility in how data is moved around (i.e., writing calculation results to memory chunks not owned by the function). I hope that is ok.

Iterators are the preferred way to move things.

• Introducing the iterators, I also went ahead and now write out things like Jy into a 1D array with column-major storage. This is not nice, but all of this is a compromise between being married to std::vector and wanting to avoid heavily templated MatrixBase approaches.

I’m not sure what you mean by being married to std::vector.
You should feel free to use Eigen wherever necessary. We can
talk about making the interfaces Eigen::Matrix types rather than
std::vector types if that’ll help.

  • Bob

Hi Bob!

Yes, I meant std::vector.

Ok, I take it that I should use Eigen to make the code easier to read. The reason why I think we are married to std::vector is that the generated code from Stan creates a functor object which works with std::vector. We did say not to touch this part, to keep this refactor simple from that end.

Moreover, using Eigen types in the function signatures would introduce the template-heavy MatrixBase stuff, which is not so easy to test and, as I understood it, is not ideal either. Hence, using iterators is a workable solution, I think. Note that I intend to use the iterators effectively as pointers. So I would like to pass a begin iterator to a function which, for example, calculates Jy. The function will then write NxN doubles to that iterator as a sequence. If we are fine with that, then the current design should do OK.

Best,
Sebastian

Going back to Eigen: if it uses Eigen, it needs to move to mat instead of arr. At some point, we had users request to be able to use the math library with minimal dependencies. We decided that was a good goal to have, so we split things into scalar (uses Boost), array (uses Boost and std::vector), and matrix (uses Boost, std::vector, and Eigen).

We also split it across the primitive (only uses C++ types), reverse mode (uses C++ types and stan::math::var), forward mode (uses C++ types and stan::math::fvar, and only instantiates fvar<double> or fvar<fvar<double>>), and mix mode (uses C++ types, reverse mode, and forward mode). This second distinction is strict and we need to keep it; it affects compile times for users.

Using Eigen doesn’t require templating on MatrixBase, so I think what you’re bringing up isn’t a direct concern. We don’t template on MatrixBase across the whole math library because it’s not a drop-in replacement for Matrix. There are certain behaviors, resulting from design decisions by Eigen, that lead to unexpected results. There are a couple of threads on stan-dev with examples that show the behavior, and we should really put that up as a wiki page on the math library. (I’m making an issue so we don’t forget.)

So… going forward, let’s just put up a test with interfaces and dummy implementations. See if we can get the design right that way. What do you think?

Hi!

Going with plain Eigen types as arguments of functions is not ideal either, as I recall. Once you want to write output to sub-expressions, you either end up copying a lot of things around or you need the templated MatrixBase stuff. My idea now is to use std::vector as the storage container, pass around iterators to the right positions where we want functions to write their output, and, if we want to use Eigen inside a function, just use the Eigen::Map functionality to map the raw std::vector-based array to an Eigen type.
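
Concretely, something like this sketch:

    #include <Eigen/Dense>
    #include <vector>

    void map_example() {
      const int N = 3;
      std::vector<double> storage(N * N);  // owned by the enclosing scope
      // View the raw buffer as an N-by-N matrix (Eigen defaults to
      // column-major storage), with no copying:
      Eigen::Map<Eigen::MatrixXd> Jy(storage.data(), N, N);
      Jy(0, 0) = 1.0;  // writes straight into the std::vector's memory
    }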

You appear to suggest writing tests first and then a dummy implementation… it seems I have clearly not yet arrived at a modern programming style (or a Stan style), as I did things in the reverse order. Here:

I wrote dummy code which shows how I think everything could work. My hope was that this is clear enough that tests are obvious to write.

Maybe we can still use the above as a start?

Sebastian

You can Map the data in a std::vector to a VectorXd
very easily if you need Eigen operations.

Where is the memory for the iterator being allocated and how
is it used in Boost’s Odeint and Sundials’ CVODES?

  • Bob

Hi!

Yes, this mapping is what I had in mind… to my knowledge you can even map data to a MatrixXd when you assume column-major storage.

The natural place to hold the data for the states in a std::vector is inside the integrate_ode_* function. For odeint this is merely a std::vector instance. For CVODES this is a std::vector instance which owns the memory, while you need to set up an additional N_Vector instance which you can instruct to point into the std::vector (the memory will be owned by the std::vector).
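
That is, something along these lines (a sketch with error handling omitted; it assumes the default SUNDIALS configuration where realtype is double):

    #include <nvector/nvector_serial.h>
    #include <vector>

    void cvodes_memory_sketch(long N) {
      std::vector<double> y(N);  // owns the state memory
      // N_VMake_Serial wraps existing memory without taking ownership:
      N_Vector nv_y = N_VMake_Serial(N, y.data());
      // ... hand nv_y to CVODES; it reads and writes through y's memory ...
      N_VDestroy_Serial(nv_y);  // frees only the wrapper, not y's data
    }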

Sebastian

wds15 Developer
October 4
Hi!

Going with plain Eigen types as arguments of functions is not ideal either, as I recall. Once you want to write output to sub-expressions, you either end up copying a lot of things around or you need the templated MatrixBase stuff. My idea now is to use std::vector as the storage container, pass around iterators to the right positions where we want functions to write

C++ lets you do this, but it’s dangerous because it breaks the
encapsulation of a std::vector by grabbing its mutable underlying
memory. If you grab a pointer to memory and another function calls
push_back, then you get inconsistent results.

So if you use this pattern, the scope of the std::vector should be very tightly
constrained around the call so the abstraction doesn’t leak.
You also need to reserve enough space up front so that everything written
through the iterator fits in the std::vector, which again breaks the nice
encapsulation of operations like push_back, at(), and operator[].
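
To make the hazard concrete:

    #include <vector>

    void invalidation_example() {
      std::vector<double> v(9);
      std::vector<double>::iterator out = v.begin();
      v.push_back(0.0);  // may reallocate; out is now invalid
      // *out = 1.0;     // undefined behavior if reallocation happened
    }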

  • Bob

Hi Sebastian, sorry – no, it’s not clear enough.

Regarding your first bullet, I don’t see the use of iterators as being all that useful here. It just obfuscates things.

Regarding your second bullet – I don’t get it. The code has gotten much more complicated and I don’t know where all these functions and classes came from. For example, what are ode_rhs_sensitivity and odeint_ode_system, and why are there three functions for jacobian_ode?

When we talked, we agreed that there would be:

  • Jy as a class with an implementation that uses autodiff
  • Jtheta as a class with an implementation that uses autodiff
  • boost_ode_system and cvodes_ode_system (or something that functions as an adapter to those systems). This would only take in the ode_rhs and then use Jy and Jtheta appropriately.

Then we’d change integrate_ode_* to use the correct system.
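
In dummy-code form, that’s roughly the following (all signatures are just a guess to illustrate the shape, not a committed design):

    #include <vector>

    template <class F>
    struct Jy {  // Jacobian of the RHS with respect to the states, via autodiff
      void operator()(const std::vector<double>& y,
                      const std::vector<double>& theta, double t,
                      std::vector<double>& J) const;  // N * N values
    };

    template <class F>
    struct Jtheta {  // Jacobian of the RHS with respect to the parameters
      void operator()(const std::vector<double>& y,
                      const std::vector<double>& theta, double t,
                      std::vector<double>& J) const;  // N * M values
    };

    template <class F>
    struct boost_ode_system;   // adapter: takes ode_rhs, uses Jy and Jtheta
    template <class F>
    struct cvodes_ode_system;  // likewise for CVODES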

If your example code does exactly that, you’ll have to realign the names and functionality. I got lost in the many differences between what we discussed and what you put out.

(Sorry for being slow – I just can’t follow along that quickly.)

Hi Bob!

I am passing around iterators into a std::vector - you can never call push_back through one, nor call at(). You can call operator[] if you first cast the iterator to a pointer. And yes, when I cast the iterator to a pointer and then treat it as a chunk of memory, this needs to be done carefully. The reason this is workable in this situation is that we are actually abusing a std::vector here: in principle, all the array sizes are known up front. The number of states N, the number of parameters P, and the number of sensitivities S are all fixed for a given problem - using a std::vector is a sacrifice of performance, but well justified by the structure we have to live with. If we could determine those sizes at compile time, that would give us a performance gain, I think.

Best,
Sebastian

Under what circumstances is this a big performance hit? Relative to the cost of
computing a Jacobian, statically sizing the std::vectors shouldn’t matter,
and I would prefer maximum clarity here. We really don’t need to be
innovative here. It only makes it harder to maintain.

Hi Daniel!

When spelling this thing out into dummy code I had to rethink through it and made changes following what I thought made sense.

Let me add a glossary of terms at the top of the dummy code section and add a few more comments. I hope these help. If not, detailed questions about where I lose you would help me - or we can discuss at one of the next meetings (this Thu won’t work for me to join the Stan meeting).

Sebastian

Hi!

When I benchmarked versions of code similar to what we have right now, I noticed additional cost due to copying things around (a few percent in speed). It may not be a huge thing, and if we decide that clarity takes precedence over performance, that’s fine with me; it just feels unnecessary. Moreover, when the user plugs analytic Jacobians into the system, the copying cost becomes larger on a relative scale.

Sebastian

You can’t call a push_back on the iterator, but you can on the
std::vector. That’s what I mean by the iterator breaking the
std::vector abstractions—as long as that iterator is around,
you can’t do normal operations on the std::vector without potentially
invalidating the iterator.

We’ll never be able to figure out sizes at compile time and it’s not
worth it beyond small sizes anyway.

The nice part about std::vector is that it encapsulates the memory
and makes sure you don’t get leaks. But, if you start playing with
iterators, it can leak if you aren’t careful to limit the iterator’s
scope to where you know it’s valid.

  • Bob

Ah, I see. The logic would be that the integrate_ode_* functions create the std::vector once and its size is never changed afterwards. All the downstream objects which get iterators would hence always have a valid iterator to deal with. Deallocation happens once the ODE integration is over and the integrate_ode_* function returns.
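
In sketch form (names illustrative):

    #include <cstddef>
    #include <vector>

    void integrate_ode_sketch(std::size_t N, std::size_t S) {
      // sized once here and never resized, so iterators stay valid:
      std::vector<double> coupled_state(N + N * S);
      std::vector<double>::iterator sens
          = coupled_state.begin() + static_cast<std::ptrdiff_t>(N);
      // ... hand sens to the sensitivity RHS; nothing ever calls
      // push_back on coupled_state while the iterator is live ...
    }  // coupled_state is freed when integration finishes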

Sebastian

Thanks! Sorry about the confusion. Glad you’re taking steps ahead – I just
need enough info to keep up.