MPI roadmap

Hi!

MPI is to me one of the greatest features to come to Stan and I think we are ready to make it happen soon. Here is my proposal for moving forward in a step-wise manner:

  1. The map_rect function should by now be final unless @Bob_Carpenter has any additional comments/concerns. The serial version should be included in stan-math asap (issue #686 on stan-math).
  2. Exposing map_rect to the Stan language. This requires the parser to give the user-supplied function the usual special treatment (issue #2440 on stan).
  3. Inclusion of the MPI base system, which includes only the basic mechanisms to send commands over the network. This should include a solution for testing MPI code; hence, I would expect that this pull also comes with changes to the test system to make it possible to test MPI-enabled code.
  4. Inclusion of the MPI-enabled map_rect in stan-math.
  5. Inclusion of further parser changes needed to make things work (boost macros to ensure that MPI communication can take place). This needs to go into Stan.

Any comments on the above would be appreciated; I hope these steps are clear in what they mean. Let me know if not or if we should split things up further.

Best,
Sebastian

I should be able to knock off (1) and (2) after 11 December. Shouldn’t take long at all.

As for (5), is there a way we can do this without having stan-dev/stan depend on MPI? That is, can the MPI all be hidden in the math lib?

The MPI code can all be hidden in stan-math, yes. However, the Stan parser needs to generate additional code which is required to register the user-supplied functions with the boost serialization library. The extra generated code can be placed inside an #ifdef STAN_HAS_MPI block to avoid additional MPI dependencies when compiling without MPI. See here for the needed definitions:

https://github.com/stan-dev/cmdstan/blob/feature/proto-mpi/examples/mpi/mpi_defs3.hpp

There is one bonus complication here: the names of the types registered with the boost serialization library must not be longer than 128 characters, which is a limitation imposed by boost serialization (this is why I defined an mpi_call struct outside of the model namespace in the snippet above).
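To make this concrete, here is a rough sketch of the shape of that generated code; my_model_namespace and user_functor__ are made-up placeholder names, not the actual definitions from the linked file:

// short name in the top-level namespace so that the type name registered
// with boost serialization stays below the 128-character limit
struct mpi_call : public my_model_namespace::user_functor__ {};

#ifdef STAN_HAS_MPI
// registration with the boost serialization library goes here, e.g. exports
// of the form BOOST_CLASS_EXPORT(...) for the MPI apply types built on
// mpi_call; see the linked mpi_defs3.hpp for the actual definitions
#endif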

Bump. I just figured out how we can hide most of the needed declarations in neat C preprocessor macros. So all that needs to be generated is

STAN_REGISTER_MPI_MAP_RECT(mpi_call, double, double)
STAN_REGISTER_MPI_MAP_RECT(mpi_call, var, var)

and, in addition, a struct with a short name in the top-level namespace to make sure that we don’t blow the boost serialization name-length limit. I hope everyone is fine with having C preprocessor macro definitions in stan-math headers…

Hi Bob!

One way to get rid of all dependencies of stan on the MPI stuff is to work with macros. The Stan parser generates, for each map_rect call it sees, a call to a macro, let’s say STAN_REGISTER_MAP_RECT.

The definition of that macro will live in stan-math. Depending on the availability of MPI, different things will be defined for it: the non-MPI version will probably do nothing, while the MPI version will register everything needed with the boost serialization system.
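A minimal sketch of that idea, purely to illustrate the pattern (mpi_distributed_wrapper is a made-up placeholder, not an actual stan-math type, and the real registration involves more than a single export):

#ifdef STAN_HAS_MPI
#include <boost/serialization/export.hpp>
// MPI build: register the user functor with the boost serialization system
// so that the worker processes can reconstruct the distributed call
#define STAN_REGISTER_MAP_RECT(FUNCTOR) \
  BOOST_CLASS_EXPORT(mpi_distributed_wrapper<FUNCTOR>)
#else
// serial build: the parser-generated call expands to nothing, so
// stan-dev/stan picks up no MPI dependency at all
#define STAN_REGISTER_MAP_RECT(FUNCTOR)
#endif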

Sounds like a plan?

It’d be ideal if we can get everything to compile without MPI.

A macro solution sounds reasonable to me. I don’t know that we have many alternatives if we want a solution that compiles without MPI.

The only other alternative would be #ifdef-s and properties. @seantalts and @syclik are probably in a better position to understand the build issues.


I don’t know what properties are, but to me the macro solution is fine. The macro solution comes with the burden of having to include headers in a controlled order.

An alternative to macros would be to translate Stan files into two separate header files: the first being the usual one, and the second a header file which adds the MPI-specific bits needed for compilation. The MPI-enabled cmdstan would then need to pick that up in some clever way.

I just mean environment variables passed into the build. For example,

#ifdef FOO
...
#endif

I’m talking about FOO. I never know what to call those.

Ah, OK, this is what I did. I used properties. So I introduced STAN_HAS_MPI as a compile-time property, in that terminology.
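So MPI-only code in the headers is guarded by that property, and it gets switched on by defining STAN_HAS_MPI at compile time (e.g. -DSTAN_HAS_MPI on the compiler command line). Just as an illustration of the pattern:

#ifdef STAN_HAS_MPI
#include <boost/mpi.hpp>
// MPI-specific communication and registration code lives here
#endif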

Too many threads open, so I’m not sure where I should be commenting.

I found the implementation of map_rect in branch feature/issue-686-map_rect_serial and was wondering about the signature, which if I understand it correctly, will look like this in Stan:

vector map_rect(F f,
                vector shared_params,
                vector[] job_params,
                real[ , ] job_data_real,
                int[ , ] job_data_int);

where f is a function defined in Stan to have a signature as

real f(vector shared_params,
       vector job_params,
       real[] job_data_r,
       int[] job_data_i);

It seems odd to have the parameters be a vector type and the data be array types. Of course, the integer data must be an array, so no choice there.

In the other signatures, such as for integrate_ode, everything comes in as standard vector types, so both shared_params and job_params would be real[], just like job_data_r.

Anyway, I’m going ahead with it as written, but was wondering if the two different types were intentional or if we should change these to match the style of the ODEs. It’ll be easy enough to change.

I feel the same way.

@wds15, what’s next? The boost headers are in the math library now. Build in the math library?

Great that the mpi libs are in!

Next is the build stuff. Maybe you can comment on this thread?

If you are good with that, then I will file a pull with it.

In parallel we can work on the map_rect serial stuff. Let me answer Bob above.

I thought about this one a bit and made the choice for the current design for these reasons:

  • the ODE functions would suggest using nested arrays for parameters and data, yes; however, the algebra solver uses Eigen vectors for the parameters and for the output of the function. So the algebra solver is inconsistent with the ODE stuff, and I figured the algebra solver is the more modern design to follow.

  • The code will likely get used a lot in hierarchical models, where you will very likely want to put a multivariate normal prior on the job-specific parameters. The array of vectors (vector[]) allows you to do just that.

  • Efficiency: Internally I cast everything that is sent over MPI into bigger Eigen matrices. Having the user-supplied per-job function take Eigen vectors as input and return Eigen vectors as output greatly reduces the need for converting arrays into vectors and vice versa. The reason to only ship Eigen matrices over MPI is that this gives much better performance: MPI deals with contiguous chunks of memory much better than with anything else (with nested arrays you have no guarantees on memory layout); see the sketch after this list. Using Eigen matrices also makes my life a lot easier when coding all of this.

  • the data arguments need to be nested arrays as I want to process int and real data in the same way. Since integer data can only be handled as nested arrays, I do the same with the real data. The conversion of these into Eigen structures only needs to be done once, so the array-to-Eigen and Eigen-to-array conversion only happens a single time.
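To illustrate the efficiency point: packing the per-job parameter vectors into a single Eigen matrix yields one contiguous block of doubles, which is exactly what MPI transfers efficiently. A small sketch of the packing (illustrative only, not the stan-math implementation):

#include <Eigen/Dense>
#include <vector>

// with Eigen's default column-major storage, packed.data() points at a single
// contiguous block of num_params * num_jobs doubles, ready for one MPI send;
// a nested std::vector gives no such layout guarantee
Eigen::MatrixXd pack_job_params(const std::vector<Eigen::VectorXd>& job_params) {
  const int num_jobs = job_params.size();
  const int num_params = num_jobs > 0 ? job_params[0].size() : 0;
  Eigen::MatrixXd packed(num_params, num_jobs);
  for (int j = 0; j < num_jobs; ++j)
    packed.col(j) = job_params[j];  // one column per job
  return packed;
}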

I hope those are enough reasons; would you agree?

That’s the part I don’t get. Why not make the real data vector[] instead of real[ , ] for all the reasons you give above?

I put it on the meeting agenda for tomorrow :-)

@Bob_Carpenter, @syclik, @seantalts bump… MPI build and map_rect serial are out for review as pulls.

We are moving.

The MPI resource management pull is up. Right now it’s still WIP since the MPI build stuff has not been merged yet, but if you want to get an idea of the MPI resource handling which I am planning for, then have a look.