Multiple MPI communicators


@Bob_Carpenter, @wds15, I wonder what the best way is, under the current map_rect design, to define multiple MPI communicators. For example, in a population model, how would one allocate a dedicated communicator for each individual?


What do you mean? What are communicators?


An MPI communicator defines a communication context within which a group of computing nodes can pass information (roughly speaking). What I’m asking is essentially the first step toward achieving inter-communicator communication.


Not sure if I get where you are coming from nor where you are heading… for a hierarchical model with J subjects, we ask the user to structure the data such that the first dimension of the arrays runs from 1…J - this encodes what is considered a unit. Those J units are then dispatched onto the M nodes (including the root) in equal work chunks of size J/M. That’s it. So all we do with MPI is broadcast/scatter/gather commands with the root node as the source of work and the sink for the results.
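Just to make that split concrete, here is a minimal sketch of the even J/M partition in plain C++ (no MPI needed; `chunk_bounds` is a hypothetical helper for illustration, not part of map_rect):

```cpp
#include <utility>

// Hypothetical helper: which half-open range [begin, end) of the J units
// is handled by node m out of M nodes, with equal chunks of size C = J/M.
std::pair<int, int> chunk_bounds(int J, int M, int m) {
  int C = J / M;  // chunk size, assuming M divides J evenly
  return {m * C, (m + 1) * C};
}
```

So with J = 8 subjects on M = 4 nodes, node 0 handles units [0, 2), node 3 handles units [6, 8), and so on.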

Not sure if that answered your question though.


With J subjects in a hierarchical model, suppose that in addition to the M nodes one has another J*N nodes and wants to use N of them for each of the J subjects. That means there are 1 + J communicators in the run: one spanning the M nodes, plus one for each of the J subjects. Is it possible to achieve that using map_rect, and if so, how?
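To illustrate the layout being asked about: a sketch of how the world ranks could be mapped onto those 1 + J communicators (plain C++; `comm_color` is a hypothetical helper, and the actual split would be done with `MPI_Comm_split`):

```cpp
// Hypothetical color assignment for MPI_Comm_split: ranks 0..M-1 form the
// communicator spanning the M nodes (color 0); the remaining J*N ranks are
// grouped N at a time, giving one group (color 1..J) per subject.
int comm_color(int world_rank, int M, int N) {
  if (world_rank < M)
    return 0;                       // one communicator for the M nodes
  return 1 + (world_rank - M) / N;  // one communicator per subject
}

// Usage on each rank (sketch):
//   MPI_Comm sub;
//   MPI_Comm_split(MPI_COMM_WORLD, comm_color(rank, M, N), rank, &sub);
```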


So you mean more than 1 MPI node per subject? No, that is not possible at the moment.

… what I have planned to do is to combine MPI with threading. So say we have M MPI nodes and the chunk-size J/M = C. Then on a given node the C units can be processed using threading. This should give major speedups as we limit the use of MPI and combine it with threading.
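A rough sketch of the per-node threading part of that hybrid idea, assuming the C local units are independent (plain C++11 threads; `process_unit` is a placeholder for the actual per-unit work):

```cpp
#include <thread>
#include <vector>

// Placeholder for the per-unit computation done on this node.
double process_unit(int unit_id) { return unit_id * 2.0; }

// Process the C local units with one thread each. A real implementation
// would use a thread pool sized to the node's core count instead of
// spawning one thread per unit.
std::vector<double> process_chunk(int C) {
  std::vector<double> out(C);
  std::vector<std::thread> workers;
  for (int i = 0; i < C; ++i)
    workers.emplace_back([&out, i] { out[i] = process_unit(i); });
  for (auto& t : workers)
    t.join();
  return out;
}
```

The MPI layer would then only scatter the C-sized chunks to the nodes and gather the results back, keeping inter-node communication to a minimum.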

I am not sure which applications need more than 1 core for a given subject. In the current design, using more than 1 CPU per subject is probably best done through threading rather than MPI, which should be more efficient anyway.


Yes, that’s what I’m asking about.

Many things, including any large-scale linear algebra.

I understand MPI+threading has been a popular setup, but with MPI-3 this won’t be necessary, as MPI-3 already supports shared memory on SMP systems.


Getting MPI into Stan will be such a relief… what you’re targeting sounds like the next-generation barrier.

Maybe… but MPI is a huge pain to program, while threading via C++11 facilities is far easier. OK, this argument does not apply (that much) to off-the-shelf libraries which are already written.


So I guess map_rect uses a single MPI_COMM_WORLD. Is that correct?


Probably yes… I work with boost::mpi to stay away from the low-level MPI details as much as I can (so I am not 100% familiar with all of them).


No problem. Thanks.

I think we may want to document the MPI communication setup (if not done already), as well as print debug information such as communicator name, size, timing, etc.