Continuing the discussion from Parallel autodiff v3 for the bits and pieces about map_rect:
This statement will just work fine whenever MPI is NOT used.
We should fix that, sure; but it is not straightforward.
I don’t think that we can do anything at the parser level, because it depends on how you compile the Stan program.
Ok, so what happens? Well, with MPI on we want to be clever about the data. The goal is to distribute the data only once and then always reuse it. So whenever MPI is turned on, what will happen is:
- The data passed into map_rect is basically given a C++ type. Thus, the data is assumed to be immutable throughout the entire duration of program execution. That's different from the "data" keyword in Stan programs, which is used to say "this is not an autodiff variable".
- Only the outer-most map_rect call will use the MPI backend. Any nested map_rect call will not use the MPI backend (and as such the MPI data specialities do not apply).
- If you have two map_rect calls one after the other (and these are not nested inside another map_rect call), then both map_rect calls will use the MPI backend.
Specifically now what will happen with this:
for (i in 1:2)
y = map_rect(f, phi, thetas, { { 1.0 }, {2.0} }, { { i }, { i } });
If MPI is not turned on (so we run serially or with threads), then we get
y = map_rect(f, phi, thetas, { { 1.0 }, {2.0} }, { { 1 }, { 1 } });
y = map_rect(f, phi, thetas, { { 1.0 }, {2.0} }, { { 2 }, { 2 } });
If MPI is turned on, then we get (it does not matter if threading is on - MPI takes precedence)
y = map_rect(f, phi, thetas, { { 1.0 }, {2.0} }, { { 1 }, { 1 } });
// the data is not checked again after the first invocation
y = map_rect(f, phi, thetas, { { 1.0 }, {2.0} }, { { 1 }, { 1 } });
Now, to make it even more fun, if the user writes in the Stan program
y = map_rect(f, phi, thetas, { { 1.0 }, {2.0} }, { { 1 }, { 1 } });
// this is now executed correctly, since the Stan parser implicitly generates a new type for each new call => the change in data is handled correctly
y = map_rect(f, phi, thetas, { { 1.0 }, {2.0} }, { { 2 }, { 2 } });
You see, whenever MPI is turned on, then the data is taken to be immutable. Whenever MPI is not used, then these special considerations do not hold.
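To make the mechanism a bit more concrete, here is a minimal C++ sketch of "data cached once per call site". This is not the actual Stan Math code: toy_map_rect, cached_data and demo are made-up names, there is no MPI in it, and it assumes that every map_rect call written in the Stan program ends up as its own template instantiation (the integer call_id below stands in for that).

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the MPI data cache: one cache per call_id.
template <int call_id>
const std::vector<int>& cached_data(const std::vector<int>& x_i) {
  // A function-local static is initialized on the first call only;
  // later calls with different x_i silently get the first value back.
  static const std::vector<int> cache = x_i;
  return cache;
}

// Toy replacement for map_rect: sums the integer data it "sees".
template <int call_id>
int toy_map_rect(const std::vector<int>& x_i) {
  int sum = 0;
  for (int v : cached_data<call_id>(x_i)) sum += v;
  return sum;
}

// Mirrors the two Stan snippets above.
inline void demo() {
  // One call site inside a loop: both iterations share call_id 1.
  std::vector<int> loop_results;
  for (int i = 1; i <= 2; ++i)
    loop_results.push_back(toy_map_rect<1>({i, i}));
  assert(loop_results[0] == 2);
  assert(loop_results[1] == 2);  // stale: the data from i == 1 is reused

  // Two call sites written out: distinct call_ids, distinct caches.
  assert(toy_map_rect<2>({1, 1}) == 2);
  assert(toy_map_rect<3>({2, 2}) == 4);  // the new data is picked up
}
```

The loop hits the same instantiation twice, so the second iteration never looks at its data again. The two written-out calls get separate instantiations and therefore separate caches, which is why that variant behaves as expected.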
Does that make sense? Any other points of confusion?
I can think about a good doc for this. Should I file a PR against the doc repo, or what do you prefer?