I want to summarize where we’re at in terms of proposals.

For each of the three concrete proposals, I provide the signature of the `map` function as it would look from the Stan language and the definition of the signature for the function argument `f`.

In all cases, the rough structure is a map-reduce that follows the actual map function being defined in the Stan language:

#### Rectangular, retransmit data

```
vector[] map(F f, vector[] thetas,
             vector[] xs_r, int[,] xs_i)
vector f(vector theta, vector xs_r, int[] xs_i)
```
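To make the semantics concrete, here is an illustrative Python sketch of the rectangular version (serial, and with made-up names; the real map would farm job `k` out to worker `k`): each job gets its own parameter vector and its own real/integer data shard, and the per-job results are collected.

```python
def map_rect(f, thetas, xs_r, xs_i):
    """Serial stand-in for the parallel rectangular map: apply f to each
    job's parameters and data shard, collecting one result per job."""
    return [f(theta, x_r, x_i)
            for theta, x_r, x_i in zip(thetas, xs_r, xs_i)]

# Toy f for illustration only: shift each parameter by the sum of the
# job's real and integer data.
def f(theta, x_r, x_i):
    return [t + sum(x_r) + sum(x_i) for t in theta]

results = map_rect(f,
                   thetas=[[1.0, 2.0], [3.0, 4.0]],
                   xs_r=[[0.5], [0.25]],
                   xs_i=[[1], [2]])
# results == [[2.5, 3.5], [5.25, 6.25]]
```

Because everything is rectangular, each job's `theta`, `x_r`, and `x_i` must have the same length across jobs, which is where the padding cost mentioned below comes from.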

*PRO*

- function is simple and encapsulated
- clean math library implementation
- doesn’t go beyond what already exists in Stan language
- could use standalone function implementations as is for the workers (non root)

*CON*

- retransmits data
- rectangular only, so may require ugly and expensive-to-transmit padding

#### Ragged, retransmit data

```
vector map(F f,
           vector thetas, int[] theta_ends,
           vector x_r, int[] x_r_ends,
           int[] x_i, int[] x_i_ends)
vector f(vector theta, vector xs_r, int[] xs_i)
```
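Here is an illustrative Python sketch of how the ragged version could work; the 1-based, inclusive interpretation of the `*_ends` arrays is my assumption (following Stan's indexing convention), not part of the proposal. Each flat array is cut into per-job slices at its end positions, and the per-job results are concatenated into one flat result vector.

```python
def slices(flat, ends):
    """Recover per-job slices from a flat array, where ends[k] is the
    (1-based, inclusive) index of job k's last element -- an assumption
    of this sketch."""
    out, start = [], 0
    for end in ends:
        out.append(flat[start:end])
        start = end
    return out

def map_ragged(f, thetas, theta_ends, x_r, x_r_ends, x_i, x_i_ends):
    result = []
    for theta, xr, xi in zip(slices(thetas, theta_ends),
                             slices(x_r, x_r_ends),
                             slices(x_i, x_i_ends)):
        result.extend(f(theta, xr, xi))  # concatenate ragged results
    return result

# Jobs of sizes 2 and 3 for theta; data shards of sizes 1 and 2.
out = map_ragged(lambda th, xr, xi: [t + sum(xr) + sum(xi) for t in th],
                 thetas=[1.0, 2.0, 3.0, 4.0, 5.0], theta_ends=[2, 5],
                 x_r=[0.5, 0.25, 0.25], x_r_ends=[1, 3],
                 x_i=[1, 2, 3], x_i_ends=[1, 3])
# out == [2.5, 3.5, 8.5, 9.5, 10.5]
```

The concatenation step is why the return sizes are implicit: the caller cannot recover the per-job boundaries of the result without knowing them in advance.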

*PRO*

- all of the pros of the rectangular version
- function signature is the same, so still encapsulated
- allows arbitrary raggedness (user must know ragged result bounds)

*CON*

- retransmits data
- raggedness is awkward without built-in ragged structures
- return sizes are implicit
- could add an `int[] result_ends` argument for anticipated sizes and check that the results match

#### Rectangular, Fixed Data

The data variables `x_rs` and `x_is` must be the names of `data` or `transformed data` variables, so that they can be loaded once and reused.

```
vector[] map(F f, vector[] thetas, vector[] x_rs, int[,] x_is)
vector f(vector theta, vector x_r, int[] x_i)
```

This one has a map that sends the data in only once, but `x_rs` and `x_is` must be the names of variables in the data or transformed data block. Each child process will need to load the data from the model, but then only needs to hang on to its own slice, `x_rs[k]` and `x_is[k]`. Then the function can be taken from the standalone function compilation.
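The process structure above can be sketched in Python; the `Worker` class and its methods are hypothetical names for illustration, not part of the proposal. Each worker loads the data once (here it is simply passed in), keeps only its own slice, and thereafter each call transmits only the parameters.

```python
class Worker:
    """Hypothetical worker for the fixed-data proposal: sees the model's
    full data once at startup, then hangs on to only its own shard."""
    def __init__(self, k, x_rs, x_is):
        # In the real design each child process loads the data from the
        # model itself; here we pass it in and keep the k-th slice.
        self.x_r = x_rs[k]
        self.x_i = x_is[k]

    def run(self, f, theta):
        # Only theta crosses the wire per call; data is already resident.
        return f(theta, self.x_r, self.x_i)

x_rs = [[0.5], [0.25]]
x_is = [[1], [2]]
workers = [Worker(k, x_rs, x_is) for k in range(2)]

# Toy f for illustration: shift parameters by the sum of the shard's data.
f = lambda theta, x_r, x_i: [t + sum(x_r) + sum(x_i) for t in theta]
results = [w.run(f, theta) for w, theta in zip(workers, [[1.0], [2.0]])]
# results == [[2.5], [4.25]]
```

Note that in this sketch each worker still receives the full `x_rs` and `x_is` at construction, which mirrors the first con below: every process instantiates all the data even though it only keeps a slice.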

*PROS*

- reads data once on each child process
- easier to implement than closure
- easy implementation without MPI
- should perform better than data transmission each time

*CONS*

- has to instantiate all of the model's data on each process, even though each process only needs a slice
- back of the envelope, this should be OK for PK/PD applications
- might not be so OK for distributing big regressions
- function `f` needs to know how to grab its slice of the data from the index

Could generalize this to ragged in the same way as before.

#### Closure, send data once

```
vector map(F f,
           vector thetas, int[] theta_ends)
vector f(vector theta, int fold)
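The closure idea can be sketched in Python, where closures come for free; `make_f` and its names are illustrative, not proposed syntax. The function captures the data at construction time, so the map itself passes no data arguments at all, only the parameters and a fold index.

```python
def make_f(x_rs, x_is):
    """Hypothetical closure over the model's data: the returned f takes
    no data arguments, only its parameters and the fold index."""
    def f(theta, fold):
        x_r, x_i = x_rs[fold], x_is[fold]
        # Toy computation for illustration: shift each parameter by the
        # sum of the fold's data.
        return [t + sum(x_r) + sum(x_i) for t in theta]
    return f

def map_closure(f, thetas, theta_ends):
    # theta_ends[k] taken as the 1-based, inclusive end of fold k's
    # parameters -- an assumption of this sketch, as in the ragged version.
    out, start = [], 0
    for fold, end in enumerate(theta_ends):
        out.extend(f(thetas[start:end], fold))
        start = end
    return out

f = make_f(x_rs=[[0.5], [0.25]], x_is=[[1], [2]])
out = map_closure(f, thetas=[1.0, 2.0, 3.0], theta_ends=[2, 3])
# out == [2.5, 3.5, 5.25]
```

In Stan this would require generating `f` with access to the model's data members, which is where the parser and code-generator work listed below comes from.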

*PRO*

- data closure means no data arguments
- could be a big win in reducing communication (call latency and wait overhead remain)

*CON*

- requires functions that are closures over data
- requires a major extension to the Stan language parser and code generator, plus a lot of doc explaining closures
- would generate closures as member functions rather than static functions
- might like to have closures in the language anyway
- could require too much memory per worker, as each worker gets everything in the instantiated model