MPI Design Discussion

Okay, I think I understand a lot more now than when I started. Within this map_rect API, here’s what I might like to see in the design:

low-level independent functions:

  • flatten a vector<vector<T>> into a flat vector<T>
  • unflatten a flat vector<T> back into a vector<vector<T>> (given the per-row sizes)
  • send data to all nodes (maybe this is just broadcast; not cached)
  • cached scatterv
  • maybe even a cached scatterv for a doubly-nested vector
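The flatten/unflatten pair above could look something like the following sketch. The names `flatten`/`unflatten` and the sizes-vector convention are my own assumptions, not an existing API; the idea is just that flattening records the per-row sizes so unflatten can reverse it exactly:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: flatten a ragged vector<vector<T>> into one
// contiguous buffer, recording each row's size so it can be rebuilt.
template <typename T>
std::vector<T> flatten(const std::vector<std::vector<T>>& vv,
                       std::vector<std::size_t>& sizes) {
  std::vector<T> flat;
  sizes.clear();
  for (const auto& row : vv) {
    sizes.push_back(row.size());
    flat.insert(flat.end(), row.begin(), row.end());
  }
  return flat;
}

// Hypothetical inverse: rebuild the ragged structure from the flat
// buffer and the recorded per-row sizes.
template <typename T>
std::vector<std::vector<T>> unflatten(const std::vector<T>& flat,
                                      const std::vector<std::size_t>& sizes) {
  std::vector<std::vector<T>> vv;
  auto it = flat.begin();
  for (std::size_t n : sizes) {
    vv.emplace_back(it, it + n);
    it += n;
  }
  return vv;
}
```

The sizes vector is exactly what a scatterv would need for its counts argument, which is why I'd keep these as independent low-level functions.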

map_rect_mpi would look something like:

  1. broadcast shared params & data
  2. map across thetas
  3. var-related memory management (not sure where this goes exactly)
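For reference, here is a serial sketch of the semantics I'd expect map_rect_mpi to reproduce (all names here are illustrative, not the real signatures): apply the functor to the shared params plus each job's theta, and concatenate the results. The MPI version would broadcast `shared` once, scatter the thetas, and gather the concatenated outputs:

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Serial reference semantics for map_rect (hypothetical signature):
// out = concat(f(shared, theta_0), f(shared, theta_1), ...).
// The MPI version distributes the loop body across nodes but must
// produce the same concatenated result on the root.
std::vector<double> map_rect_serial(
    const std::function<std::vector<double>(const std::vector<double>&,
                                            const std::vector<double>&)>& f,
    const std::vector<double>& shared,
    const std::vector<std::vector<double>>& thetas) {
  std::vector<double> out;
  for (const auto& theta : thetas) {
    auto r = f(shared, theta);  // one "job" per theta
    out.insert(out.end(), r.begin(), r.end());
  }
  return out;
}
```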

map looks like:

  1. reduce(std::vector::push_back composed with F, …)
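That "map is a reduce" idea can be sketched with std::accumulate, folding push_back composed with F over the inputs. The name `map_via_reduce` and the types are illustrative only:

```cpp
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

// Sketch: map expressed as a reduce whose op is push_back ∘ F.
// T is the input element type, R the output element type.
template <typename T, typename R, typename F>
std::vector<R> map_via_reduce(const std::vector<T>& xs, F f) {
  return std::accumulate(xs.begin(), xs.end(), std::vector<R>{},
                         [&f](std::vector<R> acc, const T& x) {
                           acc.push_back(f(x));  // push_back composed with F
                           return std::move(acc);
                         });
}
```

This keeps map as a thin layer over the generic reduce, so the MPI machinery only has to be written once for reduce.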

reduce looks like:

  1. distributes the data to map over with the cached scatterv
  2. applies the reduction op
  3. gathers results with gatherv
  4. finishes the reduction by applying the reduction op to the gathered data on the root node
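Step 4 in isolation is just a fold over the gathered per-node partials on the root. A minimal sketch, assuming the partials have already arrived via gatherv (the function name and scalar partial type are mine):

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Root-side finish (hypothetical helper): after gatherv, each node has
// contributed one partial result; fold the reduction op over them.
double finish_reduction(const std::vector<double>& gathered,
                        double init,
                        const std::function<double(double, double)>& op) {
  double result = init;
  for (double partial : gathered)
    result = op(result, partial);
  return result;
}
```

For map_rect specifically the "op" is closer to concatenation than to a scalar fold, but keeping this step separate means other reductions (e.g. summing log densities) reuse the same skeleton.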

The reduce functor for map_rect might involve separate entry and exit "context manager"-style fixtures, as independent functions or classes:

  1. start a nested autodiff context on entry; un-nest and clean up on exit
  2. perhaps the var memory stuff can go here?
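In C++ the natural shape for such a fixture is an RAII guard. Here's a sketch against a toy tape; in the real thing the constructor/destructor would presumably call Stan math's start_nested() and recover_memory_nested() on the var stack, but the guard class and the ToyTape are my own stand-ins:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for the autodiff var stack, just to show the mechanics.
struct ToyTape {
  std::vector<double> stack;            // recorded "vars"
  std::vector<std::size_t> nesting;     // stack sizes at each nesting level
};

// RAII "context manager" fixture: enter a nested autodiff level on
// construction, un-nest and reclaim on destruction. Because cleanup
// runs in the destructor, each job's vars are reclaimed even if the
// mapped functor throws.
class nested_autodiff_guard {
 public:
  explicit nested_autodiff_guard(ToyTape& tape) : tape_(tape) {
    tape_.nesting.push_back(tape_.stack.size());  // cf. start_nested()
  }
  ~nested_autodiff_guard() {
    tape_.stack.resize(tape_.nesting.back());     // cf. recover_memory_nested()
    tape_.nesting.pop_back();
  }
  nested_autodiff_guard(const nested_autodiff_guard&) = delete;
  nested_autodiff_guard& operator=(const nested_autodiff_guard&) = delete;

 private:
  ToyTape& tape_;
};
```

This would also be a natural home for the var-related memory management from the map_rect_mpi list above.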

Does this make sense? I'm happy to provide more example code if that helps. I had originally started doing some refactoring on my own as an example, but figured I should put this out here first.