That’s probably a good idea. Maybe we can explain it to Bob and the others and see what they think about the tradeoff of getting something specific to hierarchical models in, with or without more general abstractions underneath. FWIW, I think we could do it without sacrificing much performance (tuples are compile-time constructs); it’s more a matter of code organization and reuse. I’m optimistic that we will soon have many more use cases for parallelism in Stan :)
- I was wondering where the cache is going to be used? It isn’t used in the code presented so far; I’m not sure whether that was an oversight or whether it will be used somewhere else in the future.
- 1a: I would template `mpi_parallel_call_cache` on the data types, not on the parameter types — the data should be the same across all runs, right?
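To make the suggestion concrete, here is a minimal sketch of what a cache templated only on the data types might look like. This is a hypothetical illustration, not the actual Stan implementation: the `call_id` template parameter, the `store`/`is_valid` members, and the use of static storage are all assumptions made for the example. The point is that parameters change on every evaluation and so never appear in the cached state, while the data is shipped once and reused.

```cpp
#include <cstddef>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical sketch of a cache keyed only on the data types.
// Parameters vary per iteration, so they are deliberately absent:
// only the (immutable) data is stored, once, for a given call site.
template <int call_id, typename... DataTs>
struct mpi_parallel_call_cache {
  using cache_t = std::tuple<std::vector<DataTs>...>;

  static bool is_valid() { return is_valid_; }

  // Store the data on first use; subsequent calls are no-ops and
  // simply reuse what was cached.
  static void store(std::vector<DataTs>... data) {
    if (is_valid_)
      return;
    local_ = cache_t(std::move(data)...);
    is_valid_ = true;
  }

  static const cache_t& data() { return local_; }

 private:
  static cache_t local_;
  static bool is_valid_;
};

template <int call_id, typename... DataTs>
typename mpi_parallel_call_cache<call_id, DataTs...>::cache_t
    mpi_parallel_call_cache<call_id, DataTs...>::local_;

template <int call_id, typename... DataTs>
bool mpi_parallel_call_cache<call_id, DataTs...>::is_valid_ = false;
```

Usage would be something like `mpi_parallel_call_cache<0, double, int>::store(real_data, int_data)` on the first evaluation; the `call_id` distinguishes independent call sites that happen to share data types.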
- The design can implement anything we throw at it if we template it; we don’t need separate arguments for things specific to hierarchical models in the general-purpose building blocks we’re making underneath. Given that choice, I think it’s good to have something more general underlying it, if we can do so without too much performance loss.
- Is there some way to avoid adding another function to the `ReduceF` type? Run it the first time with a dummy argument or something?
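One way the dummy-run idea could work is sketched below. This is an assumption about how it might be done, not the reviewed code: `my_reduce` and `probe_output_size` are invented names, and the sketch assumes the functor’s output size does not depend on the parameter *values*, only on their count and the data.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical functor shape: operator() is the only required member,
// so ReduceF needs no extra "report your output size" function.
struct my_reduce {
  std::vector<double> operator()(const std::vector<double>& params,
                                 const std::vector<double>& data) const {
    return {params[0] + data[0], params[0] * data[0]};
  }
};

// Probe the output dimensions by running the functor once on a dummy
// (zero-filled) parameter vector of the right length. Valid only if
// the output size is independent of the parameter values themselves.
template <typename ReduceF>
std::size_t probe_output_size(const ReduceF& f, std::size_t num_params,
                              const std::vector<double>& data) {
  std::vector<double> dummy(num_params, 0.0);
  return f(dummy, data).size();
}
```

The tradeoff is one wasted evaluation per call site (and the size-independence assumption) in exchange for keeping the `ReduceF` concept down to a single `operator()`.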