Hi Stan Dev team -
I am reaching out because I have found reduce_sum to be incredibly helpful for fitting big models. The only limitation is that its parallelization is thread-based, which means I can’t use more than one machine on my university’s cluster.
I understand that extending reduce_sum to use a more general interface like MPI would be no mean feat, but I wanted to get an idea of just how hard it would be, and whether I should consider applying for grant funding for a programmer (a postdoc?) to tackle it.
map_rect, which does support MPI, is difficult to use for the latent variable problems I primarily work on, which involve lots of irregular arrays.
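To make the irregular-array problem concrete, here is a minimal Python sketch of the kind of data preparation map_rect pushes me into, since it needs every shard's real and integer data to have the same size. The helper name pad_shards and the toy groups are made up for illustration; this is not Stan code, only the bookkeeping around it.

```python
def pad_shards(groups, pad_value=0.0):
    """Pad ragged per-group data into equal-length rows.

    Returns (x_r, x_i): x_r holds the padded real data, one row per shard,
    and x_i holds each shard's true length so the mapped function can
    ignore the padding. Both names mirror map_rect's data arguments.
    """
    max_len = max((len(g) for g in groups), default=0)
    x_r = [list(g) + [pad_value] * (max_len - len(g)) for g in groups]
    x_i = [[len(g)] for g in groups]
    return x_r, x_i


# Hypothetical ragged data: three groups with 3, 1, and 2 observations.
groups = [[1.5, 2.0, 0.3], [4.1], [0.7, 0.9]]
x_r, x_i = pad_shards(groups)
```

With reduce_sum none of this padding is needed, because the slicing happens over a single array and each partial sum can look up its own irregular pieces.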
Thanks much for any general ideas, and if it’s technically infeasible, that would be great to know too :).