For MPI we now have a proposal to manage immutable data in a distributed way. If I understood Bob’s blog post correctly, this could also be of interest for GPU computing. Do we need to account for this in the design now?
I can well imagine that for GPs (for example) it would be quite attractive to copy the immutable data to the GPU just once and then reuse it in each iteration. That might give another performance boost.
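To make the idea concrete, here is a rough sketch of what I have in mind. It is pure Python and only simulates the transfer: `DeviceCache`, `log_density`, and `run` are hypothetical names I made up for illustration, not anything from the MPI proposal or an existing GPU API, and the "upload" is just a copy plus a counter.

```python
class DeviceCache:
    """Hypothetical stand-in for a GPU-side cache of immutable data."""

    def __init__(self):
        self._buffers = {}  # key -> simulated "device" copy
        self.uploads = 0    # number of simulated host-to-device copies

    def get(self, key, host_data):
        # Copy only the first time a key is seen; reuse the copy afterwards.
        if key not in self._buffers:
            self._buffers[key] = tuple(host_data)  # simulated transfer
            self.uploads += 1
        return self._buffers[key]


def log_density(theta, x_dev):
    # Placeholder computation that reads the cached data.
    return -0.5 * sum((xi - theta) ** 2 for xi in x_dev)


def run(n_iter=5):
    cache = DeviceCache()
    x = [0.5, 1.5, 2.5]  # immutable data, fixed across iterations
    values = []
    for i in range(n_iter):
        x_dev = cache.get("x", x)  # transferred only on the first iteration
        values.append(log_density(0.1 * i, x_dev))
    return cache.uploads, values
```

The point is only that the data is keyed once and every later iteration hits the cache, so the transfer cost is paid a single time regardless of the number of iterations. How the key would be chosen and who owns the device memory are open questions for the design.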