MPI shard scaling

Thank you all for your replies!
I finally understood your point on handling the data.

The solution to my initial problem turned out to be simple: an MPI configuration error caused all processes to spawn on a single node. (If anyone has the same problem, you can check for it by running mpirun -np 20 hostname. If every line of the output shows the same hostname, all processes are landing on one node.)
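In case it helps anyone, here is a rough sketch of how the output of that check could be tallied automatically. It is only an illustration: the function names (count_ranks_per_host, all_on_one_node, run_check) are made up for this post, and it assumes mpirun is on your PATH and that hostname prints one name per process.

```python
from collections import Counter
import subprocess


def count_ranks_per_host(hostname_output: str) -> Counter:
    """Tally how many processes reported each hostname (one hostname per line)."""
    hosts = [line.strip() for line in hostname_output.splitlines() if line.strip()]
    return Counter(hosts)


def all_on_one_node(hostname_output: str) -> bool:
    """True if more than one process ran and every one reported the same host."""
    counts = count_ranks_per_host(hostname_output)
    return len(counts) == 1 and sum(counts.values()) > 1


def run_check(n_procs: int = 20) -> Counter:
    """Run `mpirun -np <n_procs> hostname` and return the per-host process counts.

    Assumes mpirun is installed and picks up the same configuration
    (hostfile, scheduler allocation, ...) as the real job.
    """
    result = subprocess.run(
        ["mpirun", "-np", str(n_procs), "hostname"],
        capture_output=True,
        text=True,
        check=True,
    )
    return count_ranks_per_host(result.stdout)
```

Calling run_check() on the cluster returns something like a mapping from hostname to process count. A single key holding all 20 processes would point to the misconfiguration I had.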

With that fixed, I now get much more reasonable runtimes:
comparison_plot.pdf (6.2 KB)

This is for the Warfarin model from this thread: Map_rect threading.
nxalien2 is a single machine with an Intel Xeon E5-2630 v4 (10 cores, 2 threads per core),
the pool machines are fairly old Intel 3770s (4 cores, 2 threads per core), and the superpool machines are newer Intel 8700s (6 cores, 2 threads per core).

Out of interest, what is the setup you used for your MPI benchmark in the link above?

Again, thank you all for your kind help.
Daniel