… I still can’t believe it, but I think my idea from the last meeting to use thread_local storage for the AD stack simply works. Using this, I have implemented the map_rect function in a threaded manner using the C++11 future facilities. See this branch.
To actually see it running you can run the unit tests in
Again… all I am using at this point are plain C++11 language features which are part of the official standard. This will work even on Windows!
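For anyone who wants to see the two C++11 pieces in isolation, here is a minimal sketch. It is not the code from the branch: `ToyTape`, `toy_tape`, `toy_job` and `toy_map` are made-up names, and the "tape" is just a vector standing in for the AD stack. The only point is that a `thread_local` object gives each thread its own instance, and that `std::async` plus `std::future` is enough to fan jobs out and collect the results in order.

```cpp
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical stand-in for the AD stack: a plain vector of intermediates.
struct ToyTape {
  std::vector<double> values;
};

// thread_local: every thread that touches toy_tape gets its own instance.
thread_local ToyTape toy_tape;

// Hypothetical per-job work. It writes only to the calling thread's tape,
// so concurrent jobs on different threads do not interfere.
double toy_job(double x) {
  toy_tape.values.clear();
  toy_tape.values.push_back(x);
  toy_tape.values.push_back(x * x);
  return toy_tape.values.back() + 1.0;  // x^2 + 1
}

// Hypothetical map: launch one asynchronous task per input element and
// collect the results in input order through the futures.
std::vector<double> toy_map(const std::vector<double>& xs) {
  std::vector<std::future<double>> futures;
  futures.reserve(xs.size());
  for (double x : xs)
    futures.push_back(std::async(std::launch::async, toy_job, x));

  std::vector<double> out;
  out.reserve(xs.size());
  for (std::future<double>& f : futures)
    out.push_back(f.get());  // blocks until that job has finished
  return out;
}
```

With `std::launch::async` each job runs as if on a new thread, so the calling thread's own `toy_tape` is never touched by the jobs.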
So much for the good news… the bad news is that the Apple clang++ compiler creates a binary which fails to run the unit test. I had to use g++ version 6 to make it work.
In preliminary examples I am seeing the usual speedups, as seen with MPI. What is left to be done is to come up with a good thread pool implementation, since the first tests make it clear that there is a lot of friction from thread creation and so on.
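To make the thread pool idea concrete, here is a rough sketch of what a minimal one could look like in plain C++11. This is an assumption about a possible design, not something that exists in the branch; `ToyPool` and `submit` are hypothetical names. The workers are created once and then pull jobs from a shared queue, so the thread creation cost is paid only at startup.

```cpp
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical fixed-size pool: workers are started once and reused.
class ToyPool {
 public:
  explicit ToyPool(std::size_t n_workers) {
    for (std::size_t i = 0; i < n_workers; ++i)
      workers_.emplace_back([this] { run(); });
  }

  // Lets the workers drain the queue, then joins them.
  ~ToyPool() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stopping_ = true;
    }
    ready_.notify_all();
    for (std::thread& w : workers_) w.join();
  }

  // Queue a job and hand back a future for its result.
  std::future<double> submit(std::function<double()> job) {
    // shared_ptr because packaged_task is move-only, while std::function
    // requires a copyable callable.
    auto task =
        std::make_shared<std::packaged_task<double()>>(std::move(job));
    std::future<double> result = task->get_future();
    {
      std::lock_guard<std::mutex> lock(mutex_);
      jobs_.push([task] { (*task)(); });
    }
    ready_.notify_one();
    return result;
  }

 private:
  // Worker loop: sleep until there is a job or the pool is shutting down.
  void run() {
    for (;;) {
      std::function<void()> job;
      {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
        if (jobs_.empty()) return;  // stopping and nothing left to do
        job = std::move(jobs_.front());
        jobs_.pop();
      }
      job();  // run outside the lock so other workers can proceed
    }
  }

  std::mutex mutex_;
  std::condition_variable ready_;
  std::queue<std::function<void()>> jobs_;
  bool stopping_ = false;
  std::vector<std::thread> workers_;
};
```

One thing to keep in mind with reused workers: a thread_local object lives as long as its thread, so anything a job leaves in it is still there when the next job runs on that worker. Each job would have to reset its thread's state itself.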
If someone else could confirm that what I did is sound… that would be great. I still don’t trust it (even though I have seen exactly matching results, etc…).