I was going to answer one of the other threads, but I went out with coworkers for drinks tonight, so instead I decided to post a little thing about a solid win that I think went under the radar.
The inverse calculation on the GPU is wildly fast!
@rok_cesnovar posted this on the inverse PR, but I thought it would be nice to repost here.
For the CPU, I ran the tests on an i7-4790 CPU @ 3.60GHz.
Speedup for the Titan XP
So for a Titan XP, a desktop GPU, we top out at about 45x relative to the CPU version. Pretty nice!
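For anyone who wants to sanity-check a number like "45x", here is a rough sketch of how a speedup figure is usually computed: time the baseline, time the accelerated version, and divide. This is not the benchmark code from the PR. The helper names `best_time` and `speedup` are made up for illustration, and the timings in the last line are placeholders chosen only to show the arithmetic.

```python
import time


def best_time(fn, repeats=5):
    """Return the fastest wall-clock time in seconds over `repeats` calls of fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best


def speedup(baseline_seconds, accelerated_seconds):
    """Speedup of the accelerated run relative to the baseline (CPU time / GPU time)."""
    if accelerated_seconds <= 0:
        raise ValueError("accelerated_seconds must be positive")
    return baseline_seconds / accelerated_seconds


# Placeholder timings, NOT measurements from the PR: a 22.5 s baseline
# against a 0.5 s accelerated run works out to a 45x speedup.
print(speedup(22.5, 0.5))
```

Taking the best of several runs is just one common way to cut down on timing noise; for GPU code you would also want to make sure the device has actually finished the work before stopping the clock.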
For the V100, a more hardcore / scientific GPU, we get the below
So even with 10K-size matrices, it looks like the V100 still has a lot of power left to churn!