Vectorised unary functions - Eigen implementations?

andrjohns · December 11, 2018, 7:51am

For some of the vectorised unary functions in prim/mat (i.e. those using apply_scalar_unary), it looks like they could instead be implemented using Eigen’s coefficient-wise tools: https://eigen.tuxfamily.org/dox/group__CoeffwiseMathFunctions.html

This could give a bit of a speed boost since some operations support SIMD instruction sets (parallelising the operation within a single core, kind of).

Should I create an issue and look at implementing these, or is apply_scalar_unary preferred here?

wds15 · December 11, 2018, 8:11am

Go ahead!

… but I recommend benchmarking your changes to make sure the added code complexity is worth it, since compilers are good at guessing stuff… you will want to get familiar with the performance cmdstan git repo.

andrjohns · December 11, 2018, 9:17am

That’s a good idea, will do, thanks.

andrjohns · December 13, 2018, 2:46pm

Really interesting stuff here. I coded up some rough benchmarks comparing the performance of the log and exp functions.

Log showed some minor improvements in speed, but nothing particularly impressive:

log 
1000 randomly generated 10000 x 1 Vectors (milliseconds)
stan: 159
eigen: 148
1000 randomly generated 1000 x 1000 Matrices (milliseconds)
stan: 16508
eigen: 15180

Exp on the other hand showed a dramatic improvement:

exp
1000 randomly generated 10000 x 1 Vectors (milliseconds)
stan: 75
eigen: 22
1000 randomly generated 1000 x 1000 Matrices (milliseconds)
stan: 8644
eigen: 2395

I would guess that the difference in improvement is because Eigen’s exp supports AVX for doubles, whereas log only supports AVX for floats. Either way, interesting to see the performance boosts that are available.

Benchmarking code below, compiled with:

g++ -std=c++1y -march=native  -O3  -I . -I lib/eigen_3.3.3  eigen_log_exp_test.cpp

eigen_log_exp_test.cpp (3.9 KB)

(Warning: this will use approximately 8 GB of RAM.)
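The attached file is not reproduced in this thread. For anyone wanting to run a similar comparison, a timing loop along these lines is one possible starting point. This is a sketch using only the standard library, not the attached code; random_vector, time_ms and exp_loop are names invented for this example, and the Eigen or Stan function under test would be passed in place of exp_loop.

```cpp
#include <chrono>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Fill a vector with uniform draws from [lo, hi), using a fixed seed so
// that every candidate implementation sees the same inputs.
inline std::vector<double> random_vector(std::size_t n, double lo, double hi,
                                         unsigned seed) {
  std::mt19937 gen(seed);
  std::uniform_real_distribution<double> dist(lo, hi);
  std::vector<double> x(n);
  for (double& v : x) v = dist(gen);
  return x;
}

// Hypothetical candidate: element-wise exp with a plain loop.
inline std::vector<double> exp_loop(const std::vector<double>& x) {
  std::vector<double> out(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) out[i] = std::exp(x[i]);
  return out;
}

// Time `reps` calls of f(input) and return the elapsed milliseconds.
// One element of each result is added to `sink` so the optimiser cannot
// discard the computation.
template <typename F>
double time_ms(F&& f, const std::vector<double>& input, int reps,
               double& sink) {
  const auto start = std::chrono::steady_clock::now();
  for (int r = 0; r < reps; ++r) {
    const std::vector<double> out = f(input);
    if (!out.empty()) sink += out[out.size() / 2];
  }
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

A call such as time_ms(exp_loop, x, 1000, sink) gives one number per implementation; repeating the run a few times and generating inputs outside the timed region helps keep the comparison fair.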
