Hi Bob, thanks for the feedback!
I wanted arrays to be used for holding elements in a container and linear algebra types to be used whenever there was a proper multivariate function (like softmax or log_sum_exp, which do not operate elementwise). That got relaxed as I seemed to be the only person happy with that distinction.
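To make the elementwise vs. multivariate distinction concrete, here is a minimal standard-library sketch. The names exp_elementwise and softmax_sketch are made up for illustration and are not the Stan Math implementations.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Elementwise: output i depends only on input i.
std::vector<double> exp_elementwise(const std::vector<double>& x) {
  std::vector<double> out(x.size());
  std::transform(x.begin(), x.end(), out.begin(),
                 [](double v) { return std::exp(v); });
  return out;
}

// Multivariate: every output depends on the whole input through the
// normalising sum, so it cannot be written as a per-element map.
std::vector<double> softmax_sketch(const std::vector<double>& x) {
  assert(!x.empty());
  const double max_x = *std::max_element(x.begin(), x.end());
  std::vector<double> out(x.size());
  double sum = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    out[i] = std::exp(x[i] - max_x);  // subtract the max for stability
    sum += out[i];
  }
  for (double& v : out) {
    v /= sum;
  }
  return out;
}
```

Changing the last element of the input leaves the first element of exp_elementwise untouched, but changes every element of softmax_sketch.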
I see where you’re coming from, but unfortunately the codebase is in a bit of a limbo state where some math functions are implemented for arrays (e.g. exp, log_sum_exp), but not others (e.g. log_softmax). So for consistency, we need to either extend these functions to arrays (in the cases where we don’t need to assume that the array is a column or row vector), or not define math functions for arrays at all.
Why would Eigen::Matrix<T, R, C> containers be faster than std::vector?
This is less about the speed of accessing elements and more about the mathematical functions themselves, since Eigen employs SIMD vectorisation and lazy evaluation for its functions. These kinds of performance differences are showing up in this pull request, where Eigen’s vector functions are outperforming the existing element-wise functions.
The bigger question I have is what’s the return type of something like softmax()? It sounds like the proposal uses a uniform return type
The proposed framework returns the same type as the input, which looks like:
vector softmax(vector);
row_vector softmax(row_vector);
real[] softmax(real[]);
For head, we absolutely need to keep our input and output types
That’s how it’s currently implemented
Given the proposed definition of log_softmax, it’ll still only apply to Eigen types as they’re the only ones that support an appropriate .array() method. The document says that Eigen::Map will be used around std::vector, but I don’t see where that happens in the code examples.
The Eigen::Map happens in the apply_vector_unary struct. So the function itself only needs to be defined for Eigen types, and std::vectors are automatically mapped before the function is applied.
Adding all these lambdas isn’t free. Their constructors and destructors will get called and assignments will be made. So I would suggest profiling the resulting functions, which will probably be dominated by exp and log anyway.
That’s a good idea, I’ll have a look into the performance testing side of things.
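As a starting point, a comparison along these lines could time a direct loop against the same computation routed through a lambda. This is only a sketch using std::chrono; mean_nanoseconds, sum_exp_direct, and sum_apply are hypothetical names, and a proper benchmark would need warm-up runs, optimisation barriers, and the real Stan Math functions.

```cpp
#include <chrono>
#include <cmath>
#include <vector>

// Hypothetical helper: call `f` `reps` times and return the mean wall-clock
// time per call in nanoseconds.
template <typename F>
double mean_nanoseconds(F&& f, int reps) {
  const auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < reps; ++i) {
    f();
  }
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::nano>(stop - start).count() / reps;
}

// Baseline: exp applied in a plain loop.
double sum_exp_direct(const std::vector<double>& x) {
  double s = 0.0;
  for (double v : x) {
    s += std::exp(v);
  }
  return s;
}

// Same computation, but the per-element operation is passed in as a functor,
// mimicking an extra lambda layer.
template <typename F>
double sum_apply(const std::vector<double>& x, F&& f) {
  double s = 0.0;
  for (double v : x) {
    s += f(v);
  }
  return s;
}
```

Timing `sum_exp_direct(x)` against `sum_apply(x, [](double v) { return std::exp(v); })` with mean_nanoseconds, and writing each result to a volatile variable so the compiler cannot discard the work, would give a first indication of whether the lambda layer is measurable next to the cost of exp itself.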