Thanks for the link to the actual proposal.
This is a bit different from what I’ve called “vectorization”, which is itself different from what the CPU/compiler calls vectorization. In Stan, “vectorization” is used ambiguously for two things:

applying a scalar function elementwise; this can apply to any old function, like generalizing real sqrt(real) to vector sqrt(vector), or generalizing vector softmax(vector) to vector[] softmax(vector[]).

allowing scalar functions to be generalized to containers in reductions; this is something like normal_lpdf(real | real, real) being extended to normal_lpdf(reals | reals, reals), where any of the arguments can be real, real[], vector, or row_vector.
The original distinction among std::vector, Eigen::VectorXd, and Eigen::RowVectorXd was intentional, not the result of laziness. I wanted to avoid the confusion you get in R about the type of a linear algebra result and to allow minimal return types for functions like real multiply(row_vector, vector). I wanted arrays to be used for holding elements in a container and linear algebra types to be used whenever there was a proper multivariate function (as in softmax or log_sum_exp, which do not operate elementwise). That got relaxed, as I seemed to be the only person happy with that distinction.
There have even been calls to eliminate real[] from the language. I like the generality of saying our basic types are real, int, vector, row_vector, and matrix, and that we can form arbitrary arrays of them. It’s very simple to explain and document, though apparently very confusing to a lot of Stan users coming from less strictly typed languages like R or Python.
We are never going to be able to get to a point where vector, row_vector, and real[] are interchangeable, nor do I think we should aim for that.
This approach assumes that the combination of Eigen::Map plus Eigen functions is faster than using standard library functions. Why would Eigen::Matrix<T, R, C> containers be faster than std::vector? They’re built with the same RAII pattern, though initialization is a bit different. I don’t know whether compiler writers optimize the standard library beyond what’s possible for Eigen as an external library. Both Eigen::Matrix and std::vector guarantee memory locality for their values. Is there an example somewhere? Rather than using Eigen’s templated operators, I would think we’d want to get Stan’s version in place, which uses custom gradients and can remove a bunch of virtual function calls by using a single callback chain() method for all entries.
Eigen::Matrix<T, -1, -1> is different from std::vector<std::vector<T>> in that the matrix uses a single array of contiguous memory and stores it in column-major order. That means accessing a row of a vector<vector<T>> is much, much faster because the row is already its own contiguous object and no copying is required; traversing column-major data will be faster with matrices.
The bigger question I have is what the return type of something like softmax() should be. It sounds like the proposal uses a uniform return type, which in Stan works out to these three signatures:
vector softmax(vector);
vector softmax(row_vector);
vector softmax(real[]);
If we do extend these functions, wouldn’t it be more natural to have each return its own type?
For head, we absolutely need to keep the output type matched to the input type, so that would have to be:
vector head(vector, int);
row_vector head(row_vector, int);
real[] head(real[], int);
int[] head(int[], int);
Given the proposed definition of log_softmax, it will still only apply to Eigen types, as they’re the only ones that support an appropriate .array() method. The document says that Eigen::Map will be used around std::vector, but I don’t see where that happens in the code examples.
Adding all these lambdas isn’t free: their constructors and destructors get called and assignments get made. So I would suggest profiling the resulting functions, though their cost will probably be dominated by exp and log anyway.