Stan-math: Eigen support

I am interested in using stan-math for automatic differentiation (and maybe even the optimizer, if I can figure out how to use it).

I have some existing code written with Eigen::Tensors and would like to differentiate it automatically. I can change the types from Eigen::Tensor<double, n> to Eigen::Tensor<T, n> and it will work.
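For concreteness, here's the kind of scalar-templating I mean, sketched without Eigen::Tensor so it stands alone (`scaled_sum` is just a made-up example function, not library code):

```cpp
#include <vector>

// A function templated on its scalar type. With T = double it runs as
// before; with T = an AD scalar such as stan::math::var (not used here),
// the same code records the operations needed for reverse-mode autodiff.
template <typename T>
T scaled_sum(const std::vector<T>& x, const T& scale) {
  T total(0);
  for (const auto& xi : x)
    total += scale * xi;
  return total;
}
```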

I gather from the stan-math arXiv paper that this is a memory- and time-inefficient way of doing things. It creates many more autodiff nodes than necessary for operations like summation, dot products, etc.

So stan-math provides specialised functions to deal with these operations. Unfortunately, these are geared towards Eigen::Array and Eigen::Matrix, not Eigen::Tensor. It also means rewriting existing Eigen code as stan-math code, so it’s less automatic.
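To make the node-count point concrete, here is a toy tape (just a sketch for counting, nothing like stan-math's real implementation): summing n terms through binary `+` records n-1 nodes, while a specialised sum records a single node.

```cpp
#include <cstddef>
#include <vector>

// Toy reverse-mode tape, just enough to count nodes. Each node stores the
// ids of its parents; values and adjoints are omitted for brevity.
struct Tape {
  std::vector<std::vector<std::size_t>> parents;  // node id -> parent ids

  std::size_t leaf() {
    parents.push_back({});
    return parents.size() - 1;
  }

  // Naive binary +: one node per addition, so summing n terms costs n-1 nodes.
  std::size_t add(std::size_t a, std::size_t b) {
    parents.push_back({a, b});
    return parents.size() - 1;
  }

  // Specialised sum: one node holding all n parents.
  std::size_t sum(const std::vector<std::size_t>& xs) {
    parents.push_back(xs);
    return parents.size() - 1;
  }
};
```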

I understand that the design of stan-math is driven by the modelling language (which I know nothing about).
But is the stan-dev team interested in introducing tensor building blocks and overloading Eigen’s functions/operators on that level?

We’re happy to take contributions! Hopefully you were offering to help.

With Eigen::Tensor, it’ll be a little tricky because it’s still experimental. Given that it’s not fully supported by Eigen, it adds a maintenance burden on Stan Math every time the interface changes.

If you’re willing to contribute, we’ll happily guide you through the process and help you get started.

Hi syclik,

Thank you for the warm welcome! I’m flattered that you’re suggesting that I could contribute. Unfortunately, I’m still at the stage where I close my eyes and pray my code compiles every time I use templates.

My question is really whether the current developers have any intrinsic interest in pursuing such an enhancement to the library. “No, we don’t” is a perfectly acceptable answer. The one I was expecting is that it’s an extremely hard problem due to Eigen’s views, lazy evaluation etc.

What do you think?

Ah… I still have those days!

The answer’s just pragmatic: we’re short on time. We really welcome contributions to any part of the Stan infrastructure.

This particular request is difficult for the current team to tackle because the Eigen::Tensor module is still unsupported. It’s pretty tricky to keep on top of Eigen with the slew of compilers our users deal with (we end up pushing on compiler bugs). Adding an unsupported, moving target to the mix makes it much more difficult without additional help.

Just curious, are there easy wrappers in Eigen that translate between Tensors and Arrays / Matrices?

Forgetting about tensors for a moment, stan-math doesn’t have overloading of functions/operators on the Eigen::Array/Eigen::Matrix level. So if I have a function

double dot_product(const Eigen::VectorXd& a, const Eigen::VectorXd& b) {
    return a.transpose() * b;
}

There’s no stan::math::Vectorvar class to overload this function with. It would be a class that implements transpose, * etc. Does this make sense?
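A sketch of the kind of overload I'm imagining (both `var` and `Vectorvar` here are hypothetical, not stan-math types, and the adjoint bookkeeping for the reverse sweep is omitted):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical minimal AD scalar: a value and an adjoint.
struct var {
  double val_;
  double adjoint_;
};

// Hypothetical vector-level type a dot_product overload could dispatch on.
struct Vectorvar {
  std::vector<var> elems_;
};

// Vector-level overload: computes the value in one pass. A real
// implementation would also push a single dot-product node onto the tape
// so the reverse sweep costs one node instead of ~2n (omitted here).
var dot_product(const Vectorvar& a, const Vectorvar& b) {
  double v = 0;
  for (std::size_t i = 0; i < a.elems_.size(); ++i)
    v += a.elems_[i].val_ * b.elems_[i].val_;
  return var{v, 0};
}
```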

As for Eigen::Tensor, I know it’s officially speaking “unsupported”, but I’m under the impression that Tensorflow is built on top of it (please correct me if I’m wrong), so the interface can’t be that volatile. The wiki page is admittedly terrible, but there is some decent documentation available.

Eigen provides a Map class that lets you view data from a potentially different container as an Eigen::Matrix/Array. (It can also be used to “reshape” data that’s already inside an Eigen object.) As a special case you get Tensor -> Array/Matrix conversion.

There’s also a TensorMap class that does the equivalent for tensors. So that gives you the other direction.
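As a plain-C++ sketch of what Map does (Eigen is left out here so the snippet stands alone): a non-owning view that reinterprets an existing flat buffer as a 2-D matrix, without copying.

```cpp
#include <cstddef>

// Plain-C++ analogue of Eigen::Map (Eigen itself isn't used in this sketch):
// a non-owning view over a flat buffer. TensorMap plays the same role in the
// other direction, viewing flat data as a tensor.
struct MatrixView {
  double* data;
  std::size_t rows, cols;

  // Column-major indexing, matching Eigen's default storage order.
  double& operator()(std::size_t i, std::size_t j) {
    return data[j * rows + i];
  }
};
```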

That’s right. And there’s a good reason for it. Eigen’s not implemented such that there’s a base template expression that allows us to write something general for both Eigen::Matrix and Eigen::Array. We only found that out by implementing a few functions and finding the answers were wrong. When we posted on the Eigen list, we learned this was a design decision.

Last I checked, the interface was shifting. And shifting often. The Eigen Tensor module’s development is driven by TensorFlow’s developers. I’m sure they have a handle on it because they’re contributing directly to Eigen and their use case is TensorFlow.

The math library implements functions that operate on Eigen::Matrix<var, -1, -1> types for matrices. There’s not a special matrix-var type if that’s what you’re asking.

What did you want to use tensors for?

Here’s the diff for the documentation (as a proxy for measuring changes in the interface):

Hi Bob,

Thanks for chiming in!

My question is precisely about the stan dev team’s interest in creating a matrix-var type. Such a thing exists for numpy:

Tensors are a bit of a red herring in this discussion, but to answer your question, I like the n-dimensional storage and broadcasting (n=4 in my case).

Are you imagining something like the following?

struct matrix_var {
  Eigen::MatrixXd val_;
  Eigen::MatrixXd adjoint_;
};

That makes it easy to operate on the adjoints as units, but introduces the problem of how to return a reference to an element of the matrix.

var& matrix_var::operator()(int i, int j);

Our var type is basically just

struct var {
  double val_;
  double adjoint_;
};

The only way I could see getting around this problem is having the var store references into the matrix type.
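A sketch of that reference-holding idea (hypothetical, not stan-math code): the matrix owns flat value and adjoint buffers, and `operator()(i, j)` hands back a lightweight view pointing into both.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical matrix-var that owns its storage as units. Instead of
// returning a var& (impossible, since no var object exists per element),
// operator()(i, j) returns a view holding pointers into both buffers.
struct matrix_var {
  std::size_t rows, cols;
  std::vector<double> val_, adjoint_;  // one flat buffer each

  matrix_var(std::size_t r, std::size_t c)
      : rows(r), cols(c), val_(r * c, 0), adjoint_(r * c, 0) {}

  struct var_view {
    double* val_;
    double* adjoint_;
  };

  // Column-major indexing, like Eigen's default.
  var_view operator()(std::size_t i, std::size_t j) {
    std::size_t k = j * rows + i;
    return {&val_[k], &adjoint_[k]};
  }
};
```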

I’m curious what you have in mind for implementation and usage.

Yes, we’re also interested in getting tensor operations into Stan. Using std::vector<Eigen::Matrix> is rather clunky.

Hi Bob,

Yes, that’s what I had in mind. Thank you for explaining where the difficulty/impossibility lies.

Not sure it’s impossible, but nobody’s figured out how to do it. One option is to have some matrices act as units and not allow elementwise operations. Looks like that’s the route the Autograd package in Python took. We considered that at one point for covariance matrices, but it makes it hard to do things like build up a covariance matrix for a GP elementwise.
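A sketch of that "matrices as units" option (hypothetical; not what Autograd or stan-math actually do): the type exposes only whole-matrix operations and deliberately has no elementwise `operator()`, which is exactly what makes building a GP covariance matrix entry by entry awkward.

```cpp
#include <cstddef>
#include <vector>

// A matrix treated as a single autodiff unit: adjoints stay matrix-shaped,
// and there is intentionally no operator()(i, j) for elementwise access.
struct unit_matrix {
  std::size_t rows, cols;
  std::vector<double> val_, adjoint_;

  unit_matrix(std::size_t r, std::size_t c)
      : rows(r), cols(c), val_(r * c, 0), adjoint_(r * c, 0) {}
};

// Whole-matrix addition: conceptually one tape node; adjoints would pass
// straight through to both operands on the reverse sweep (not shown).
unit_matrix add(const unit_matrix& a, const unit_matrix& b) {
  unit_matrix out(a.rows, a.cols);
  for (std::size_t k = 0; k < out.val_.size(); ++k)
    out.val_[k] = a.val_[k] + b.val_[k];
  return out;
}
```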

The current Stan organization should work with Eigen tensors. It just won’t be as efficient as working with matrices where we have specialized derivatives that save space and/or time.