Static/immutable matrices w. comprehensions

I was mulling over a discussion we had a while back at StanCon Helsinki about checkpointing the derivatives for covariance matrices constructed for Gaussian processes while reading the section of Dougal Maclaurin’s Ph.D. thesis on Autograd that explained that they went with static matrices. The upside to static matrices is that you can keep the memory local for a matrix and for its adjoints (another matrix). The downside is that you can’t do assignment, only read.

real a = b[1, 2];  // reading is OK
b[1, 2] = a;       // assigning is illegal

Stan’s parameter and data matrices are already static in this sense, as are our transformed data and transformed parameter matrices once the transforms happen. But Stan allows general assignment to matrix variables in blocks in which they are declared. We can’t support that pattern with static matrices.

What we could do with static matrices is what we were talking about in Finland—allow something like Python’s comprehensions. I’m thinking something like the followin (please don’t get hung up on the clunky syntax):

static_matrix[M, N] a = matrix_comprehend(f, M, N);

where the comprehension is something like this:

matrix matrix_comprehend(real(int, int) f, int M, int N) {
  matrix[M, N] y;
  for (n in 1:N)
    for (m in 1:M)
      y[m, n] = f(m, n);
  return y;
}

We’d probably want something like a closure for f so that it could do things like pick up a data matrix in forming a GP covariance matrix.

Once we have static matrices, any operation involving matrices can produce more static matrices. And we could allow assignment of a static matrix to a regular matrix and vice-versa for convenience (if not efficiency).

4 Likes

I like this as a language feature, wonder if we could set something up in the compiler to look over a program’s matrices and if there is never assignment declare it a static_matrix

3 Likes

Yes, the compiler wizards should be able to do that with dataflow analysis. But we know the critical ones of parameters and data and transformed data can be static once defined. It’s only local variables and transformed parameters that are problematic, but that’s just where the GP covariance matrices are defined. It might be worth promoting the transformed parameters to static matrices but it’s not clear they get used multiple times to make that worthwhile.