Current status of matrix-normal and array-normal

Hi everyone -

Following some chats with stan devs and others at StanCon, I wanted to start some discussion of kronecker-structured covariances or array/matrix normal distributions. This is something I’ve wanted for work I’m doing, and I’ve seen some interest in the users’ list as well, e.g. https://groups.google.com/d/msg/stan-users/t9FksfRBdBo/Cp_fxLpyAAAJ, https://groups.google.com/d/msg/stan-users/iddShgJVwTQ/YmsPkJu52WQJ.

What is the current status? Here is what I have dug up:

  1. It seems like there is code for the precision-parameterized matrix normal in https://github.com/stan-dev/math/blob/develop/stan/math/prim/mat/prob/matrix_normal_prec_log.hpp. There is an issue https://github.com/stan-dev/math/issues/86 that suggests the current botteleneck to exposing this into stan is having reverse-mode tests, but as far as I can tell some additional tests have been added since the last update on that issue that I think are reverse-mode tests (test/unit/math/rev/mat/prob/matrix_normal2_text.cpp). But I don’t understand stan’s autodiff or test structure well enough to be sure.

  2. There is an issue for multiplying a kronecker-separable matrix by a vector (K1 %x% K2) %*% y (https://github.com/stan-dev/math/issues/393). From what I can tell based on https://github.com/stan-dev/math/issues/60, the intent of this is to be able to construct kronecker quadratic forms for matrix-normal densities t(Y) (K1 %x% K2) %*% Y. In general for only a 2-factor matrix normal I don’t think this is needed: I think you can rewrite the trace term in the likelihood as K1 %*% t(X) %*% K2 %*% X, and the determinant terms can be split out as well. This is needed for more than two covariances, though.

  3. For higher-order separable covariance, there’s the additional question / design decision (mentioned in issue #60) of the data structure to hold what is basically a list of covariance matrices.

I don’t understand auto-diff but can try to make headway on #1. I have code for doing #2 that I will try to port to C++ and contribute, unless someone’s working on it already. #3 I think requires broader consensus. Please let me know what I’m missing. Cheers,

Mike.

If you don’t understand autodiff, check out our arXiv paper,
which starts from scratch and goes through how Stan implements
reverse mode:

https://arxiv.org/abs/1509.07164

It’s the key to writing efficient derivatives in Stan.
We still need to write the forward mode version, but we don’t
use forward mode anywhere yet because it’s still not thoroughly
tested.

I think the biggest bottleneck for (1) is finding someone to
shepherd the issue through. Every function needs doc and autodiff
testing. I don’t know what the state of this one is, because I
don’t even know what a matrix normal is. Also, no idea about (2).
But for (3), you can just have an array of covariance matrices
as long as the covariance matrices are all the same shape. We
don’t have list types yet, but I hope to get to that this year.

  • Bob