[This is a continuation of the discussion on GitHub in a PR. It made sense to start discussing it on a forum instead of a closed PR. Start of the discussion: stan-dev/math#2096]
@bbbales2 and I were talking about symmetric matrices in the Stan Math library.
There is an absolute tolerance in the check_symmetric() function. We don’t use == equality to check for symmetry (which wouldn’t make sense for IEEE floating point numbers anyway). @bbbales2 suggested we use a relative tolerance and I wholeheartedly agree. Note: we don’t have a symmetric matrix type. This is just an Eigen matrix that has the property of being close enough to symmetric.
I mentioned that because numerically it’s not truly symmetric, we may end up with operations that result in non-symmetric output even if the linear algebra dictates that it should be symmetric. And this may fail if the output of one function is being used as an input to another function. I had some ideas on how to deal with this and wanted to just write some of them down. (These are just ideas… no prototypes or anything associated with it yet.)
- Create a symmetric matrix type in Stan. We could either use a different C++ type or use Eigen’s triangular views. This would be a bit of work.
- Rewrite functions to use Eigen’s triangular views after checking for symmetry. I’m not sure how this will interact with the autodiff stack… (do we need to copy stan::math::var to the off-diagonal right before returning?)
cc: @bbbales2, @rok_cesnovar, @tadej, @bgoodri, @stevebronder… please tag whoever else should see this.
I’ve never found the symmetric functions in Stan very helpful, possibly due to tolerance issues. I think it would be neat to somehow allow a toggle for them, such that in one mode the symmetric functions check for symmetry, and in another they force it (perhaps with a check and a warning).
I think both of these suggestions would need to be used together. To simplify the implementation we can use Eigen::SelfAdjointView instead of triangular views, which for a real matrix means a symmetric view. These internally use a triangular part of the matrix anyway. To enforce inputs being symmetric we would also need a symmetric matrix type in the Stan language.
Also tagging @andrjohns.
Hmm, well the problem that kicked this off was a return matrix not being symmetric which was just a bug.
The checks themselves were already throwing too many errors, so that makes me reluctant to put more constraints/checks in place, because that means more errors.
I think there’s been discussion before of adding properties to matrices (like symmetric). I’m not sure though. I think there’d need to be a big computational motivator to do something like this.
I think with stanc3, we might be able to do things like this easier than with the C++ transpiler.
I think there are some obvious gains in computation (checking a symmetric matrix when you know it’s symmetric should be a no-op), but more importantly, there could be a class of Stan programs that are easier to write and handle. If we cut out the numerical issues of non-symmetry, then we’d help in cases like @Charles_Driver has seen (I’ve seen that too). And I’m sure there is some linear algebra that can be simplified if it’s a known symmetric matrix.
Hmm, maybe. I’m still a bit nervous about it cause the slam dunk isn’t obvious to me.
Some code I was working on yesterday (https://github.com/stan-dev/math/blob/0469d6f93207a0200d694cd904cdda5608d79b70/stan/math/rev/fun/tcrossprod.hpp#L36) has this line:
arena_M.adj() += (adj + adj.transpose()) * arena_M_val;
that could be:
arena_M.adj() += 2 * adj * arena_M_val;
if the output was actually created as a symmetric matrix (if A(i, j) was actually A(j, i)). So that’s something, but my guess is that it’ll be harder to get symmetric matrices into the math library than to just ignore that performance hit in tcrossprod.
I interpret the stuff Charles brought up as tolerance problems, which can hopefully get fixed by changing how check_symmetric works, so I think that’s easier to fix.
(your whole response was great. thanks!)
I hear you. This is the way I feel about optimization of most things. I think it really starts making sense if it starts manifesting as fewer unconstrained parameters in Stan programs, fewer numerical issues that lead to divergences or rejections, or less code that’s easier to maintain.