StanCon Cambridge developer meeting opportunities

My scheduled arrival is 11:25 on Wednesday… so that would be tight for me… sorry. 12:00 should work.

Sebastian ended up in a traffic jam, so we’re going to try moving the parallel/KINSOL etc. discussion to 17:30 tonight at the bar

@avehtari I’m here: Start or schedule a Google Meet video meeting - Computer - Google Meet Help

Posting the slides for sparse matrices! We are waiting for Aki to finish right now and will be on once he is done

https://htmlpreview.github.io/?https://gist.githubusercontent.com/SteveBronder/9818d7e0f45b8f51f78bb035328676ab/raw/4d90c97e5765b9a062cd03bef3964e00847d9e06/sparse_stuff.html#/

Thanks! Good slides!

Let me just note that one possibility is to hide all sparse calculations behind specific functions (e.g. sparse_quadratic_form, sparse_determinant, etc.) so that the user doesn’t have to deal with exposed, varying sparsity patterns. That would drastically simplify the problem.

Otherwise I think we need to introduce an explicit eval function that casts a lazy sparse representation into a dense representation, zeroes and all, which would allow element-wise access and the like. Without that, I don’t think we’d be able to maintain something that’s sustainable across the varying circumstances of a sparse program.
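To make the two options concrete, here is a rough sketch at the C++/Eigen level (the backend Stan Math builds on); the Stan-level names like sparse_quadratic_form above are hypothetical, and this is only an illustration of the idea, not a proposed implementation.

#include <Eigen/Dense>
#include <Eigen/Sparse>
#include <iostream>
#include <vector>

int main() {
  // Build a small sparse matrix from (row, col, value) triplets.
  std::vector<Eigen::Triplet<double>> trips{{0, 0, 2.0}, {1, 1, 3.0}, {2, 0, 1.0}};
  Eigen::SparseMatrix<double> S(3, 3);
  S.setFromTriplets(trips.begin(), trips.end());

  Eigen::VectorXd x(3);
  x << 1.0, 2.0, 3.0;

  // Option 1: expose only named operations (a sparse_quadratic_form analogue),
  // so the sparsity pattern never leaks to the caller.
  double quad_form = x.dot(S * x);

  // Option 2: an explicit eval-to-dense cast, zeroes and all, after which
  // element-wise access is well defined.
  Eigen::MatrixXd D = Eigen::MatrixXd(S);

  std::cout << quad_form << " " << D(0, 1) << "\n";  // D(0, 1) is an explicit zero
  return 0;
}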


I’d be more tempted to go the other way and not allow element-wise access outside of the transformed data block. Basically the savings you get from sparse matrices come from very specific operations, so you use them differently from dense matrices.

Most of the use cases that would hope to get the real speedup from a sparse matrix don’t vary elements directly, but rather work through combinations of

  • scalar multiplication
  • sums
  • multiplication of sparse (rectangular) matrices

It’s one of those cases where we limit things to “geometric” operations, have proper (compound) declarations that assign a fixed sparsity structure, and we’re fine.
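Those operations are exactly the ones that stay inside the sparse representation. As a rough illustration (again Eigen at the C++ level, not Stan syntax), all three can be done without ever touching individual elements:

#include <Eigen/Sparse>
#include <iostream>
#include <vector>

int main() {
  using Trip = Eigen::Triplet<double>;
  std::vector<Trip> ta{{0, 0, 1.0}, {1, 2, 4.0}};
  std::vector<Trip> tb{{0, 0, 2.0}, {2, 1, 5.0}};

  Eigen::SparseMatrix<double> A(3, 3), B(3, 3);
  A.setFromTriplets(ta.begin(), ta.end());
  B.setFromTriplets(tb.begin(), tb.end());

  // All three results stay in sparse storage; no element-wise access is needed.
  Eigen::SparseMatrix<double> scaled = 2.0 * A;  // scalar multiplication
  Eigen::SparseMatrix<double> summed = A + B;    // sum (pattern is the union)
  Eigen::SparseMatrix<double> prod = A * B;      // sparse-sparse product

  std::cout << scaled.nonZeros() << " " << summed.nonZeros() << " "
            << prod.nonZeros() << "\n";
  return 0;
}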


I have no problem with that; the Stan3 compiler should be able to handle that kind of context-specific usage, right @seantalts?

Do the operations form an algebra within each sparsity pattern? In other words, will one be able to construct sparse types with a fixed sparsity pattern and then have the model verify consistency, just as the linear algebra dimensions are verified?

There are different ways of implementing this stuff - it sounds like we’d want a whole new type in the Stan language to do what you’re saying. The original drafts I looked at had sparse matrices being equivalent to matrices, just that some functions could work with them more efficiently. I haven’t looked at backend considerations too deeply yet, but if it were purely a performance consideration that we could handle on the backend, it would feel more elegant / consistent / true to a model’s statistical meaning to say that a matrix’s sparsity status doesn’t impact the Stan code you write. Just a thought.

I don’t think this is possible. You need to be very careful with sparsity for it to have any benefit at all.

Once the sparsity pattern is fixed, then yes, under addition and scalar multiplication. (We might want to have C = A + B make sense when the sparsity patterns of A and B are subsets of the sparsity pattern of C. In that case it’s an even more useful thing.)


But how is that sparsity pattern calculated? The problem is being able to write something like

sparse_matrix[pattern1] A;
sparse_matrix[pattern2] B;
sparse_matrix[pattern?] C = A + B;

The user would need to know the pattern of the sum a priori or be able to do something like sum_pattern(pattern1, pattern2). In other words, there needs to be a consistent type system for both the sparse matrices and the sparsity patterns to get this kind of behavior.
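For illustration only (sum_pattern is hypothetical, and this is plain C++ rather than anything Stan-level): if a sparsity pattern is treated as a set of (row, col) indices, the pattern of A + B is simply the union of the two patterns, which is what such a helper would have to compute.

#include <set>
#include <utility>

// Hypothetical sketch: represent a sparsity pattern as a set of (row, col) indices.
using SparsityPattern = std::set<std::pair<int, int>>;

// The pattern of A + B: every index that is structurally non-zero in A or in B.
// e.g. sum_pattern({{0, 0}, {1, 2}}, {{0, 0}, {2, 1}}) has three entries.
SparsityPattern sum_pattern(const SparsityPattern& a, const SparsityPattern& b) {
  SparsityPattern out = a;
  out.insert(b.begin(), b.end());
  return out;
}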


Meaning you can’t do certain things or else it will be slow?

This is an issue. We talked about it in the meeting and it’s covered a bit in the comments on the design doc, but basically for the compound assignment-type things we would need a routine that computes the sparsity of the result. This is common for sparse matrix operations and is a necessary thing to sort out if we want sparse matrices to work.
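For what it’s worth, separating the structural (symbolic) computation from the numeric work is standard practice in sparse libraries. A small Eigen example, offered only as an analogy rather than as the routine being discussed: the solver works out the sparsity of the factorization once in analyzePattern, then factorize reuses that fixed structure.

#include <Eigen/Dense>
#include <Eigen/Sparse>
#include <Eigen/SparseCholesky>
#include <iostream>
#include <vector>

int main() {
  // A small symmetric positive-definite sparse matrix.
  std::vector<Eigen::Triplet<double>> trips{{0, 0, 4.0}, {1, 1, 3.0}, {2, 2, 2.0},
                                            {0, 1, 1.0}, {1, 0, 1.0}};
  Eigen::SparseMatrix<double> A(3, 3);
  A.setFromTriplets(trips.begin(), trips.end());

  Eigen::SimplicialLLT<Eigen::SparseMatrix<double>> solver;
  solver.analyzePattern(A);  // symbolic step: computes the sparsity of the factor
  solver.factorize(A);       // numeric step: fills in values, reusing that structure

  Eigen::VectorXd b = Eigen::VectorXd::Ones(3);
  Eigen::VectorXd x = solver.solve(b);
  std::cout << x.transpose() << "\n";
  return 0;
}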

Well some things will change the number of vars used to represent the matrix (which is bad), and some of them would be catastrophically slow. (Sparse matrix operations do not revert to dense matrix operations, but rather to the slowest possible expression of the dense matrix operation.)

It’s best to think of sparse matrices not as a matrix that we can do some fast things to, but rather as a completely different type that shares some similarities with dense matrices but can’t be used interchangeably with them. I don’t think we can abstract this away from the user. (Or if we can, it would require everyone having a lot more experience with using sparse matrices to do complex things. Right now that experience is thin on the ground. But there might be a “version 2” of the sparse matrix spec that hides some of these details; I just don’t think we can build that yet.)
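To illustrate the first point about the number of vars (again an Eigen-level sketch, not Stan): writing to a position outside the fixed pattern inserts a new stored value, so the size of the underlying representation changes.

#include <Eigen/Sparse>
#include <iostream>
#include <vector>

int main() {
  std::vector<Eigen::Triplet<double>> trips{{0, 0, 1.0}, {2, 2, 5.0}};
  Eigen::SparseMatrix<double> S(3, 3);
  S.setFromTriplets(trips.begin(), trips.end());

  std::cout << S.nonZeros() << "\n";  // 2 stored values

  // Assigning outside the existing pattern inserts a new stored value, which
  // changes the size of the representation (in Stan terms, the number of vars).
  S.coeffRef(1, 1) = 3.0;
  std::cout << S.nonZeros() << "\n";  // now 3 stored values
  return 0;
}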