Thanks for the reference; it's very useful, especially for the operations they support and as a guideline for how we can write our documentation.
I'm relieved to see they use pretty much the same design that's being proposed here, up to the differences between Stan and Python.
There’s also some more extensive TensorFlow ragged package doc here:
https://www.tensorflow.org/api_docs/python/tf/ragged
Basic Typing
Because of our static typing and fine-grained linear algebra types, we have to deal with distinctions between arrays of reals, vectors, row vectors, and matrices in type declarations and in constructors. We also have to deal with sized and unsized declarations.
They have a similar restriction to homogeneity of indexing depth (what they call the rank of the tensor).
What they call shapes corresponds to Stan’s size declarations. For instance, the shape of a two-dimensional ragged array is the sequence of sizes of its one-dimensional arrays. TensorFlow is more flexible in letting you do everything by row or column in specifying shapes.
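To make the correspondence concrete, here is a pure-Python sketch (not TensorFlow code) of what the "shape" of a ragged two-dimensional array records: the outer size plus the sequence of per-row sizes, which is what Stan's size declarations would have to capture.

```python
# A ragged 2D structure as nested Python lists.
digits = [[3, 1, 4, 1], [], [5, 9, 2], [6], []]

# The "shape" is the outer size plus the size of each 1D row.
outer_size = len(digits)                  # number of rows
row_sizes = [len(row) for row in digits]  # sizes of the 1D arrays

print(outer_size)  # 5
print(row_sizes)   # [4, 0, 3, 1, 0]
```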
Like Stan, they distinguish ragged structures from sparse structures. @stevebronder is working on a sparse matrix/vector design.
Constructor Expressions
These look very similar in the proposal for Stan and in TensorFlow.
Maps
> digits = tf.ragged.constant([[3, 1, 4, 1], [], [5, 9, 2], [6], []])
> print(tf.add(digits, 3))
<tf.RaggedTensor [[6, 4, 7, 4], [], [8, 12, 5], [9], []]>
We don't have arithmetic operations over our arrays, but we'll be able to map like this once we have unsized local variables and anonymous functions (those specs are coming as soon as I transfer them from paper to computer).
vector[] y
  = map({real x}. x + 3,
        {[3, 1, 4, 1]', []', [5, 9, 2]', [6]', []'})
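The elementwise semantics of `tf.add(digits, 3)` over a ragged tensor can be sketched in plain Python (no TensorFlow needed); `ragged_map` is a hypothetical helper, not part of any existing API:

```python
# Hypothetical helper: map a scalar function elementwise over a ragged
# 2D structure, mirroring what tf.add(digits, 3) does.
def ragged_map(f, rows):
    return [[f(x) for x in row] for row in rows]

digits = [[3, 1, 4, 1], [], [5, 9, 2], [6], []]
print(ragged_map(lambda x: x + 3, digits))
# [[6, 4, 7, 4], [], [8, 12, 5], [9], []]
```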
Reduction
> print(tf.reduce_mean(digits, axis=1))
tf.Tensor([2.25 nan 5.33333333 6. nan], shape=(5,), dtype=float64)
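The per-row reduction semantics are easy to sketch in plain Python; `ragged_row_means` is a hypothetical helper that mirrors `tf.reduce_mean(digits, axis=1)`, including the `nan` results for empty rows:

```python
import math

# Hypothetical helper mirroring tf.reduce_mean(digits, axis=1):
# the mean of each row, with nan for empty rows.
def ragged_row_means(rows):
    return [sum(row) / len(row) if row else math.nan for row in rows]

digits = [[3, 1, 4, 1], [], [5, 9, 2], [6], []]
print(ragged_row_means(digits))
# means: 2.25, nan, 5.333..., 6.0, nan
```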
Appending
> print(tf.concat([digits, [[5, 3]]], axis=0))
<tf.RaggedTensor [[3, 1, 4, 1], [], [5, 9, 2], [6], [], [5, 3]]>
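Concatenation along axis 0 just appends new rows; in a pure-Python sketch of the same semantics it's ordinary list concatenation:

```python
# Sketch of tf.concat([digits, [[5, 3]]], axis=0) on nested lists:
# appending rows along the outer dimension.
digits = [[3, 1, 4, 1], [], [5, 9, 2], [6], []]
appended = digits + [[5, 3]]
print(appended)
# [[3, 1, 4, 1], [], [5, 9, 2], [6], [], [5, 3]]
```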
Slicing
> print(digits[:, :2]) # First two values in each row.
But I don't like this negative-indexing style:
print(digits[:, -2:]) # Last two values in each row.
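Both slices apply per row, and shorter rows just yield whatever they have; a pure-Python sketch of the semantics:

```python
# Sketch of ragged slicing semantics on nested lists.
digits = [[3, 1, 4, 1], [], [5, 9, 2], [6], []]

# digits[:, :2] -- up to the first two values in each row.
print([row[:2] for row in digits])
# [[3, 1], [], [5, 9], [6], []]

# digits[:, -2:] -- up to the last two values in each row.
print([row[-2:] for row in digits])
# [[4, 1], [], [9, 2], [6], []]
```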
Distribution functions, etc.
Things that take ordinary tensors appear to be generalized to take ragged tensors. We hope to do exactly the same with Stan. Much of this will be easy because our use of std::vector under the hood means the algorithms don't need to change; the checks for rectangular sizes are done outside of the function calls.
Conversion functions
They have some useful conversion functions to and from the kinds of long form we use now.
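For illustration, here's a pure-Python sketch of converting between a ragged structure and the flat long form Stan programs use today (a flat value vector plus per-row lengths); `to_long_form` and `from_long_form` are hypothetical names, not TensorFlow's API:

```python
# Hypothetical conversions between ragged nested lists and long form:
# a flat list of values plus the length of each row.
def to_long_form(rows):
    values = [x for row in rows for x in row]
    lengths = [len(row) for row in rows]
    return values, lengths

def from_long_form(values, lengths):
    rows, pos = [], 0
    for n in lengths:
        rows.append(values[pos:pos + n])
        pos += n
    return rows

digits = [[3, 1, 4, 1], [], [5, 9, 2], [6], []]
values, lengths = to_long_form(digits)
print(values, lengths)
# [3, 1, 4, 1, 5, 9, 2, 6] [4, 0, 3, 1, 0]
assert from_long_form(values, lengths) == digits  # round-trips
```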