Ragged arrays bug in the language

We’ll be adding ragged arrays to Stan this year, so we should be able to remove the hacks for ragged arrays and replace them with proper solutions.

At that point, I’m going to argue very strongly that we should fix the existing bugs that don’t respect sizes.

For example, this code

print("m = ", to_matrix({ rep_array(1.0, 100), { 0.0 }}));

will try to read 99 values from raw memory:

Chain 1: m = [[1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1],
 [0,4.34241e-311,1.49167e-154,1.49167e-154,1.99766e-314,6.94786e-310,6.94786e-310,6.94786e-310,6.62048e-322,0,4.94066e-323,2.12899e-314,4.94066e-324,2.12899e-314,1.4822e-323,2.12899e-314,1.97626e-323,2.12899e-314,2.12899e-314,9.88131e-324,2.12899e-314,2.47033e-323,2.12899e-314,2.12899e-314,3.45846e-323,2.129e-314,2.47033e-323,2.12899e-314,4.44659e-323,2.12899e-314,2.12899e-314,5.43472e-323,2.12901e-314,5.92879e-323,3.45846e-323,2.129e-314,2.47033e-323,2.12899e-314,2.12899e-314,2.47033e-323,2.12899e-314,3.45846e-323,2.12899e-314,2.12899e-314,6.91692e-323,2.12899e-314,7.90505e-323,7.41098e-323,2.96439e-322,2.12899e-314,8.39912e-323,2.12899e-314,9.38725e-323,2.12899e-314,9.88131e-323,2.12899e-314,8.89318e-323,2.129e-314,1.03754e-322,2.12899e-314,1.13635e-322,2.12899e-314,2.12899e-314,1.23516e-322,2.12901e-314,1.28457e-322,8.89318e-323,2.129e-314,1.03754e-322,2.12899e-314,2.12899e-314,1.92686e-322,2.12899e-314,2.12899e-314,2.12899e-314,1.03754e-322,2.12898e-314,0,1.99766e-314,6.94786e-310,6.94786e-310,6.94786e-310,1.38338e-322,0,6.94786e-310,6.94786e-310,6.94786e-310,6.94786e-310,6.94786e-310,6.94786e-310,6.94786e-310,6.94786e-310,6.94786e-310,6.94786e-310,6.94786e-310,6.94786e-310,6.94786e-310,6.94786e-310,6.94786e-310,6.94786e-310]]

This’ll likely lead to segfaults on other platforms. If you reverse the order of the rows in the array, you get a 1 x 1 output from to_matrix.

The runtime assignment is also missing checks for reassigning. So you can do this in a current Stan program:

  int a[2, 2] = { { 1, 2 }, { 3, 4 } };
  a[1] = {1, 3, 4};  // a is no longer 2 x 2!
  print("a = ", a);

which will print (from RStan):

Chain 1: a = [[1,3,4],[3,4]]

The intention in the language was for the size of a variable not to change once it’s created. So this is technically a bug w.r.t. the intention of only supporting rectangular data structures.

Our functions are not designed to work with this kind of raggedness, as they assume all inputs are rectangular and don’t raise exceptions if they’re not. I’m guessing some of them will just segfault on other platforms.

Our sampler’s also not prepared to deal with this kind of resizing. Stan currently allows this buggy block to compile:

transformed parameters {
  real a[2, 2] = { { 1.0, 2 }, { 3.0, 4 } };
  a[1] = {1.0, 2, 3 };
  print("a = ", a);
}

How many elements in a? 4 according to our MCMC output, but 5 according to the print statement.

I don’t believe we should allow this kind of inconsistnecy in the language.

1 Like

I would say all of those are bugs that should be prohibited in Stan 3 or sooner. The question from this morning was whether

functions {
  real foo(vector[] x) {
    real s = 0;
    for (n in 1:size(x))
      s += sum(x[n]);
    return s;
  }
}

is considered a bug if it is called like

data {
  int<lower = 0> N_y;
  int<lower = 0> N_z;
  vector[N_y] y;
  vector[N_z] z;
}
transformed data {
  print("sum = ", foo( {y, z} ));
}

even if N_y differs from N_z. One reason to consider this behavior to be not a bug is that it returns the only plausible answer. However, if foo utilized some function like num_elements that presumes arrays of vectors are not ragged, then that function would be buggy unless we changed the implementation of num_elements. I think we do not know how many functions in the Stan language are assuming all arrays are non-ragged.

I’d still consider that a bug. We should be raising an exception when the user tries to create {y, z} if N_y != N_z.

I’m going to work hard on developing ragged arrays to support this kind of behavior properly going forward. I was thinking of making it a different data structure, though, not just generalizing rectangular arrays.

1 Like

It is hard to say it is a bug if foo yields what everyone would say is the expected behavior. I would say that num_elements and any similar functions that assume arrays are non-ragged should be fixed because someone could be calling those Stan Math functions directly in C++ with ragged arrays and not realize they are getting the wrong answers.

It was unexpected to me because the intention was that all arrays were rectangular. That’s why I called it a bug rather than expected behavior.

With ragged array types, we’ll be able to preserve constant sizing.

Good point. I don’t see any harm in generalizing the math lib functions.

If the problem is

We can’t pass ragged arrrays created on the fly to user-defined functions because some Stan Math funcions are expecting arrays to be non-ragged.

then I think those Stan Math functions should be generalized. And if they are generalized, what is the harm in allowing ragged arrays that are created on the fly to be passed to user-defined functions? Throwing an exception just becomes an antifeature at that point.

Stan arrays are declared to be rectangular. Maintaining invariants on data objects like rectangularity makes it much easier to deal with the objects downstream. Functions are just one example (e.g., size can be generalized, but to_matrix will have to throw). I/O is another.

The ragged array spec lets us have size-invariant types like we have now that will be usable in any context. Many of the functions will work now or will be easy to overload with ragged types.

The question is what to do now when we have a situation where functions will fail or give strange behavior with rectangular types.

Do any of the other people working on the language like @seantalts or @matthijs or @mitzimorris or @rybern have an opinion? Or other users?

The runtime assignment bug is that stan::model::assign() – unlike stan::math::assign() – does not check array sizes, right? Is there a reason these namesakes behave different? I don’t see why the math library needs its own assign function.
By the way, Stanc3 never uses stan::math::assign so all assignments can resize.

Here’s all the Stan Math functions that assume input arrays are rectangular (that I could find)

append_array
dims
divide_columns
gp_dot_prod_cov
gp_exponential_cov
gp_exp_quad_cov
gp_matern32_cov
gp_periodic_cov
log_mix
map_rect
num_elements
operands_and_partials
to_array_1d
to_matrix

Some of these only make sense for rectangular arrays. Others are easy to generalize.

Next, let’s look at the generated code.

transformed parameters {
  real x[1000];
  vector[100] y;
  x = rep_array(10,1000);
  y = rep_vector(-1,100);
}

Stanc2 generates (without comments or statement numbers)

validate_non_negative_index("x", "1000", 1000);
std::vector<local_scalar_t__> x(1000, local_scalar_t__(0));
stan::math::initialize(x, DUMMY_VAR__);
stan::math::fill(x, DUMMY_VAR__);
validate_non_negative_index("y", "100", 100);
Eigen::Matrix<local_scalar_t__, Eigen::Dynamic, 1> y(100);
stan::math::initialize(y, DUMMY_VAR__);
stan::math::fill(y, DUMMY_VAR__);
stan::math::assign(x, rep_array(10, 1000));
stan::math::assign(y, rep_vector(-(1), 100));

As far as I can tell, initialize and fill do the same thing?

Stanc3 forgets to initialize arrays.

validate_non_negative_index("x", "1000", 1000);
std::vector<local_scalar_t__> x;
x = std::vector<local_scalar_t__>(1000, 0);
validate_non_negative_index("y", "100", 100);
Eigen::Matrix<local_scalar_t__, -1, 1> y;
y = Eigen::Matrix<local_scalar_t__, -1, 1>(100);
for (size_t sym1__ = 1; sym1__ <= 100; ++sym1__) {
  assign(y, cons_list(index_uni(sym1__), nil_index_list()),
    std::numeric_limits<double>::quiet_NaN(), "assigning variable y");
}
assign(x, nil_index_list(), rep_array(10, 1000), "assigning variable x");
assign(y, nil_index_list(), rep_vector(-1, 100), "assigning variable y");

Does that assigment in the loop allocate new (useless) varis on the autodiff stack?

No, and I’m not sure why there are two of them. I’d prefer there just be one of them in the math lib. And I’d really really like to allow general assignment of int[] to real[] and the same for function arg callls.

Yes. I think they realized when building stanc3 that ther were redundant initializations in the old compiler.

There are a lot more. For example, to_matrix makes that assumption. Lots of them don’t test rectangularity, but just depend on it.

Yes, if y is of type var. There should be a single value generated that is reused, and it shouldn’t be on the autodiff stack.

I transcribed your post into a stanc3 issue.

Hey @Bob_Carpenter Is there any update on ragged arrays?