Strange errors with to_vector

I’m getting a strange error trying to extract a series of values from a matrix as a vector.

The model works fine when using a loop (where M = 5524):

  vector[M] yst_tmp;
  ...
    for(m in 1:M) {
      yst_tmp[m] = ystar[ii[m],jj[m]];
      thr_tmp[m] = thresholds[gg[m],jj[m]];
    }

  y ~ ordered_logistic(yst_tmp, thr_tmp);

However when using to_vector:

  vector[M] yst_tmp;
  ...
  yst_tmp = to_vector(ystar[ii,jj]);

  for(m in 1:M) {
    thr_tmp[m] = thresholds[gg[m],jj[m]];
  }

  y ~ ordered_logistic(yst_tmp, thr_tmp);

The model compiles, but errors when running with:

Exception: ordered_logistic: Integers has dimension = 5524, expecting dimension = 30514576;

This implies that the vector yst_tmp is much larger than its declared size M, which shouldn’t be possible.

Any ideas for how to debug this further? I’ve attached the two models and test data to this post.

F1_Base_Loop.stan (2.3 KB) F1_Base_to_vector.stan (2.4 KB)
test_data.R (84.1 KB)

2 Likes

Does this happen with stanc2 and stanc3? This sounds like we should fix it ASAP.

More illuminating error with stanc2:

Unrecoverable error evaluating the log probability at the initial value.
Exception: assign: Rows of left-hand-side (5524) and rows of right-hand-side (30514576) must match in size  (in 'examples/Test/F1_Base_to_vector.stan' at line 63)

I had a look through the generated c++, and (I think) the vector is being declared and initialised correctly:

stanc3

Eigen::Matrix<local_scalar_t__, -1, 1> yst_tmp;
yst_tmp = Eigen::Matrix<local_scalar_t__, -1, 1>(M);

for (size_t sym1__ = 1; sym1__ <= M; ++sym1__) {
  current_statement__ = 42;
  assign(yst_tmp, cons_list(index_uni(sym1__), nil_index_list()),
             std::numeric_limits<double>::quiet_NaN(),
             "assigning variable yst_tmp");}

Then the assign:

assign(yst_tmp, nil_index_list(),
          to_vector(
            rvalue(ystar,
              cons_list(index_multi(ii),
                cons_list(index_multi(jj), nil_index_list())), "ystar")),
          "assigning variable yst_tmp");

stanc2

Eigen::Matrix<local_scalar_t__, Eigen::Dynamic, 1> yst_tmp(M);
stan::math::initialize(yst_tmp, DUMMY_VAR__);
stan::math::fill(yst_tmp, DUMMY_VAR__);

Then the assign:

stan::math::assign(yst_tmp, to_vector(stan::model::rvalue(ystar, stan::model::cons_list(stan::model::index_multi(ii), stan::model::cons_list(stan::model::index_multi(jj), stan::model::nil_index_list())), "ystar")));

So it looks like a combination of two errors here:

  • to_vector returning a vector of the wrong length
  • stanc3 allowing the rvalue vector (of wrong length) to resize the lvalue vector

Are you sure its to_vector?

This is the original code:

assign(yst_tmp, nil_index_list(),
        to_vector(
        rvalue(ystar,
            cons_list(index_multi(ii),
            cons_list(index_multi(jj), nil_index_list())), "ystar")),
        "assigning variable yst_tmp");

Turned around a bit:

auto ystar1 = rvalue(ystar,
        cons_list(index_multi(ii),
        cons_list(index_multi(jj), nil_index_list())), "ystar");
assign(yst_tmp, nil_index_list(),
    to_vector(
    ystar1),
    "assigning variable yst_tmp");
std::cout << ystar.size() << std::endl;
std::cout << ystar1.size() << std::endl;
std::cout << yst_tmp.size() << std::endl;

Outputs
5530
30514576
30514576

So seems more like an rvalue issue?

I’ve always been scared of expressions like ystar[ii,jj] where both ii and jj are multi-indices because I could never be sure what it really does. Based solely on the fact that

5524\times 5524 = 30514576

I’d guess what you need is

yst_tmp = diagonal(ystar[ii,jj]);

(That is of course very inefficient.)

The reference manual suggests this is the intended behavior.

Ah, if it’s intended behaviour then the only issue is stanc3 not throwing an exception for the dimension mismatch in assign.

Here’s a small example that throws an exception under stanc2 but not stanc3:

data { 
  real y_mean;
} 
transformed data {
  int ii[3] = {1,1,2};
  int jj[3] = {1,2,2};
  matrix[2,2] test_mat_data = [[1,2],[3,4]];
  vector[3] out_vec_data = to_vector(test_mat_data[ii,jj]);
}
parameters {
  real y;
} 
model {
  y ~ normal(y_mean,1);
}

1 Like

I believe this is the same as stanc3 issue #523.

stanc was known to be overly lax prior to stanc3 compared to my original intention to have sizes not change.

The simple way to think about what it should do is this:

y[ii, jj][m, n] = y[ii[m], jj[n]]

As a standalone expression, the size of y[ii, jj] is size(ii) * size(jj). And that does get evaluated. It’d be really nice if we could instead use an expression template for y[ii, jj].

In general, it works like this for K <= N,

y[ii1, ii2, ..., iiN][m1, ..., mK] = y[ii1[m1], ..., iiK[mK], iiK+1, ..., iiN]

This applies to any form of multi-index.

P.S. It’d also be super useful to have a function defined as

foo(y, ii, jj)[k] = y(ii[k], jj[k])

so that we can do interactions. But I don’t know what to call foo.