CRAN Problem: LMMELSM hitting clang-UBSAN errors

Hey all,

I have a couple of R packages using RStan on CRAN. One of these is LMMELSM (Latent multivariate mixed effects location scale models).

I received an alert from CRAN last month that it is hitting a clang-UBSAN error. I cannot figure out what is triggering this issue.

I use rstantools, I have updated the rstantools configuration in the latest submission, and I do not see any errors in my tests. Their UBSAN seems to trigger on the lmmelsmPred.stan file.

GitHub for LMMELSM: stephensrmmartin/LMMELSM (R package for fitting latent multivariate mixed effects location scale models)
CRAN UBSAN Error page: https://www.stats.ox.ac.uk/pub/bdr/memtests/clang-UBSAN/LMMELSM/tests/testthat.Rout

The main lines seem to be:

LMMELSM.Rcheck/tests/testthat.Rout:/data/gannet/ripley/R/test-clang/RcppEigen/include/Eigen/src/Core/Block.h:347:25: runtime error: applying non-zero offset 400 to null pointer
LMMELSM.Rcheck/tests/testthat.Rout:/data/gannet/ripley/R/test-clang/RcppEigen/include/Eigen/src/Core/Block.h:347:25: runtime error: applying non-zero offset 8 to null pointer
LMMELSM.Rcheck/tests/testthat.Rout:/data/gannet/ripley/R/test-clang/RcppEigen/include/Eigen/src/Core/Block.h:374:24: runtime error: applying non-zero offset 8 to null pointer
LMMELSM.Rcheck/tests/testthat.Rout:/data/gannet/ripley/R/test-clang/RcppEigen/include/Eigen/src/Core/Block.h:374:81: runtime error: subtraction of unsigned offset from 0x000000000008 overflowed to 0x000000000008
LMMELSM.Rcheck/tests/testthat.Rout:/data/gannet/ripley/R/test-clang/RcppEigen/include/Eigen/src/Core/Block.h:374:24: runtime error: applying non-zero offset 8 to null pointer
LMMELSM.Rcheck/tests/testthat.Rout:/data/gannet/ripley/R/test-clang/RcppEigen/include/Eigen/src/Core/Block.h:374:81: runtime error: subtraction of unsigned offset from 0x000000000008 overflowed to 0x000000000008
Please fix and resubmit. 

Does anyone have any clues or hints about what can cause these types of errors? I don’t believe I do anything particularly exotic in the stan files. The test data contains no NAs, and that same data is used elsewhere in lmmelsm test runs that throw no errors.

Moreover, the actual tests run fine - I can run all tests successfully on my machine, my osx vm, my windows vm, on rhub, etc. The only apparent problem is with UBSAN flagging these issues.

I’m able to reproduce the error on an M1 Mac running r-devel with the following flags:

CXX14FLAGS += -fsanitize=undefined
LDFLAGS += -fsanitize=undefined

Now that I’ve got the errors, I’ll keep digging and let you know what I find.

The warning/error is coming from Eigen::Block when Stan is indexing a matrix row. I’m not sure exactly why this is happening, but you can avoid it by updating the l1_to_l2 and mat_to_mat_array functions to copy via looping rather than vectorised indexing. The performance difference should be negligible once it’s compiled to C++:

  // Select rows of l1 by index, copying element by element.
  matrix l1_to_l2(matrix l1, int[] indices) {
    int K = size(indices);
    int n_col = cols(l1);
    matrix[K, n_col] l2;
    for (k in 1:K) {
      for (c in 1:n_col) {
        l2[k, c] = l1[indices[k], c];
      }
    }
    return(l2);
  }
  // Unpack each row of mat into an R x C matrix, filling column by column.
  matrix[] mat_to_mat_array(int R, int C, matrix mat) {
    int K = rows(mat);
    matrix[R, C] out[K];

    for (k in 1:K) {
      int c = 1; // Current column of out[k]
      int r = 1; // Current row of out[k]
      for (m in 1:cols(mat)) {
        out[k, r, c] = mat[k, m];
        if (r == R) {
          // Column is full: move to the top of the next column
          r = 1;
          c += 1;
        } else {
          r += 1;
        }
      }
    }

    return(out);
  }

I’d also recommend double-checking that my changes above correctly recover the intended matrices/dimensions, since I threw them together quickly.
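
One way to do that double-check without recompiling the model is a small pure-Python mirror of the two loops. This is only a sketch: it assumes each row of mat holds R*C values that should be unpacked column by column, which is what the loop does, but the original vectorised versions are not shown in this thread, so the intended layout is an assumption. The helper reference_unflatten is a hypothetical name introduced purely for the comparison.

```python
# Pure-Python mirror of the loop-based Stan functions (0-based lists here,
# 1-based indices in Stan). Not the package's code, just a logic check.

def l1_to_l2(l1, indices):
    """Gather rows of l1 by 1-based indices, one element at a time."""
    n_col = len(l1[0]) if l1 else 0
    l2 = [[0.0] * n_col for _ in indices]
    for k, idx in enumerate(indices):
        for c in range(n_col):
            l2[k][c] = l1[idx - 1][c]  # shift Stan's 1-based index to 0-based
    return l2


def mat_to_mat_array(R, C, mat):
    """Unpack each row of mat (length R*C) into an R x C matrix, column by column."""
    out = []
    for row in mat:
        block = [[0.0] * C for _ in range(R)]
        r, c = 0, 0
        for value in row:
            block[r][c] = value
            if r == R - 1:
                # Column is full: move to the top of the next column
                r = 0
                c += 1
            else:
                r += 1
        out.append(block)
    return out


def reference_unflatten(R, C, row):
    """Closed-form column-major reference: element (r, c) is row[c * R + r]."""
    return [[row[c * R + r] for c in range(C)] for r in range(R)]


if __name__ == "__main__":
    mat = [[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12]]
    # The loop version should agree with the closed-form reference.
    assert mat_to_mat_array(2, 3, mat) == [reference_unflatten(2, 3, row) for row in mat]
    assert l1_to_l2([[1, 2], [3, 4], [5, 6]], [3, 1, 3]) == [[5, 6], [1, 2], [5, 6]]
```

If the package actually expects a row-major layout, the reference function would need its index swapped to row[r * C + c], and the Stan loop would need the matching change.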

Hope that helps!

Cool!

Thanks so much for debugging this. I always suspected that fancy indexing with matrices was the culprit (purely out of intuition), but was never able to actually debug it as you have done here. Great.

Do you think this would be solved with rstan 2.26?

Same error with 2.26 unfortunately! I’ll also check the experimental branch to see whether it was resolved by 2.30.

No dice with experimental. I think it’s something Eigen-specific, so it could be tricky to chase down.

Does it go away with Eigen 3.4? If not, then it sounds as if we need to think about patching Eigen? Argh…

Hey @andrjohns

Sorry for not getting back to this thread until now. Thank you for digging into this! I’ll try making these patches this weekend. Really - I can’t express how thankful I am for this; I was at my wits’ end trying to figure out wtf was triggering this, and I have no experience with the UBSAN stuff.