October 2.25 release?

I believe the plan is that the next version of rstan after 2.21 will be 2.24 and that will have stanc3.

Oh sorry I’m talking about code like

  vector[N] mu = Intercept + r_1_1[J_1] .* Z_1_1;

That in the C++ gets translated to

assign(
  mu, nil_index_list(),
  add(Intercept,
      elt_multiply(
          rvalue(r_1_1, cons_list(index_multi(J_1), nil_index_list()),
                 "r_1_1"),
          Z_1_1)),
  "assigning variable mu");

One of the more expensive parts of that is the line

 rvalue(r_1_1, cons_list(index_multi(J_1), nil_index_list()), "r_1_1")

Since J_1 is an array of integers, that’s essentially a random-access copy operation. If we are lucky, that index is like

{1, 2, 3, 4, 5, 6}

but unless it’s the data’s primary index it will probably be something like

{1, 2, 12, 14, 43, 44, ...}
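
To make the cost concrete, here is a minimal sketch of what a multi-index read amounts to. This is not the Stan Math implementation of rvalue; gather_one_based is a hypothetical name, and the only assumption carried over is Stan's 1-based indexing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not Stan Math's rvalue(): gathers x[idx[i]] into a
// freshly allocated vector, using Stan's 1-based indexing convention.
std::vector<double> gather_one_based(const std::vector<double>& x,
                                     const std::vector<int>& idx) {
  std::vector<double> out;
  out.reserve(idx.size());
  for (int i : idx) {
    // Each read may land anywhere in x, so access is not sequential.
    assert(i >= 1 && static_cast<std::size_t>(i) <= x.size());
    out.push_back(x[static_cast<std::size_t>(i) - 1]);
  }
  return out;
}
```

The allocation plus the scattered reads are what gets repeated every time the expression is evaluated.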

This is one place sparse matrices would be v nice, since then we could just have Z_1_1 be a
sparse_matrix[N, K]
and we would just do an efficient sparse matrix product
vector[N] mu = Z_1_1 * r_1_1

Oh okay the thing to keep in mind is they are very similar. I’m not sure there is a way to reorder these matrices outside of the unique groupings for the intercept we talked about previously.

So this:

Z_1_1 .* r_1_1[idxs]

is a sparse matrix multiply where we know each row of the sparse matrix has exactly one element.

So maybe the variables look like this:

idxs = { 0, 1, 1, 1, 2 }
Z_1_1 = { 1.0, 2.2, 5.5, 7.7, 8.8 }
r_1_1 = { 3.1, 2.0, 1.7 }

r_1_1 is just a vector, but the sparse matrix representation for idxs/Z_1_1 is (using the notation from Wikipedia):

V         = { 1.0, 2.2, 5.5, 7.7, 8.8 }
COL_INDEX = [ 0, 1, 1, 1, 2 ]
ROW_INDEX = [ 0, 1, 2, 3, 4, 5 ]

So the sparse version of this isn’t giving us much other than an extra array of integers (ROW_INDEX).

I hope I got that right but if I didn’t here is the dense form of the matrix:

[ 1.0, 0,   0,
  0  , 2.2, 0,
  0  , 5.5, 0,
  0  , 7.7, 0,
  0  , 0  , 8.8]
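
For anyone who wants to double check the arrays above, here is a small sketch that builds the three CSR arrays from idxs and Z_1_1 and expands them back to a row-major dense matrix. The Csr struct and both function names are made up for illustration, and indices are 0-based as in the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container using the V / COL_INDEX / ROW_INDEX naming.
struct Csr {
  std::vector<double> v;
  std::vector<int> col_index;
  std::vector<int> row_index;  // length is rows + 1
};

// Row i holds exactly one nonzero, z[i], in column idxs[i].
Csr one_per_row_to_csr(const std::vector<int>& idxs,
                       const std::vector<double>& z) {
  assert(idxs.size() == z.size());
  Csr out;
  out.v = z;
  out.col_index = idxs;
  out.row_index.resize(idxs.size() + 1);
  // One entry per row, so row i starts at offset i.
  for (std::size_t i = 0; i <= idxs.size(); ++i) {
    out.row_index[i] = static_cast<int>(i);
  }
  return out;
}

// Expands a CSR matrix with `cols` columns to row-major dense storage.
std::vector<double> to_dense(const Csr& a, std::size_t cols) {
  assert(!a.row_index.empty());
  const std::size_t rows = a.row_index.size() - 1;
  std::vector<double> dense(rows * cols, 0.0);
  for (std::size_t i = 0; i < rows; ++i) {
    for (int k = a.row_index[i]; k < a.row_index[i + 1]; ++k) {
      const std::size_t kk = static_cast<std::size_t>(k);
      dense[i * cols + static_cast<std::size_t>(a.col_index[kk])] = a.v[kk];
    }
  }
  return dense;
}
```

With the example values, row_index comes out as 0 through 5 and to_dense(a, 3) reproduces the 5 x 3 matrix written out above.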

If you target brms… then maybe also consider the recent changes due to reduce_sum which hopefully play well with the things you outline. Basically this means that working on slices of the data and parameters is also working fast.

Yes, but for the data, building the sparse matrix is a cost we pay only once, whereas for

Z_1_1 .* r_1_1[idxs]

We make that temporary r_1_1[idxs] vector in every iteration of the model (and have to pay the random access copy cost). If we had sparse data and did

data {
  int N;
  int M;
  int K;
  int n_nz[K];
  int m_nz[K];
  // hypothetical syntax: size N x M with nonzeros at (n_nz, m_nz)
  sparse_matrix[N, M, n_nz, m_nz] Z_1_1;
}
//....
mu += Z_1_1 * r_1_1;

That’s just a sparse x dense multiply, which should be more efficient (and Eigen also has an OpenMP backend for sparse-dense multiplies that we could use).
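
As a rough sketch of why no temporary is needed, here is a hand-written CSR times dense-vector kernel that accumulates straight into mu. It uses plain std::vector rather than Eigen to stay dependency-free, add_csr_times_vector is a hypothetical name, and indices are assumed 0-based.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// mu += A * r, where A is given in CSR form (v, col_index, row_index).
// r is read in place through col_index, so no r[idxs] copy is built.
void add_csr_times_vector(const std::vector<double>& v,
                          const std::vector<int>& col_index,
                          const std::vector<int>& row_index,
                          const std::vector<double>& r,
                          std::vector<double>& mu) {
  assert(row_index.size() == mu.size() + 1);
  assert(v.size() == col_index.size());
  for (std::size_t i = 0; i < mu.size(); ++i) {
    for (int k = row_index[i]; k < row_index[i + 1]; ++k) {
      const std::size_t kk = static_cast<std::size_t>(k);
      mu[i] += v[kk] * r[static_cast<std::size_t>(col_index[kk])];
    }
  }
}
```

The index arrays are data, so they are built once, and only v, r and mu are touched per evaluation. Whether this actually beats the gather in practice would need to be benchmarked.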

do we have a specific release date, code freeze, RC etc?

for my part, I’d like to do as much as possible towards documenting how to troubleshoot the install/upgrade process by adding stuff to the online CmdStan docs that can be linked to in the release notes, etc. I opened an issue in August - feel free to keep contributing suggestions: https://github.com/stan-dev/docs/issues/268

The plan is to tag the RC today.

Yes, everything as usual and according to the dates mentioned above.

There was not much action in Stan/cmdstan this cycle (6 non-CI PRs altogether) so we only have a Math release issue: https://github.com/stan-dev/math/issues/2128 and a stanc3 issue

If there are no objections, @serban-nicusor can start making RCs tomorrow (CET).

Hey, I’ve done the RCs for math, stan and cmdstan (including stanc3 nightly binaries).
You can find them here: math, stan, cmdstan.

I did not include the release notes on GitHub, so we don’t have them again on the release next week.

Thanks!

@rok_cesnovar … will you announce on the forum so that people can test?

I did run my usual test model (mixture logistic regression) on macOS. Compared to 2.24.1 I am getting a ~12% slowdown with 2.25.0rc1 which reduces to less than 1% if I turn on STAN_COMPILER_OPTIMS=true.

Here is Stan model and data:

blrm.stan (28.4 KB) combo3.data.R (3.9 KB) test.R (1.4 KB)

It’s still odd to require the optimisations for good performance given this model, but ok, whatever.

Steve volunteered to do the RC forum post this time. @stevebronder, are you still up for that? Else I can do it if you are busy. Let me know. It would be great to get that out today. Tuesday will be here soon.

Yeah, we can still rethink turning this on by default. The freeze period is meant just for that. We have a week just for testing.

We kind of decided this in a hurry, based on make issues that turned out to be something small (thanks for debugging that, Sebastian).

Could stanc3 receive an RC tag as well? httpstan needs a specific version of stanc3 to download.

Also, it doesn’t seem like cmdstan’s version is particularly well defined since it doesn’t download a specific version of stanc3. It downloads whatever stanc3 is tagged “nightly”, I think. That’s a moving target. (See https://github.com/stan-dev/cmdstan/issues/923 for some background.)

Cc @serban-nicusor

No need to make new cmdstan tarballs I think.

This is not completely correct.

Release cmdstan tarballs come with the stanc3 binaries and thus do not download any stanc3 binary on build. A release will use its bundled stanc3 no matter how many times the user builds or cleans.

The nightly binary is downloaded only if you use a clone of cmdstan. This is fine, as develop is always a moving target and should thus always use the latest version of stanc3, same as it always uses the latest Math.

The only other way is if you go and manually delete the stanc3 binaries in the release (make clean-all is not enough, you have to really know which files to remove manually). This is not something one would do normally or at all.

This issue fixed a different problem. If you went back in git commits to for example 2.23, you had no way of downloading the 2.23 stanc3 (besides manually downloading it). You can now with @syclik’s fix.

Hey, here you can find stanc3 v2.25.0-rc1. As soon as the Jenkins build finishes, you will also find all the binaries attached on the release page.

I didn’t know this was the case. Is there a different tar.gz for each platform (i.e., with platform-specific stanc3)? Or will the different stanc3s all come in the same tarball?

For now they all come in the same tarball. It’s only a few MBs wasted.

We might have to consider splitting this once macOS on ARM becomes popular. That will require separate binaries for macOS ARM and macOS x86.

We technically should already do that for Linux ARM and Windows ARM systems, though those are not even remotely as widespread as the new MacBooks will become next year (so probably not worth it).

Got it. So the way one learns which version of stanc3 is associated with
a specific release of cmdstan is to execute the binary and discover the
version? The version number is not contained anywhere in the release
tarball nor in the repository tree associated with the tagged commit?

To get the stanc3 version you need to do

make build
./bin/stanc --version

The version is also in the tarball name.

It’s still odd to require the optimisations for good performance given this model, but ok, whatever.

### 2.25 without optims
real 186.59
user 185.90
sys 0.55

### 2.25 with optims
real 174.98
user 174.32
sys 0.50

### 2.24.1
real 171.72
user 171.04
sys 0.56

Edit: Nvm about the native thing I didn’t recompile the model after running the one with optimizations
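
To put the three timings on a common scale, here is a tiny helper for the relative slowdown against 2.24.1. It assumes the "real" (wall-clock) seconds are the figure of interest and that each number is from a single run; percent_slowdown is a hypothetical name.

```cpp
#include <cassert>

// Percent change of `seconds` relative to `baseline_seconds`.
double percent_slowdown(double seconds, double baseline_seconds) {
  assert(baseline_seconds > 0.0);
  return 100.0 * (seconds - baseline_seconds) / baseline_seconds;
}
```

With the numbers above, percent_slowdown(186.59, 171.72) is about 8.7 and percent_slowdown(174.98, 171.72) is about 1.9, so on this machine the optimisations recover most, but not all, of the gap.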