October 2.25 release?

Hi!

According to my counting, we should be approaching a 2.25 release at the end of this month, October. Are we up for that? There still seems to be a ton of activity on the generalized expression stuff, which would be nice to have in 2.25 in a form where we actually use it (assuming we are not far off from completion). Another thing that crosses my mind as good to get in is the PR which turns on the lto compiler option if possible… and I am sure there is more to quickly align on.

Sebastian

EDIT: And not to forget the super cool var<matrix> stuff, which should really come to life if possible

Yep, according to our regular schedule, the freeze is on for the 12th of October and the release on the 19th.

I think there are under 10 distributions to generalize and it should be close to finishing then. Then we need to enable returning expressions. @tadej any ETA on all of that? I guess you have all of it ready, just needs reviewing?

Yes, https://github.com/stan-dev/cmdstan/pull/926 needs to be merged before the release, otherwise we have a regression on our hands.

That PR is probably the only thing that needs to be merged in any case. The rest would be really nice to have, but not sure how much we can get in before the release.

Yeah there are 7 distributions left. However I ran out of stuff I already completed, so I still need to implement these. There are also some uncompleted functions and some functions that are supposed to work, but are not tested yet. After all the functions are working I still need to make functions return expressions. All this will be completed soon, but I think not in time for the release. I don’t think delaying the release waiting for this makes sense. However we do get speedups for many of the distribution functions that are already merged in the release.

That is far from ready. Not for this release and probably not for the next one either.

@tadej Thanks for the details!

Do you mean in the sense where all functions would use var<matrix>? I imagine we could use it for some before that, provided that we implement the stanc3 side of things? Or is even that not feasible for the January release? This one is probably a stretch.

In that case we still need to decide how we are using it in a partial form and what changes that needs from the compiler. That might be possible for the January release.

Sounds like a plan.

So, the 2.25 release brings some speedups with the distributions, a bunch of bugfixes I presume, and lots of changes under the hood - or is there a bigger user-facing thing? That’s totally fine, don’t get me wrong.

Mostly speedups (hopefully no slowdowns - not everything was benchmarked). There were not many bugs caught during generalizations. There are no user-facing changes.

Unrelated, but one user-facing change is that GLMs can now also run on GPUs if derivatives with respect to x or y are required. @rok_cesnovar Is the compiler side of this completed?

The biggest user-facing non-bugfix is in my opinion the vectorized binary functions by @andrjohns: https://github.com/stan-dev/stanc3/issues/643
I think the users will really like that one. That was also a ton of work by Andrew.

Other than that and some OpenCL related stuff it was mostly under the hood and bugfix stuff.

Yes and merged.

Will also try to add support for combining all the var_value<matrix_cl> supported functions (multiply and sum + cholesky if it gets merged) with GLMs, which will remove the intermediate transfers.

Soon!

Well there are changes and a lot of them will be in this release.

@stevebronder what’s the status bar on everything? What do you think we can finish by the 12th?

Soon!

We have a bunch of PRs moving for functions that take in var<matrix>. This PR for subsetting them needs to go in so I can open up a PR for this branch in the stan repo (which should also speed up regular slicing stuff which is nice). We have to get those subsetting PRs in before we can do stuff up in the compiler.

If we have any immediate and obvious functions that go faster with the new reverse_pass_callback() stuff then it would be a good idea to just have a PR with those so we can make nice gains with them. I’d like to also pot-shot some stuff BRMS does in loops right now that we could write in matrix form with reverse_pass_callback() to be faster. Like I have elt_multiply() and add() PRs** and I think the speedups there will overshadow the cost of the multi index copy we need to do for the hierarchical portion of the model.

** I just saw the reviews for the add PR ty!
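
For readers unfamiliar with the idea behind reverse_pass_callback(), here is a toy sketch in Python (not Stan Math's C++ implementation). It only illustrates the pattern discussed above: an elementwise product over whole vectors registers a single callback for the reverse pass instead of one autodiff node per element. The names `Tape`, `VarVector` and `vsum` are hypothetical and exist only for this illustration; only the callback name is borrowed from the thread.

```python
class Tape:
    """Toy autodiff tape: a list of callbacks run in reverse order."""

    def __init__(self):
        self.callbacks = []

    def reverse_pass_callback(self, fn):
        # Register a function to be called during the reverse pass.
        self.callbacks.append(fn)

    def grad(self):
        # Run callbacks from the last registered to the first.
        for fn in reversed(self.callbacks):
            fn()


class VarVector:
    """A vector of values with a matching vector of adjoints."""

    def __init__(self, values):
        self.val = list(values)
        self.adj = [0.0] * len(self.val)


def elt_multiply(tape, a, b):
    """Elementwise product with ONE callback for the whole vector."""
    out = VarVector(x * y for x, y in zip(a.val, b.val))

    def backward():
        for i, g in enumerate(out.adj):
            a.adj[i] += g * b.val[i]  # d(a*b)/da = b
            b.adj[i] += g * a.val[i]  # d(a*b)/db = a

    tape.reverse_pass_callback(backward)
    return out


def vsum(tape, a):
    """Sum of a vector; the adjoint is broadcast back to every element."""
    out = VarVector([sum(a.val)])

    def backward():
        for i in range(len(a.adj)):
            a.adj[i] += out.adj[0]

    tape.reverse_pass_callback(backward)
    return out


tape = Tape()
a = VarVector([1.0, 2.0, 3.0])
b = VarVector([4.0, 5.0, 6.0])
total = vsum(tape, elt_multiply(tape, a, b))
total.adj[0] = 1.0  # seed the output adjoint
tape.grad()
```

After `tape.grad()`, `a.adj` holds the values of `b` and `b.adj` holds the values of `a`, which is the gradient of the sum of elementwise products. The whole computation registered two callbacks, regardless of vector length.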

I’m meandering with the flto stuff. It’s easy peasy to get the speedups with gcc, but clang is very finicky. The one good thing is that R will have much better support for lto next release so by the time this version of stan math hits rstan that should be simple.

Once the cholesky PR is passing I’m happy to review it!

The compiler setup will be finicky for partial but I think things are going slower only because we are still kicking the tires in figuring out how to make the patterns both look nice and be performant.

If we really really went all hands on deck I think it would be possible to get the expression stuff in this release, but imo I’d rather wait till next release so we are not in a mad frenzy near the end. It’s perfectly reasonable to have a release with some nice slice of life features + performance speedups.

You’re talking about how in:

vector[N] mu = alpha[group_idx];

We need to save a copy of group_idx for the reverse pass?

Or when you say multi-index here are you referencing something specifically with multiple indices like alpha[group_idx1, group_idx2]? Or is it something else?
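
To make the first reading of the question concrete, here is a minimal Python sketch (an illustration only, not how Stan Math implements indexing) of why a copy of the index array is needed: the forward pass gathers `alpha` entries, and the reverse pass must scatter-add the adjoints of `mu` back to the same positions. The function names are hypothetical, and indices are 0-based here whereas Stan's are 1-based.

```python
def multi_index_forward(alpha, group_idx):
    """Forward pass: mu[n] = alpha[group_idx[n]]."""
    saved_idx = list(group_idx)  # copy kept alive for the reverse pass
    mu = [alpha[i] for i in saved_idx]
    return mu, saved_idx


def multi_index_reverse(mu_adj, saved_idx, num_groups):
    """Reverse pass: add each mu adjoint onto the alpha entry it came from."""
    alpha_adj = [0.0] * num_groups
    for n, i in enumerate(saved_idx):
        alpha_adj[i] += mu_adj[n]
    return alpha_adj


alpha = [0.5, -1.0, 2.0]
group_idx = [0, 2, 2, 1, 0]  # five observations spread over three groups

mu, saved = multi_index_forward(alpha, group_idx)
alpha_adj = multi_index_reverse([1.0] * len(mu), saved, len(alpha))
```

With a unit adjoint on every element of `mu`, each entry of `alpha_adj` ends up equal to the number of times that group appears in `group_idx`.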

I agree. All the various products seem to be running a lot faster even when it’s just mat<var> s.

I’d like to get all the reverse_pass_callback implementations in to have that done. I’m pretty enthusiastic we can work out the kinks without too much difficulty.

2.25 will also very very likely be the last version that will have the STANC2 fallback option (the last remaining open issues will be closed this week).

Does anyone oppose us listing that in the release notes?

What are the decision criteria here?

Ideally we drop stanc2 as a fallback option once rstan is up to date with the current stan and successfully running all those models in the CRAN world. Once that is the case we can be really sure that stanc3 is fully backwards compatible. From what I see that could indeed be close (standalone functions and all that fun stuff).

But maybe we should pull the trigger anyway, so I am fine with listing that in the release notes…probably worth running by the next Stan meeting?

rstan compatibility is getting really close. Standalone functions are ready.

I think we only need to figure out if there is a nicer way of handling includes (for nicer numbering of lines on errors). There are no other issues to my knowledge. If there are, they should be posted on the stanc3 issue list.

Anyway, I am certain stanc3 will be officially rstan-compatible at some point in the next release cycle. And I think warning users ahead of time is better. That is why I brought this up.

Sure. Let’s do that.

Sure, we should inform well ahead of time. I just think that how many releases in advance we announce it is something to align on.

No stanc2 in 2.26 works for me

I’m very much in favor of advertising the imminent demise of STANC2.

I agree that the sooner the better, but until RStan is fully ready and has been demonstrated to work with stanc3 for at least 1 release cycle, we need to keep the STANC2 fallback.

If you want a decision, putting it up for a vote is the way to go - the “general” Stan meeting is not the proper venue for technical decision making, nor is it a good one.

I would fully support this route. We can get rid of stanc2 once rstan is known to have switched and is released to CRAN with a working stanc3 for at least one release cycle. Then it’s time to let go of stanc2.

+1 to these points - I don’t feel strongly, but I agree that it’s nice to keep the stanc2 backup option in RStan for at least one release cycle.

And as someone who no longer attends the meetings (always during work!) I am in favor of attempting to at least expose more impactful discussions either on these forums or on calls scheduled on the forums for specific topics. Would love to see more chats getting scheduled if anyone has the desire, please feel free to invite me or just give me a ring to chat about code :)

That said, not sure if I believe everything needs to be a vote if it seems like there’s a rough consensus of the people who might plausibly care. Polls on discourse are pretty lightweight and I think they’re useful once all sides have made their case, but hopefully 9 times out of 10 we come to a rough consensus before that. Just my $0.02.

Is Rstan going to switch to Stanc3 for 2.24 or is it staying with Stanc2? Stanc2 is missing out on reduce_sum, the new ODE signatures, and HMMs and that’ll be annoying to have to remember the differences (and explain that 2.24 does not mean the same thing across interfaces).