FYI - Next Stan release (2.21) October 18 - now with feature freeze

I liked the feature freeze week. I didn’t feel anxious about checking whether everything was polished, and it felt nice that we had a week of testing to fix things up.

I don’t think 3 days would be enough, and 10 days seems like too much. It is also true that there probably won’t be changes as large as this time, with adding a first dynamically linked library and also trying to add stanc3. But I would leave it at 1 week.

I think it went fine.

I think there’s a different way to handle this, at least from Math, that requires less effort. We could just take the latest development git hash that’s prior to the freeze date. And continue to merge as normal. The Math repo would be responsible, on the date of the freeze, to tag the candidate version.
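A minimal Python sketch of that selection step: given the commits on the development branch and a freeze time, pick the last one that landed before the freeze. The `Commit` type, the `freeze_candidate` name, and the use of merge timestamps as input are illustrative assumptions, not part of any existing Stan tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional, Sequence


@dataclass(frozen=True)
class Commit:
    sha: str              # full git hash (hypothetical values in practice)
    merged_at: datetime   # when the commit landed on the development branch


def freeze_candidate(commits: Sequence[Commit], freeze: datetime) -> Optional[Commit]:
    """Return the most recent commit merged strictly before the freeze, if any."""
    eligible = [c for c in commits if c.merged_at < freeze]
    # max() with default=None covers the case where nothing predates the freeze.
    return max(eligible, key=lambda c: c.merged_at, default=None)
```

Anything merged at or after the freeze time is simply ignored by this function, which matches the idea that merging continues as normal while the candidate stays fixed.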

This works if we start following gitflow a little more closely. In the event that there’s an issue between the freeze and the release date, we can patch it on the branch and publish a new candidate tag. This could happen up until the release date. In the event there was no change necessary (which, I hope is normal), we just release that hash that we tagged and we’re done.

How could this work for repos that depend on Math? At freeze time, point to the git hash that the Math devs have marked as the potential release version. In the event there’s an update during the week, update that git submodule version. This works really well if we also follow gitflow for the downstream projects, but it’s not required.

I think a mono repo will require an actual code freeze because you can’t move parts forward independently. (That’s just a statement; I actually think there are a lot of benefits going back to a single repo.)

I think you’re saying it requires less effort from Math developers, but I actually think it requires more effort than the freeze did (and obviously more effort from downstream consumers). With a freeze, you just … don’t merge features in, and bugfix PRs execute normally. With your tag approach, you have to have a new branch that represents the frozen code and submit all bugfix PRs to that branch instead of to develop.

The only extra effort with a freeze comes in the scenario where there are two features that interact with each other and need to be merged to continue collaborating. So in that scenario, you can actually follow your procedure and create a new branch representing non-frozen development and submit these PRs to that branch. So if you assume there are more bugfixes and non-interactive PRs than there are tricksy PRs that interact with each other, you do less work with a freeze, and with a freeze all of that additional git management stuff happens on a non-critical branch instead of on the critical branch.

That works too! The only difficulty there is having to change the PRs that are pointed at develop so that they point at something else. I was assuming the number of PRs that are in flight for develop is greater than the number that would need to be patched in that freeze window.

I think we mitigate all the risk and it clears things up if we use tags anyway (no matter what we do.) Think that will work? We can do lightweight git tags that mark the frozen state. That way, downstream interfaces know exactly what it is instead of it being based on “the latest” version, which could be out of sync across different repos.

Btw, why don’t we use nvie git branching to handle feature freeze and releases https://nvie.com/posts/a-successful-git-branching-model/ ?

Takes more work, I think.

Nice, yeah I think that’s the crux of our disagreement. I think there are fewer PRs that need to be merged during a freeze week (which is a more stringent criterion than your “in flight” definition) than there will be bugfixes. I don’t feel that strongly; if each repo lead wants to start managing nvie-style release branches instead, that works for me. I think it might take some infrastructure investment in the release scripts to make sure that works correctly - we definitely weren’t set up for hotfixes the last time I tried one.

@syclik not sure what you mean by using tags… git submodules always point to a git hash; we can’t use refs like tags or branches in that slot as far as I know. And when we release we do release a tag in each repo. Operationally I’m not sure what you mean by switching to using tags anyway.

Just to re-emphasize: for me, one week to hold my breath is fine… I mean, things can live on a branch outside develop for 7 days… we have much longer delays for PRs not getting attention because people are just busy, so 7 days is not a lot.

We should take as a measure of success whether we end up needing a 2.21.1 release. If we can avoid that, then this is good. If we need it, then we probably need even more time.

Sorry about that… wasn’t communicating as clearly as I could have.

I’d suggest that each repo uses a branch or a tag, say something like x.y.z-rc (release candidate) as opposed to what was used for this release, develop (because that isn’t precise down to a git hash and it can change within that week). If we operated that way, each repo could decide to call develop the x.y.z-rc or branch at the freeze and update the tag throughout the week.

To put it another way, I think saying “use the latest develop branch” leads logically to a code freeze where nothing should be merged into develop for the sake of the release. If we said use tag x.y.z-rc then each repo has a choice of merging normal things into develop while continuing to maintain a stable branch for the release.

This follows more closely what @avehtari suggested, but we could also go all the way and start using that fully. Or just stay with a code freeze. (It was just a little awkward. I had a little time and was going to review, then remembered at the last minute that it was code freeze time and I really shouldn’t.)

I’m still trying to guess what you mean by “each repo uses a branch or tag” - you mean in the release scripts? You can’t have a submodule point to a ref, AFAIK, and that introduces some non-trivial testing burden given that we’re in 3 repos - we’d need to have a parallel set of builds set up to test release branches to make sure that as they get updated, we test the downstream repo with that new code on the release branch, etc.

It sounds like the freeze week wasn’t actually that bad and it might not be worth investing effort in that testing and release infrastructure given our scale.

One idea here would be to follow @bbbales2’s lead and leave a review that isn’t a GitHub “Approve” but registers as a “Comment” and says something like “I’ll approve after the feature freeze so we don’t accidentally merge this.”

I think we’re describing the exact same thing!

In this release, every downstream project (of any project) had to use “the latest develop version” as of the freeze. At that time, it corresponds to a git hash, but over that week, even though there is a code freeze, develop might move to a different git hash to address some issues; that’s not really a problem, it’s just something we need to deal with.

But rather than freezing develop, we could have the process keep going as usual on the develop branch and tell all the projects to have an x.y.z-rc tag (or branch) that’s supposed to be used for the release. Then we’ve decoupled the normal operation of merging into the latest develop branch from a feature-frozen release candidate where the only things getting in are high-priority fixes. (It’s getting closer to git flow this way.)

The day-to-day is to merge into develop, so to me, keeping that process the same is worth something. We can just keep contributing. New contributors don’t have to learn new rules for ~4 weeks out of the year. Reviewers that are just reviewing can do what they do. PRs can still be submitted and merged as normal. But being merged within that window won’t get it into the release. The release would be handled by devs that have more experience and are willing to handle the work involved with releasing (thanks all!). When things are released, we can merge the newest fixes back to develop (or do it as we’re merging into that branch / latest tag).

And one of the reasons branch / tag gets confusing here is because it could be done with both; if using tags, it will mimic a branch without being called a branch.

I’m still struggling on the concrete items behind what you’re saying. Is it correct to say that you’re proposing we implement the nvie release branch model and invest in creating that parallel set of Jenkins pipelines that keep release branches up to date and tested? For context, we need parallel pipelines because the ones we have now are what keep develop up to date when submodules change upstream.

I think everyone else who has commented has been reasonably happy with the freeze week, so I’m tempted to keep using it until it becomes a problem and thus delay that investment. I would like to note that I think it’s actually more difficult for new contributors with the nvie workflow - in freeze week, they can’t press the merge button because they’re new, so they don’t need to learn any additional steps, but with release branches they need to know where to submit bug fixes and how that workflow works. Freezing for a week is strictly less complicated than the full nvie model in both testing infrastructure and pedagogy, at the expense of making coordination on interacting features slightly more complicated, which seems to naturally work well at smaller scales and work poorly at larger scales. We are at a comparatively smaller scale still.

It’s fine to go with a freeze week. I was just stating how it could be improved. (And I understand that’s not clear yet.)

What you think I’m proposing isn’t correct. I am not in favor of going fully to gitflow; it’s a lot of overhead for new contributors.

The concrete suggestion: rather than telling all the repos to use “the latest version of develop,” exchange “develop” with “tag v2.22.0-rc.” Both have the same problem of needing to know when that tag updates, but this just takes away the need to have a freeze.
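To picture the "needing to know when that tag updates" part, here is a hedged Python sketch: it compares the hash each downstream repo has recorded for a submodule against the hash the upstream release candidate tag currently resolves to. The function name `stale_pins` and both dictionary shapes are assumptions made for illustration; how the hashes are gathered (for example from CI) is left out.

```python
from typing import Dict, List, Tuple


def stale_pins(
    rc_hashes: Dict[str, str],
    pins: Dict[str, Dict[str, str]],
) -> List[Tuple[str, str]]:
    """List (downstream, upstream) pairs whose pinned hash differs from the rc tag.

    rc_hashes: upstream repo -> hash its release candidate tag resolves to now.
    pins: downstream repo -> {upstream repo -> hash recorded in the submodule}.
    """
    stale = []
    for downstream, submodules in pins.items():
        for upstream, pinned in submodules.items():
            current = rc_hashes.get(upstream)
            # Only flag submodules for which a candidate hash is known.
            if current is not None and pinned != current:
                stale.append((downstream, upstream))
    return sorted(stale)
```

An empty result would mean every downstream pin already matches its upstream candidate; a non-empty one lists the submodules that would need a bump during the release week.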

But once again, it’s fine to freeze for a week. I found it easy to mess up and since it happens infrequently, I’m sure we’ll have to deal with an inadvertent merge in that week on one of the repos.

What’s supposed to happen if we do that? An immediate revert? That seems logical, but maybe we don’t need to do that?

I’m so sorry, I’m not trying to be difficult, but I still don’t understand what you mean by “tell all the repos” hahaha. Maybe we need to hop on the phone again, lol. I think there are two concrete spots where develop appears in code - in the Jenkins code to update downstream repo submodules, and in release-scripts as the default branch to release (I think just for math as the other two ask what version to use). I made it semi-configurable here during my ill-fated attempt to do a hotfix a while back: https://github.com/stan-dev/ci-scripts/blob/master/release-scripts/tag-math.sh#L13. That hotfix release went fairly poorly, haha.

Yup, I know. I was assuming that I wasn’t getting through cause there was too much text around this and it’s not clear. I was just trying to distill it down where the only thing left is the concrete example.

Some things that I’m considering:

  • somebody or some people have to do work at release time. We would all prefer the least amount of work possible, but it’s still a little work. (I don’t see any automatic release on a set day without dev intervention happening.)
  • the number of contributors is greater than the number of people that need to do work for a release. For example, in Math, there are 79 contributors now. But only a fraction of them had to deal with the release.
  • for me, it’s easier when the normal operating conditions don’t change for the sake of something that happens rarely. Especially when doing the same thing I do 48 weeks out of the year is wrong in the other 4 weeks. It’s just easier to have things set up where you can keep on doing what you do.
  • there are multiple ways to address this. One is to set up permissions so we can’t merge that week. That works, but some people have to be able to merge that week. Another way is to put a big banner up saying please don’t merge, but given the permissions, everyone with the ability could still merge. I find it’s too easy to make a mistake.
  • it’d be nice if we could strive to make something work consistently across the repos, but it’s not necessary.

The goals here are to:

  • pick a git hash for the release on each repo (stan-dev/{math, stan, rstan, cmdstan, pystan, …})
  • prior to the release, test that that set of git hashes works together
  • at release time, finalize releases from those git hashes, pulling in docs (thus changing the exact git hash) and creating artifacts that contain the right versions.

Anyway, just trying to cut down on errors of well-intentioned developers (the error wouldn’t be that a PR is merged… it’s the error that it’s merged in that window and anytime outside the window it would have been fine) while still maintaining stability. I guess in my mind, a relatively simple request is for each repo to just tag a git hash as the one they propose as the release candidate (as opposed to a ref, latest develop, which isn’t really specific enough).
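As a small Python sketch of that request, the check below flags any repo whose proposed release candidate is still a moving ref (such as `develop` or a tag name) rather than a specific hash. The `unpinned_repos` name and the manifest shape are hypothetical, and it assumes standard 40-character SHA-1 hashes.

```python
import re
from typing import Dict, List

# Assumption: candidates are full 40-character lowercase hex SHA-1 hashes.
FULL_SHA = re.compile(r"[0-9a-f]{40}")


def unpinned_repos(candidates: Dict[str, str]) -> List[str]:
    """Return repos whose candidate is not a full 40-character hex hash."""
    return sorted(
        repo for repo, ref in candidates.items() if not FULL_SHA.fullmatch(ref)
    )
```

Running it over a manifest with one entry per repo would return an empty list only once every repo has named an exact hash.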