Hey all, let’s remember to keep the tone civilized. Regardless of our disagreements, we want this board to be welcoming; at the end of the day we’re all on the same team here and share very similar goals for scientific computing.
@betanalpha is the domain lead for algorithms and has the authority to make this call; he just needed me to actually press the merge button. I don’t have any comments on the contents of the PR.
If you or @bbbales2 want to challenge his decision with the TWG Director (me for now; we may change the whole structure soon, but let’s stick to the current process), you should feel free. I’ll take your post as essentially starting the ball rolling on that; if that’s not your intent, let me know.
I’m not familiar with this code, but I am familiar with this kind of debate; it’s come up a bunch of times in other modules and in many other software engineering jobs I’ve had. We’re essentially arguing about Stan’s API with respect to the main HMC sampler. For the Math library, the last time we had an argument about the API we went ahead and formally defined it (late; it had been required ever since we decided to adopt semantic versioning across the entire project, but was never fleshed out).
Since we don’t have an HMC or algorithms API yet, I’d like to ask @betanalpha, as the lead, to draft one that we can get comments on in the next few days and use to decide whether this PR breaks anything in that API and therefore requires a major version bump. If it does, we can pull it out of the 2.21 release before Friday.
I think once the API is specified explicitly, it will be much clearer which kinds of tests are in scope for any given PR and which are out of scope. These tough decisions still have to be made, but ideally not entirely on a PR-by-PR basis. Judgment will always be involved, but everyone will be happier if there are policies and guidelines for these decisions so everyone knows what to expect.
The algorithms are a tricky part of Stan because understanding why the code works the way it does takes a lot of diligence and mathematical background. That said, I’m a huge fan of finding ways to empirically test anything we want to promise our users, so that we know when it changes (and we can make those tests easy to update if need be). So maybe we can find a compromise here: figure out what we really think is important to promise users and come up with some new tests that verify exactly that. A sketch of what I mean is below.
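To make that concrete, here’s a minimal sketch of the kind of empirical test I have in mind. This is not Stan code: the `run_sampler` stand-in, the seed, and the tolerances are all placeholders I made up. The idea is to assert a user-facing statistical property (estimates land within a few Monte Carlo standard errors of the known truth) rather than pinning bit-level output.

```cpp
#include <cassert>
#include <cmath>
#include <random>
#include <vector>

// Stand-in for "run the sampler on a standard-normal target".
// A real test would call the actual HMC service API; a plain RNG
// keeps this sketch self-contained and compilable.
std::vector<double> run_sampler(int n_draws, unsigned seed) {
  std::mt19937 rng(seed);
  std::normal_distribution<double> target(0.0, 1.0);
  std::vector<double> draws(n_draws);
  for (auto& d : draws) d = target(rng);
  return draws;
}

int main() {
  const int n = 10000;
  std::vector<double> draws = run_sampler(n, 1234);

  // Sample mean and a naive Monte Carlo standard error.
  double mean = 0.0;
  for (double d : draws) mean += d;
  mean /= n;

  double var = 0.0;
  for (double d : draws) var += (d - mean) * (d - mean);
  var /= (n - 1);
  // Assumes roughly independent draws; for correlated MCMC output
  // you would divide by the effective sample size instead of n.
  double mcse = std::sqrt(var / n);

  // Promise under test: the sampler recovers the target mean (0)
  // to within ~4 standard errors. Loose enough to survive benign
  // refactors, tight enough to catch a badly broken sampler.
  assert(std::abs(mean - 0.0) < 4.0 * mcse);
  return 0;
}
```

The point of structuring it this way is that a change which alters the exact draws without breaking the promised statistical behavior doesn’t trip the test, while a genuine regression does.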