There was a debate on one of the closed Stan Math PRs regarding the release notes. I didnt feel like having another long winded discussion so I took 15 minutes and did a first draft.
Two questions:
Does anyone mind if we open a new wiki page where we keep an updated list of such one-liners? If you feel helping out this process just add a one-liner “Release notes” to each Math PR you open. If not, that is also fine with me, dont want to set new rules just for this. Its only 15 minutes/3 months :).
Any objections to the draft?
Stan-user facing changes:
added a new function to solve systems of nonlinear algebraic equations using Newton-Krylov methods.
re-implemented a reentrant safe and scalable version of lgamma
change the default boost_policy_t to avoid the double to long double propagation, this could affect the following functions: falling_factorial, rising_factorial, digamma and lgamma
implemented oridinal_logistic_glm_lpmf
fixed boundary conditions for log_sum_exp and log_diff_exp
GPU support for the following functions: gp_exp_quad_cov, mdivide_right_tri, multiply, mdivide_left_tri, normal_id_lgm, bernoulli_logit_glm_lpmf and poisson_log_glm_lpmf
Testing:
added a new autodiff testing framework for simpler and more automated tests
added finite difference versions of gradient, Hessian, and gradient of Hessian functions that automatically scale step size based on input
cleaned up test behaviour when -NDEBUG is set
tests for bernoulli_logit, poisson_log and negbin_2_log GLMs are now tested with y as std::vector instead of Eigen::Matrix for better test coverage of the usual use case from Stan
Refactor/Code cleanup/Technical debt:
refactor of the type traits to use more C++14 features & SFINAE
cleaned up rev/mat/fun with the eigen plugin methods .val(), .adj() and .vi()
typename *_return type was changed to *_return_type_t
removed deprecated Eigen content from math headers
metaprograms cleanups: added various enable_if_*
added use of const ref and ref to replace copies with moves
added clang-tidy to the makefiles
cleaned up unnecesary Boost and Gtest compiler flags
beta binomials lpmf loop and conditionals cleanup
clang-tidy cleanup of the Stan Math codebase (adding braces, replace typedef with using and NULL with nullptr)
the make test-math-dependencies file was rewritten from make to Python
OpenCL/GPU:
GPU support for prim/gp_exp_quad_cov, prim/mdivide_right_tri, prim/multiply, rev/multiply
fixed a bug with padding matrices
added the triangularity view to matrix_cl
the key for opencl kernel options is now a std::string not a const char *
added move constructors and assigments to avoid unnecesary copying
all references to “GPU” were changed to “CL”
changed the copy function to to_matrix_cl and from_matrix_cl
updated the OpenCL headers which also allowed for some OpenCL core code cleanup
Thank you Rok!! It would be nice to have the PR # and person who submitted so we can give contributors props. That’s a bit tedious, tmrw I’ll see if I can write a script that grabs info for all the PRs merged since the last release. @syclik do you have something like that already?
Do we have any speedup graphs for the OpenCL code? I can write up a draft tomorrow for the more narrative release notes (unless Dan or Bob wants to take the wheel on that of course)
I have no objections. I used to do it manually and it was more like 2 hours every release, but if you can do it faster, great!
Steve, nope. I did it manually using filters on the issues. Go for it!
Rok, you’re missing a couple user-facing things:
the lgamma PR changed the default compiler options; all interfaces should be updated.
numeric computations with the last version might have changed if it called boost; this was due to overriding the default boost policy with a custom one that’s changed this release. So this version may not replicate the last version.
We used to try to link the commits at one point and give props to bug reporters and anyone not on the stan-dev list.
Automating all this sounds great, but until then, great on Rok for doing it manually! I don’t think anyone will object.
It started with Daniel and I doing all the triage and writing all the release notes by scrolling through the commit messages, issues, and PRs. That was a terrible process and we missed a lot of stuff.
At the risk of starting a debate, I wouldn’t have even called that a debate! The only issue in that discussion that touched release notes was whether they could be automated by including a special field in PRs.
I still think it’s good to have the end Stan user in mind when creating, editing, working on, and releasing issues in the Math library. Adding a release notes block that is to be written from the Stan user’s perspective would suffice.
Do you mean “creating releases” instead of “releasing issues”?
I also think it’s good to think about and write for the Stan user, but it’s an awkward policy to force issues in Math to be written that way. If that’s what we all agree on, that’s fine, but think about how this would affect all the other parts of the Stan ecosystem. Do you think it’s going to be productive to write stanc3 issues so that it’s written for the Stan user?
I’m ok if we decide to adopt a general policy to state how an issue affects the Stan user (this may be left blank if it’s an internal change). One reason to not do this is because it adds yet another requirement on the person adding the issue / PR (wherever the policy is applied). There’s also an issue that the person writing the issue or code may not have the knowledge to write this down. In that case, what happens? Is the issue supposed to be rejected?
There were a bit over 50 PRs open since the last release, some of them were closed before merge. Most of the merged once I was either the author or the reviewer or they had a large enough effect on my work that I was aware of them. I only had to actually click on maybe 5 PRs. So I had it easy this time.
Thanks for those. Also added the Newton solver added yesterday. Will keep the top-level comment updated until the release.
That is fair. I just felt we circled around this in various places in the last few months.
Alrighty since we have the feature freeze I went through and wrote up an R script to grab all the merged PRs and wrote up a few release notes for this cycle. The gist below has both the R script to grab the PR info along with a little narrative for the changes this release.
I tried to tag all parties and mention their contributions. If anyone would like to change the wording lmk!
Overall this is really really great. Thanks for doing it!
Here are my minor comments on the release notes.
PR #1180 brought the Intel TBB into Stan as a dependency, which we will be using in the future for CPU parallelism all across Stan! The TBB is an excellent framework that will let Stan utilize nested parallelism across the algorithms and gradient evaluations. On top of that, it also supports OpenCL interop, so we will start experimenting with integrating out OpenCL backend with the TBB. The licensing for the Intel TBB library is under the Apache 2.0 license. This dependency implies an additional restriction as compared to the new BSD license alone. The Apache 2.0 license is incompatible with GPL-2 licensed code if the software if distributing the software as a unitary binary. Refer to the Apache 2.0 evaluation page on the Stan Math wiki.
I would remove the “On top of that, it also supports OpenCL interop, so we will start experimenting with integrating out OpenCL backend with the TBB.”. I would focus on the features that are added as of now. And I feel that this steals the focus of TBB and may confuse users to connect TBB with OpenCL.
With @t4c1’s large contributions, we now have GPU/OpenCL support for many of our glm functions! With these additions along with @rok-cesnovar’s work on integrating OpenCL optimizations in the new stan3 compiler should give users a pretty big speedup for large glm models! @rok-cesnovar added an OpenCL reverse mode specialization for multiplication and mdivide_left_tri while @t4c1 added the OpenCL specialization for gp_exp_quad_cov .
The second sentence is not valid anymore as 2.21 will be using the exisiting compiler, so I would just remove it. And the first sentence should somehow note that gpu support is only available inside Stan Math and not to the cmdstan users so people dont get the wrong expectations. They will be available immediately once we switch to stanc3.
I (@Stevebronder) added a small type traits metaprogramming scheme so that we can make use of more generic templating in a lot of our code.
The sentence should start with “@Stevebronder added”. And the “small” her is not valid in my opinion. It was a huge effort and it helps developers a lot! Dont sell yourself short here!!
I would add a heading before the last paragraph with “Future”, “What is next?”, “A look at the next release”.
When reading through the text - which is really nice overall - I also thought that we should only focus on what we are releasing and not what we plan to do with it. I think that we should not include things like what we plan to do with these things in release notes. These forward directed things can certainly be mentioned in an announcement of the release, but release notes are just a record of what’s in it; for me.
The release is today! Is there a canonical set of release notes we should include instead of the standard stuff for Math? I could also imagine release notes that are Math-dev specific (to be included in the Math repo RELEASE_NOTES.txt file) and then some other ones that we might instead include higher up in Stan release notes that are more language-facing.
Just want to tag @serban-nicusor as the release-performer so he’s aware - I think these folks would like to write the release notes for Math this time.
Looks good to me. @wds15 or @rok_cesnovar or @stevebronder (or?) - any of you want to write something up showing how much faster map_rect with threading is now for the Stan release notes and pick a favorite graph to link to?
We are now using TBB for threading in map_rect. With performance tests on a non-trivial map_rect model we have observed speedups of up to 20% on Windows, 70% on Linux and 30% on MacOS. Speedups were observed for both Intel and AMD CPUs. On MacOS we observed 25-30% speedups even for single threaded models when using tbbmalloc.
We can remove the last line if you find it confusing for users. I feel that MacOS users could really benefit here by simply turning on CXXFLAGS= -DSTAN_THREADS. But that doesnt need to be a part of the releases.
As for the chart I like this one from Steve (I think its a 12-core AMD CPU):
Side note and way to late now but why do we not have something like STAN_THREADS=true like we do for OpenCL? I find that much simpler to explain and use.