Hoping we’ll get the threading fix for Windows and performance in. We probably aren’t going to wait longer than Monday for this release, since it’s been a while and we already have a lot to release.
@seantalts, I’ll start putting together the list of bug fixes and features for Math unless you want to put that together.
For the release notes? That would be super helpful, thank you! :D
For the release we should bump the marked boost version in the README of Stan-math. We forgot to update that one; it’s still 1.66.0.
v2.19.0 (18 Mar 2019)
======================================================================

New Features
------------
- GPU
  - matrix multiplication (#974)
  - inverse of lower triangular matrix (#1028)
  - Operator overloading for GPU functions (#1056)
  - Cholesky decomposition (#1058)
  - specialized reverse-mode implementation for Cholesky decomposition (#1117)
- Host doxygen API doc on https://mc-stan.org/math/ (#500)
- Makefile completely rewritten (#581, #954, #1041, #1043, #1087)
- Adding `beta_proportion` distribution (#1018)
- adjoint vector-Jacobian product form of precomputed gradients for reverse (#876)
- Add alternative inv_logit parameterization to prevent underflow (#874)

Bug Fixes
---------
- Improved derivative for Gamma CDF w.r.t. alpha (#525)
- `value_of` incorrectly returned the wrong type (#968)
- `sum` incorrectly returned the wrong type (#987)
- `matrix_exp` incorrectly passed the argument by values (#769)
- Unit testing with Windows on Jenkins (#1046)
- `gp_cov_exp_quad` was computing the ARD mixing up rows and cols (#984)
- Fixing GoodGammaP for gcc 7.3 (#1063)

Other
-----
- Clarity on what's being tested in Math (one compiler per OS) (#943)
- Updated GitHub templates (#911)
- Improve ODE speed (#1049)
- Fix tests for threading (#1058)
- Upgrade Google Test to v1.8.1 (#1051)
- Upgrade Sundials to v4.1.0 (#1097)
- Matrix exponential action:
  - A fast implementation was implemented (#771), but it had errors (#)
  - Currently, a slow implementation is in the codebase.
- Improve the codebase:
  - Code spacing (#587)
  - Using variadic template parameters for `return_type`, `partials_return_type`, and `include_summands` (#977)
  - Fixing math constants definitions for Windows (#986)
  - Avoid ambiguous instantiation of `math::sqrt()` by implementing for `double` and `int` (#712)
  - Clean up GPU code:
    - Separate OpenCL kernel access into its own class (#973)
    - `read_only` and `write_only` decorators in GPU kernels fail in Windows (#1034)
  - Fixing uninitialized values in tests:
    - `bernoulli_logit_glm_lpdf` test (#995)
    - `check_greater` test (#819)
  - `gp_exponential_cov_test` failing (#1150)
  - Updating template parameters of matern32 (#981)
  - Update `gp_dot_prod_cov` (#979)
  - Deprecating old GP covariance function names (#756)
  - Fixed compiler warnings in `test-headers` (#1110)
  - Adding required headers (#1106)
  - Turn test-math-dependencies warnings into failures on Jenkins (#1078)
  - Replace `boost::type_traits` with `std::` versions (#1126)
  - Fix doxygen errors (#1139)
  - Clean up anonymous namespace usage (#1006)
  - Setting STAN_NUM_THREADS to illegal value should produce an error (#947)
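For anyone wondering what the new `beta_proportion` distribution in the notes above is: as far as I understand it, it is the beta distribution parameterized by a mean `mu` in (0, 1) and a precision `kappa` > 0, so that the usual shape parameters are `alpha = mu * kappa` and `beta = (1 - mu) * kappa`. Here is a rough standalone sketch of the log density under that assumption. The function name `beta_proportion_lpdf_sketch` is made up for illustration; this is not the Stan Math implementation and it does no autodiff.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper for illustration only (not the Stan Math function).
// Log density of a beta distribution written in mean/precision form:
//   alpha = mu * kappa, beta = (1 - mu) * kappa.
double beta_proportion_lpdf_sketch(double y, double mu, double kappa) {
  // Preconditions assumed by this sketch.
  assert(y > 0.0 && y < 1.0);
  assert(mu > 0.0 && mu < 1.0);
  assert(kappa > 0.0);

  const double alpha = mu * kappa;
  const double beta = (1.0 - mu) * kappa;

  // log Gamma(kappa) - log Gamma(alpha) - log Gamma(beta)
  //   + (alpha - 1) log(y) + (beta - 1) log(1 - y)
  return std::lgamma(kappa) - std::lgamma(alpha) - std::lgamma(beta)
         + (alpha - 1.0) * std::log(y) + (beta - 1.0) * std::log1p(-y);
}
```

With `mu = 0.5` and `kappa = 2` this reduces to a uniform density on (0, 1), which makes for a quick sanity check.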
I also haven’t written about the threading PRs that haven’t been merged yet.
I am going to summon @Erik_Strumbelj for writing up something on the GPU feature. He is usually better with words than I am :)
It seems that I have been summoned. :)
I wanted to check: will ragged arrays be included in 2.19? (Sorry to intrude, but I’m about to start rewriting something where they might be useful.)
GPU features look great!
Nope, sorry! Few months down the road at least.
Slow to arrive, but fast to compute: Stan has GPU support!
Stan 2.19 brings GPU-optimized computation to Stan users. The first supported function is Cholesky decomposition, the main bottleneck of many common statistical models. Activating GPU support is easy - only a few lines are added to the configuration and no changes have to be made to the Stan model. Cholesky decompositions of larger matrices (including their gradients, when dealing with parameters) are then automatically transferred to the GPU, with speedups ranging from 10x to 30x, depending on matrix size and GPU.
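To make it concrete what is being offloaded, here is a minimal CPU sketch of a Cholesky decomposition (the row-by-row textbook variant). It is only meant to show the cubic-cost loop structure that makes large matrices a bottleneck. It is not Stan's GPU or CPU implementation, the name `cholesky_lower_sketch` is hypothetical, and it assumes the input is symmetric positive definite.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Illustration only: returns the lower-triangular factor L with A = L * L^T.
// Assumes A is square, symmetric, and positive definite.
Matrix cholesky_lower_sketch(const Matrix& a) {
  const std::size_t n = a.size();
  Matrix l(n, std::vector<double>(n, 0.0));
  for (std::size_t i = 0; i < n; ++i) {
    assert(a[i].size() == n);  // square input
    for (std::size_t j = 0; j <= i; ++j) {
      // Subtract the contribution of the columns already computed.
      double sum = a[i][j];
      for (std::size_t k = 0; k < j; ++k) {
        sum -= l[i][k] * l[j][k];
      }
      if (i == j) {
        assert(sum > 0.0);  // fails if A is not positive definite
        l[i][j] = std::sqrt(sum);
      } else {
        l[i][j] = sum / l[j][j];
      }
    }
  }
  return l;
}
```

The three nested loops are where the time goes for large matrices, and that work is what the announcement describes as being moved to the GPU.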
Other GPU-optimized matrix algebra primitives and common statistical models are soon to follow: matrix multiplication, lower triangular inverse, eigendecomposition, GP covariance functions and several GLMs. The implementation is based on OpenCL, so it can be used with any GPU and GPU programming-savvy users can also add their own custom OpenCL kernels.
I have tested that it works as is with CmdStan, and I have made the necessary few lines of changes to RStan. I’m just making a pull request to get that into RStan.
Let me see if I can get the tests passing on it. Was there any reason we were waiting to merge other than the Travis tests?
Now that I have the fix for RStan, I don’t know any reasons not to merge. Jenkins tests are passing.
This change can still mess up vb in other interfaces, but it’s easy to fix in each interface, as shown by the RStan fix. CmdStan was working already, and this affects only vb, so there is no danger of breaking anything else.
A moment ago @syclik wrote “@avethari: merge when you think it’s ready.”, but develop is protected and I get “You’re not authorized to merge this pull request.”
@syclik found the problem and I just merged that PR a moment ago.
Quick update - Jenkins is still churning through develop after a few fixes today and yesterday, and we have another accepted PR that would be great to get in for the inaugural GPU launch. Once that’s merged there will be another ~12 hours of tests kicked off, which puts me well past my bedtime. So I’m going to call it and say we will release tomorrow, or as soon as Jenkins finishes putting everything through that’s approved.
Just wanted to chime in that I agree with Dan and Erik’s wording.
@seantalts Can I get the covariance functions into the language or is it too late for that?
I’ll stay up late and kick off tests if you need someone to do that.
If you can get it merged tonight?