CmdStan 2.23 release candidate is available!

I am happy to announce that a release candidate for the next release of CmdStan is now available on GitHub.

You can find it here: https://github.com/stan-dev/cmdstan/releases/tag/2.23-candidate

Why a release candidate?

We want to do a wider and more thorough test of the next version before the official release, planned for next week, to make sure nothing gets missed.

And that is why we need you, the users, to try out the release candidate. Play with the new features or just compile and run your models, and let us know what you think.

How to install?

Download the tar.gz file from the link above, extract it, and use it the way you use any CmdStan release.

If you are using cmdstanpy, make sure you point it to the folder where you extracted the tar.gz file with set_cmdstan_path(os.path.join('path','to','cmdstan')).
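For example (the path below is a placeholder for wherever you extracted the archive):

```python
import os

# Placeholder path: replace with the directory where you extracted the tar.gz
cmdstan_dir = os.path.join('path', 'to', 'cmdstan')

# With cmdstanpy installed, point it at the release candidate:
# from cmdstanpy import set_cmdstan_path
# set_cmdstan_path(cmdstan_dir)

print(cmdstan_dir)
```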

With cmdstanr you can install the release candidate using

install_cmdstan(release_url = "https://github.com/stan-dev/cmdstan/releases/download/2.23-candidate/cmdstan-2.23-rc1.tar.gz", cores = 4)

What is new?

The highlight of this release is a new way of parallelizing your models. Other than that, the focus of this release was mostly bugfixes and stability/consistency for edge cases.

A quick rundown of the most notable changes is listed below:

New features:

  • added reduce_sum and reduce_sum_static functions that provide a new way of parallelizing a single chain across multiple cores. A tutorial on this new feature can be found here (based on the popular map_rect tutorial by @richard_mcelreath). You can also read the pre-release of the new user guide docs here.
  • added OpenCL (GPU) support for GLMs when using the “~” syntax (previously only supported with target +=)
  • upgraded to Sundials 5.2.0
  • the syntax for including files now supports < > brackets
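As a minimal sketch of the new parallelization feature (the model and variable names below are illustrative; see the tutorial and user's guide for the authoritative signature), a normal likelihood split across cores with reduce_sum might look like:

```stan
functions {
  // Partial sum over a slice of the data; reduce_sum supplies the slice
  // plus the 1-based start/end indices of that slice in the full array.
  real partial_sum(real[] y_slice, int start, int end, real mu, real sigma) {
    return normal_lpdf(y_slice | mu, sigma);
  }
}
data {
  int<lower=1> N;
  real y[N];
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  int grainsize = 1;  // 1 lets the scheduler pick slice sizes automatically
  mu ~ normal(0, 1);
  sigma ~ normal(0, 1);
  target += reduce_sum(partial_sum, y, grainsize, mu, sigma);
}
```

Compiled with STAN_THREADS enabled, the sum over y is then evaluated in parallel slices.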

Notable user-facing bugfixes:

  • Fixed problems with vectorizing neg_binomial_* functions that led to wrong answers
  • Fixed compilation of user-defined _rng functions
  • Improved lbeta to be more numerically stable with one large and one small argument
  • Improved numerical stability of the binomial_coefficient_log, neg_binomial_2_lpmf, and neg_binomial_2_log_lpmf computations
  • Fixed wrong gradients for large arguments to log_sum_exp
  • Fixed a bug where normal_id_glm did not work when sigma is not an autodiff type
  • Cleaned up makefile error messages on Windows
  • Better checks for positive-definiteness in mdivide_left_spd
  • More consistent throwing behaviour in the QR functions
  • Clearer messages when using a variable name as a keyword
  • Arbitrary spaces allowed between words in “transformed data” and “transformed parameters”

There are also some ongoing refactors and projects that don’t have a direct user-facing impact at the moment, for example:

  • generalizing functions in the Math library
  • adding complex numbers support
  • more OpenCL supported functions via the kernel generator

These currently do not have an effect on your models, but will in the future (with speedups or new features). We also want to make sure those changes in the background didn’t break any of your models.

Thanks,

Stan Development Team

12 Likes

reduce_sum is super cool. Obviously map_rect has been available for a while, but reduce_sum seems more user-friendly. Makes me wonder if this might be the tipping point for a change in the typical workstation parallelism approach, from sampling chains in parallel across cores to sampling chains serially but with multiple cores for each chain. Previously I only really saw the latter as beneficial for scenarios like models with GPU acceleration and only a single GPU available. But now that I’m thinking about serial chains and within-chain parallelism, maybe another benefit is the ability to check that the first chain is getting reasonable results before bothering with subsequent chains? I know serial chains will break the cool campfire stuff for pooled warmup, but possibly a similar approach where later chains can get info from all earlier chains might be just as good?

2 Likes

Oh, and I wonder if it might make sense to have a version of dot_product that internally uses reduce_sum. Possibly only useful with very large contrast matrices?

I am glad you like reduce_sum! Interesting thought that it may be better to give full resources to a single chain first instead of running chains in parallel and combining their info à la campfire. It will be interesting to evaluate this. Within-chain parallelism can be inefficient for some models compared to running multiple chains. However, combining like campfire comes with compromises and, if I recall right, only happens every now and then, whereas running one chain faster with more cores does not need to make any compromise. My guess is that a combined approach will be best, since doubling the speed of a model with two cores is easy, but with more cores it quickly gets harder to achieve such good scaling.

Running dot_product with reduce_sum should be possible… but I think you can already code up a version now. Doing it in Stan math is also possible and likely a bit more efficient.
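A hand-rolled version along those lines (purely illustrative; the function name is made up) could slice one factor with reduce_sum and index the other with the supplied start/end:

```stan
functions {
  // Hypothetical partial sum for a dot product: reduce_sum slices x,
  // and start/end pick out the matching entries of y.
  real partial_dot(real[] x_slice, int start, int end, vector y) {
    return dot_product(to_vector(x_slice), y[start:end]);
  }
}
```

In the model block this would then be called as something like reduce_sum(partial_dot, x, grainsize, y); summing the slice-wise dot products recovers the full dot product.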

1 Like

Testing the latest changes on the develop branch for reduce_sum, per discussion with @bbbales2 and @wds15 - should we do another RC?

I asked myself the same…so maybe yes?! If it’s not too much burden to do, of course.

Yes, let’s play it safe.

1 Like

Release candidate 2 is up - thanks @rok_cesnovar and @serban-nicusor!

RC2 here: https://github.com/stan-dev/cmdstan/releases/tag/v2.23-rc2

4 Likes

Thanks!

The changes in RC2 are:

  • changed the order of arguments in reduce_sum (the tutorial and user’s guide have also been updated)
  • reduce_sum now works with user-defined _lpdf/_lpmf functions
  • offset/multiplier and lower/upper can now appear in any order
  • fixed runtime initialization error caused by code generation
  • fixed foreach looping over sliced arrays

Thanks @nhuurre @rybern and @bbbales2 for the fixes!

4 Likes

reduce_sum tutorial is now a Stan case study: https://mc-stan.org/users/documentation/case-studies/reduce_sum_tutorial.html

4 Likes

The following links in the tutorial and the case study seem to be broken (404 not found):

  1. Reduce Sum :https://mc-stan.org/docs/2_23/functions-reference/functions-reduce.html
  2. Picking up grainsize : https://mc-stan.org/docs/2_23/stan-users-guide/reduce-sum.html#reduce-sum-grainsize

I apologise if this is not the correct discussion thread to report a bug. I am a beginner trying to parallelize my code by following the tutorials, as my model is stuck while sampling from a large dataset.
I would really appreciate it if anyone could point me to alternate documentation I can refer to.

2 Likes

those links won’t work until the 2.23 release is out and we’ve published the 2.23 docs - coming soon!

1 Like

if you’re OK with raw markdown, the source code for the user’s manual is here: https://github.com/stan-dev/docs/blob/master/src/stan-users-guide/parallelization.Rmd
and the functions reference chapter is here: https://github.com/stan-dev/docs/blob/master/src/functions-reference/higher-order_functions.Rmd

1 Like

thanks so much. I am happy to refer to the raw markdown