CmdStan 2.29.0 release candidate

I am happy to announce that the latest release candidates of CmdStan and Stan are now available on GitHub!

This release cycle brings a new differential-algebraic equation solver, new functions and distributions, function overloading, array type promotion, improved optimization in the Stan compiler, improved error messaging, new deprecation warnings and much more.

You can find the release candidate for CmdStan here. Instructions for installing are given at the bottom of this post.

Please test the release candidate with your models and report if you experience any problems. We also kindly invite you to test the new features and provide feedback. If you feel some of the new features could be improved or should be changed before the release, please do not hesitate to comment.

The Stan development team appreciates your time and help in making Stan more efficient while maintaining a high level of reliability.

If everything goes according to plan, the 2.29 version will be released in two weeks.

Below are some of the highlights of the new release.

Differential-Algebraic Equation solver

Stan now supports solving differential-algebraic equation systems (DAEs). DAEs extend the concept of ordinary differential equations (ODEs): the system may contain algebraic equations that constrain the state variables and state derivatives, and the state derivatives need not be expressed explicitly as the right-hand side of an ODE. Instead, the relationship between state variables and state derivatives can be expressed implicitly in a residual function. As with the ODE solvers, the DAE solvers dae and dae_tol support variadic signatures.

This can be done using two new higher-order functions: dae() and dae_tol(). The interface of the new functions is similar to the one for ODEs. For more details see the docs that are currently available here. These docs will later be available in the usual online form.
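
To make the idea of a residual function concrete, here is a minimal Python sketch. It is not Stan code and does not show the Stan interface. It assumes a small, hypothetical chemical-kinetics style system with two differential equations and one algebraic constraint; the function name `residual` and the rate constants `k1`, `k2`, `k3` are made up for illustration. A DAE solver looks for states `y` and derivatives `yp` for which every component of the residual is zero.

```python
def residual(t, y, yp, k1, k2, k3):
    """Residual F(t, y, y') of a hypothetical 3-state DAE; a solution has F == 0."""
    return [
        # Differential equation in implicit form: y1' + k1*y1 - k2*y2*y3 = 0
        yp[0] + k1 * y[0] - k2 * y[1] * y[2],
        # Differential equation in implicit form: y2' - k1*y1 + k2*y2*y3 + k3*y2^2 = 0
        yp[1] - k1 * y[0] + k2 * y[1] * y[2] + k3 * y[1] ** 2,
        # Algebraic constraint with no derivative in it: y1 + y2 + y3 = 1
        y[0] + y[1] + y[2] - 1.0,
    ]


# Illustrative rate constants (assumed values, not from the release notes).
k1, k2, k3 = 0.04, 1.0e4, 3.0e7

# Initial state and derivatives chosen so that they satisfy all three equations.
y0 = [1.0, 0.0, 0.0]
yp0 = [-k1, k1, 0.0]
res0 = residual(0.0, y0, yp0, k1, k2, k3)
print(res0)
```

The third component contains no derivative at all, which is what distinguishes this system from a plain ODE: the states are tied together by a constraint rather than each having its own explicit right-hand side.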

New functions and function signatures:

  • Von Mises CDF functions: von_mises_cdf, von_mises_lcdf and von_mises_lccdf. Provisional docs link

  • Log-logistic distribution: loglogistic_lpdf, loglogistic_log, loglogistic_rng and loglogistic_cdf. Provisional docs link

  • Inverse of the complementary error function: inv_erfc. Provisional docs link

  • bernoulli_logit_glm_rng function. Provisional docs link

  • Additional ordered_probit_lpmf signatures:

    • ordered_probit_lpmf(array[] int, real, vector) => real
    • ordered_probit_lpmf(array[] int, real, array[] vector) => real
  • Additional normal_id_glm signatures:

    • normal_id_glm_lpdf(real, matrix, real, vector, real) => real
    • normal_id_glm_lpdf(real, matrix, vector, vector, real) => real
    • normal_id_glm_lpdf(vector, row_vector, real, vector, real) => real
    • normal_id_glm_lpdf(vector, row_vector, vector, vector, real) => real
  • Additional signatures for lchoose, now matching those supported by the deprecated binomial_coefficient_log.

Function overloading

User-defined functions can now be overloaded and can overload core Stan functions. Multiple definitions of the same function name are allowed if the arguments are different in each definition.

Example of functions that will work with 2.29:

functions {
  real foo(row_vector p, real a) {
    return sum(p) + a;
  }
  real foo(vector p) {
    return sum(p) + 1;
  }
}

An example of overloading a core Stan function:

functions {
  array[] row_vector transpose(array[] vector a) {
    // ...
  }
}

Stan compiler optimization levels

The Stan-to-C++ compiler has had an experimental feature to optimize the model before compiling it to C++. With the 2.29 release, the optimizations are split into 3 levels: --O0, --O1 and --Oexperimental. --O0 disables optimization and is currently the default. --O1 applies optimizations that are simple, do not dramatically change the program, and are unlikely to noticeably slow down compile times. These optimizations include dead code elimination, copy and constant propagation, automatic-differentiation level optimization and detection of opportunities to represent parameter vectors and matrices as structs-of-arrays (see below).

Finally, with --Oexperimental the Stan compiler will use all available optimizations, some of which are not thoroughly tested.

New optimization to better utilize vectorization and memory throughput

By default, Stan uses a so-called array-of-structs approach to representing vectors or matrices of parameters, meaning that the value and adjoint of each element of a container are stored next to each other in memory. The opposite, struct-of-arrays approach stores the values of all parameters in the container next to each other, and the adjoints separately in the same fashion. This layout can be used when the vector or matrix is used in a vectorized way, and can vastly improve efficiency.
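
The difference between the two layouts can be sketched in a few lines of Python. This is only an illustration of the memory arrangement, not how Stan Math implements it; the names `aos`, `soa_values` and `soa_adjoints` and the numbers are made up for the example.

```python
from array import array

values = [0.5, 1.5, 2.5]
adjoints = [0.1, 0.2, 0.3]

# Array-of-structs: each element's value and adjoint sit next to each other.
aos = array("d")
for v, a in zip(values, adjoints):
    aos.extend([v, a])

# Struct-of-arrays: all values are contiguous, and all adjoints are contiguous.
soa_values = array("d", values)
soa_adjoints = array("d", adjoints)

print(list(aos))           # interleaved: value, adjoint, value, adjoint, ...
print(list(soa_values))    # values only
print(list(soa_adjoints))  # adjoints only
```

An operation that touches only the values reads one contiguous block in the second layout, whereas in the first it has to step over every other number.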

The Stan Math library, which Stan uses for automatic differentiation, has supported this new way of storing containers of parameters for a few versions now, but it has not been exposed to Stan users. With this release, Stan users will be able to utilize it in their models as well. This optimization can be turned on by using the --O1 stanc3 flag.

For more backstory see the design doc: https://github.com/stan-dev/design-docs/blob/master/designs/0005-static-matrices.md#summary

Automatic promotion of array of scalars

Users can now call functions that require array[] real with integer arrays - array[] int. The types are automatically promoted in the call to the function. Similarly, arrays of reals or integers can be used with functions that expect arrays of complex values.

Users need to take care when combining these promotions with function overloading. More on that here.

Deprecations

Starting with this release, the Stan compiler issues warnings when using functions or features of the Stan language that are deprecated.
The warning also notes if the deprecated feature/function is scheduled to be removed and when that will occur.

Notable functions and features that will be removed in next year's January release (most likely 2.32):

  • The old array syntax. For example int a[5]; should be rewritten as array[5] int a;.
  • Using the reserved words array, upper, lower, offset and multiplier. The use of these names as variable names will not be allowed in future versions.
  • Assignment with <- and commenting the code with #

Models can be updated automatically to use non-deprecated functions/features by using the canonicalizer (see below for more).

Improved user-facing error and warning messages:

  • More informative error messages for ODE solvers.
  • When an unknown identifier is encountered Stan will suggest nearby known names you might have meant.
  • Improved error message when a user tries to declare a function argument as a constrained type.
  • Improved error message for incorrect variable declarations.

Improved auto-formatting and canonicalizer

  • Users can pass --max-line-length=# when auto-formatting to customize the line length.
  • Canonicalizer adds braces around single statements in if-else/for/while.
  • Modular canonicalizer: users can separately canonicalize for deprecations, braces, and/or parentheses. Use

An online demo of the auto-formatter is currently available here and for the canonicalizer here. Some form of this demo will be available on the main Stan website in the future. Note that this demo compiles and formats everything locally; your model is not sent to a server.

Miscellaneous

  • Support for standalone function definitions - .stanfunctions

    The compiler can now compile or format standalone function definitions in a .stanfunctions file. These are compiled as if a normal Stan program was compiled with stanc3 --standalone-functions and can be used with #include statements in the functions block.

  • Upgraded Sundials to 6.0.0.

How to install?

Download the tar.gz file from the link above, extract it and use it the way you use any CmdStan release. We also have an online CmdStan guide available at CmdStan User’s Guide.

If you are using cmdstanpy you can install the release candidate using

import cmdstanpy
cmdstanpy.install_cmdstan(version='2.29.0-rc2')

With CmdStanR you can install the release candidate using

cmdstanr::install_cmdstan(version = "2.29.0-rc2", cores = 4)

And then select the RC with

cmdstanr::set_cmdstan_path(file.path(Sys.getenv("HOME"), ".cmdstanr", "cmdstan-2.29.0-rc2"))

Awesome, thanks to everyone who contributed!

Thanks for the updates!!

Early test issue: A model that just ran fine with 2.28.2 failed with the new candidate and --O1 optimization, giving this error at runtime:

Exception: fma: x ((45, 12)12, 1266) must match in size

The code at that location is a somewhat complicated matrix multiplication (m1 * (m2 .* m3[idx,idy]) + m4), which I suspect shouldn’t actually be optimized via the fma function? I can work on a minimally reproducible case if that’s not enough info. I can also open a GitHub issue.

*edit: clarify that the model compiled, and failure was at runtime

Thanks @rmcminds for reporting it. Does the same happen without --O1?

That would be great. You can make it in CmdStan and we can move it elsewhere later.

The new release works fine without the --O1 flag. I’m creating a GitHub issue now.

^ Just merged the fix for this, we should have another release candidate available soon

I used the following in make/local

CXXFLAGS+= -O3 -mtune=native -march=native

with gcc 11.1.0, Manjaro Linux Kernel 5.15.16 and AMD Ryzen 5800H. The model (reduced to):

parameters {
  vector<lower=0>[2] tau;
}
model {
}

library(cmdstanr)
stan_data <- list()
n_chains <- 1

mod <- cmdstan_model("test2.29rc1.stan")
mod$compile(force_recompile = TRUE)
fit <- mod$sample(
  data = stan_data
  , seed = 1
  , chains = n_chains
  , parallel_chains = n_chains
  , iter_warmup = 100
  , iter_sampling = 100
  , max_treedepth = 14
  , adapt_delta = 0.8
)

Running MCMC with 1 chain…

Chain 1 WARNING: There aren't enough warmup iterations to fit the 
Chain 1          three stages of adaptation as currently configured. 
Chain 1          Reducing each adaptation stage to 15%/75%/10% of 
Chain 1          the given number of warmup iterations: 
Chain 1            init_buffer = 15 
Chain 1            adapt_window = 75 
Chain 1            term_buffer = 10 
Chain 1 Iteration:   1 / 200 [  0%]  (Warmup) 
Chain 1 double free or corruption (out)
Warning: Chain 1 finished unexpectedly!

Warning message:
No chains finished successfully. Unable to retrieve the fit.

CmdStan 2.28.2 works fine. If I remove -march=native from CXXFLAGS, everything works fine too.
I hesitated to report it at first, since not many people use that flag.

Try calling rebuild_cmdstan() after adding those flags to make/local, and then compile the model again - that fixed it for me

We have released a second release candidate RC2 that addresses two bugs: one with function overloading and one with --O1 optimization and fma(). Thanks @rmcminds for reporting the issue and @stevebronder and @WardBrian for addressing them.

Edit: (Steve) the new RC candidate has been linked in the “How to install” section of the top post!

Many thanks to all the developers (I am not one of them, I am just a user)!

I do not know if Stan 2.29 has this problem, but I noticed that Stan 2.28 might have an error in the metadata function when returning num_chain. If num_chain is the number of chains used when fitting the model, then there is a small bug. Here is an example:

library(cmdstanr)

file <- file.path(cmdstan_path(), "examples", "bernoulli", "bernoulli.stan")
mod <- cmdstan_model(file)

data_list <- list(N = 10, y = c(0,1,0,0,0,0,0,0,0,1))

fit <- mod$sample(
  data = data_list, 
  seed = 123, 
  chains = 4, 
  parallel_chains = 4,
  refresh = 500
)

fit$metadata()$num_chain
fit$metadata()$id

The last two commands return:

> fit$metadata()$num_chain                                          
[1] 1
> fit$metadata()$id                                                 
[1] 1 2 3 4
>  

But I would expect fit$metadata()$num_chain to return 4

pinging @jonah and @rok_cesnovar - see above w/r/t fit$metadata()$num_chain when chains = 4

That is because num_chain denotes how many chains were run with a single executable.

If you run 4 CmdStan executables in parallel, the resulting CSV for all of them will store num_chain = 1.

@Amael What you are probably looking for is fit$num_chains(). The value of fit$metadata()$num_chain depends on the internal implementation, i.e. how the CmdStan executable was run.
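
A rough way to picture this, as a Python sketch rather than actual CmdStanR internals: assume each CmdStan executable writes one CSV file whose header records its own num_chain and id. The list `csv_metadata` below is a made-up stand-in for the four headers from the example above.

```python
# Hypothetical per-file metadata for 4 chains, each run by its own executable.
csv_metadata = [
    {"id": 1, "num_chain": 1},
    {"id": 2, "num_chain": 1},
    {"id": 3, "num_chain": 1},
    {"id": 4, "num_chain": 1},
]

# Each file only knows about the chains its own executable ran.
per_file_num_chain = {m["num_chain"] for m in csv_metadata}

# The total number of chains in the fit is the sum over all files.
total_chains = sum(m["num_chain"] for m in csv_metadata)

print(per_file_num_chain)  # the value stored in every CSV
print(total_chains)        # the number of chains the user asked for
```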

@rok_cesnovar Ah yes, exactly! Many thanks for the clarification

Some stats on this last release cycle:

There were a total of 170 pull requests merged across the Math/Stan/CmdStan/Stanc3/Docs repositories. Of those 170 pull requests, 28 were merged in Math, 10 in Stan, 10 in CmdStan, 42 in Docs and 80 in Stanc3. Compared to the last few releases this is considerably fewer PRs in Math, but a ton more in Stanc3 than usual, which is also evident from the feature list.
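
The per-repository counts add up to the stated total. A quick Python check, using only the numbers quoted in this post, also gives each repository's share of the merged pull requests:

```python
merged = {"Math": 28, "Stan": 10, "CmdStan": 10, "Docs": 42, "Stanc3": 80}

total = sum(merged.values())
print(total)

# Share of all merged pull requests per repository, in percent.
shares = {repo: round(100 * n / total, 1) for repo, n in merged.items()}
print(shares)
```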

22 different developers contributed code or documentation. Here is a list of how many PRs were merged for each developer:

WardBrian                       62
rok-cesnovar                    42
SteveBronder                    12
serban-nicusor-toptal           8
andrjohns                       6
adamhaber                       6
spinkney                        6
yizhang-yiz, nhuurre            5
jtimonen                        3
wds15, lyndond, mitzimorris     2
hsbadr, bob-carpenter, rybern, mandel, bbales, martinmodrak, maedoc, maltekiessling   1

Big shout out to @WardBrian for all the contributions!

There were 2 new developers with contributions: lyndond, maltekiessling. Always glad when I see new contributors around!

The number of downloads from GitHub for the last couple of releases:

2.28.2      19160 (+ around 1300 from Conda)
2.27.0      60151
2.26.1      28200
2.25.0      20009