CmdStan & Stan 2.37 release candidate

Edit (June 12):

A second release candidate has been published to address several issues with the embedded Laplace approximation found in the first release candidate.


I am happy to announce that the latest release candidates of CmdStan and Stan are now available on GitHub!

This release cycle brings the embedded Laplace approximation, a sum-to-zero matrix type, new functions, memory usage optimizations, and many other improvements.

You can find the release candidate for CmdStan here. Instructions for installing are given at the bottom of this post.

Please test the release candidate with your models and report if you experience any problems. We also kindly invite you to test the new features and provide feedback. If you feel some of the new features could be improved or should be changed before the release, please do not hesitate to comment.

The Stan development team appreciates your time and help in making Stan better.

If everything goes according to plan, the 2.37 version will be released in two weeks.

Below are some of the highlights of the new release.

The embedded Laplace approximation

We’re releasing a suite of functions to perform an embedded Laplace approximation, in a similar flavor to what is done in the INLA and TMB packages. These functions approximate the marginal likelihoods and conditional posteriors that arise in latent Gaussian models. The idea is to integrate out the latent Gaussian variables with a Laplace approximation and then perform standard inference on the hyperparameters. These functions give users a lot of flexibility when specifying a prior covariance matrix and a likelihood function, although the approximation is not guaranteed to work well for an arbitrary likelihood.
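To make the idea concrete, here is a small, self-contained Python sketch of the underlying math (not Stan's API; all names are illustrative) for a toy one-dimensional latent Gaussian model: the latent variable is integrated out by finding the mode of the joint log density and replacing the integrand with a Gaussian at that mode.

```python
import math

# Toy latent Gaussian model: theta ~ Normal(0, sigma), y ~ Poisson(exp(theta)).
# The embedded Laplace idea: integrate theta out by (1) finding the mode of the
# joint log density in theta, then (2) replacing the integrand with a Gaussian
# centered at that mode.

def log_joint(theta, y, sigma):
    # log p(y | theta) + log p(theta)
    return (y * theta - math.exp(theta) - math.lgamma(y + 1)
            - 0.5 * (theta / sigma) ** 2
            - 0.5 * math.log(2 * math.pi * sigma ** 2))

def laplace_log_marginal(y, sigma):
    # Newton iterations to find the mode in theta
    theta = 0.0
    for _ in range(50):
        grad = y - math.exp(theta) - theta / sigma ** 2
        hess = -math.exp(theta) - 1.0 / sigma ** 2
        theta -= grad / hess
    # Laplace: log int exp(f(t)) dt ~ f(mode) + 0.5 * log(2*pi / -f''(mode))
    hess = -math.exp(theta) - 1.0 / sigma ** 2
    return log_joint(theta, y, sigma) + 0.5 * math.log(2 * math.pi / -hess)

def quadrature_log_marginal(y, sigma, lo=-10.0, hi=10.0, n=20001):
    # brute-force trapezoid rule, for comparison only
    h = (hi - lo) / (n - 1)
    vals = [math.exp(log_joint(lo + i * h, y, sigma)) for i in range(n)]
    return math.log(h * (sum(vals) - 0.5 * (vals[0] + vals[-1])))
```

In this toy case the Laplace value agrees closely with the brute-force integral; in real models the latent field is high-dimensional and quadrature is no longer feasible, which is the point of the approximation.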

We’d like for people to fit their favorite latent Gaussian models with the embedded Laplace approximation, either using some of the built-in likelihoods or by specifying their own likelihood. We want to make sure the documentation provides enough guidance for users to write down their model.

Preliminary documentation for the embedded Laplace approximation is available here. It will continue to be updated during the release cycle, so check back!

Some additional materials are available here, though please note that the syntax has changed since these older materials were written.

Changes for constrained types

This release includes a new constrained type, sum_to_zero_matrix. It is similar to the existing sum_to_zero_vector type, but for a matrix whose rows and columns each sum to zero. See the preliminary documentation.
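To see where the degrees of freedom go, here is a plain-Python illustration (this is not the transform Stan actually uses internally): an m × n matrix whose row and column sums are all zero has (m − 1)(n − 1) free entries, and one simple construction fills the last row and column to balance the rest.

```python
def sum_to_zero_matrix(free, m, n):
    # free: (m - 1) * (n - 1) unconstrained values
    assert len(free) == (m - 1) * (n - 1)
    M = [[0.0] * n for _ in range(m)]
    for i in range(m - 1):
        for j in range(n - 1):
            M[i][j] = free[i * (n - 1) + j]
        M[i][n - 1] = -sum(M[i][: n - 1])  # last column balances each row
    for j in range(n):
        # last row balances each column (and therefore sums to zero itself)
        M[m - 1][j] = -sum(M[i][j] for i in range(m - 1))
    return M
```

Stan's actual transform differs (it is chosen for better sampling geometry); this only illustrates the constraint and its parameter count.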

Additionally, the existing simplex and stochastic matrix types have been updated to use a new transformation under the hood, which should allow for more efficient sampling in some cases. See the preliminary documentation.

Built-in constraints exposed as functions

Building on the 2.36 release, which added the jacobian += statement and _jacobian functions, this release adds functions that expose the implementations of Stan’s built-in constraints to the user.

Each of the built-in transforms now has three corresponding functions:

  • <transform>_constrain - applies the inverse (constraining) transform

  • <transform>_jacobian - applies the inverse (constraining) transform and increments the Jacobian adjustment term accordingly

  • <transform>_unconstrain - applies the (unconstraining) transform

The documentation for these new functions can be previewed here.
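As a conceptual sketch of the pattern (in Python rather than Stan, with illustrative names and not Stan's exact signatures), here is what the three variants do for a lower-bound-at-zero transform, whose constraining map is exp:

```python
import math

def positive_constrain(y):
    # unconstrained -> constrained
    return math.exp(y)

def positive_jacobian(y, lp):
    # constrain and also increment the log density accumulator by the
    # log absolute Jacobian determinant: log |d exp(y) / dy| = y
    return math.exp(y), lp + y

def positive_unconstrain(x):
    # constrained -> unconstrained
    return math.log(x)
```

In Stan itself the _jacobian variants increment the jacobian accumulator introduced in 2.36 rather than taking it as an argument; the explicit lp here just makes the bookkeeping visible.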

Other changes

The default number of significant figures saved by CmdStan has been increased from 6 to 8.

CmdStan now allows the user to specify output filenames as a comma-separated list when requesting multiple chains.

The hypergeometric functions (hypergeometric_pFq, as well as specializations for 1F0, 2F1, and 3F2) are now available in the language. Preliminary documentation is available here.
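For readers unfamiliar with the family: pFq is defined by a power series with ratios of rising factorials as coefficients. A quick pure-Python check of the 2F1 member against a known closed form (illustrative code, not Stan's implementation):

```python
import math

def hyp2f1(a, b, c, z, terms=200):
    # Truncated defining series of the Gauss hypergeometric function 2F1
    # (converges for |z| < 1): sum over n of (a)_n (b)_n / (c)_n * z^n / n!
    total, term = 1.0, 1.0
    for n in range(terms):
        term *= (a + n) * (b + n) / (c + n) * z / (n + 1)
        total += term
    return total

# Identity: 2F1(1, 1, 2, z) = -log(1 - z) / z
print(hyp2f1(1, 1, 2, 0.5))  # matches -log(0.5) / 0.5
```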

This release fixed an issue where the laplace_sample functionality was using more memory than necessary, and implemented an improvement to the memory usage of the Pathfinder algorithm when PSIS sampling is not requested.

More details on all of the above and more are available in the preliminary release notes.

How to install?

Download the tar.gz file from the link above, extract it, and use it the way you use any CmdStan release. We also have an online guide available at the CmdStan User’s Guide.

If you are using cmdstanpy you can install the release candidate using


cmdstanpy.install_cmdstan(version='2.37.0-rc2')

With CmdStanR you can install the release candidate using


cmdstanr::install_cmdstan(version = "2.37.0-rc2", cores = 4)

Note: for the best experience, you may want to (re-)install your copy of cmdstanr/py from the latest development branches.


I for one welcome our new laplace and constraint overlords…er I mean functions



Thanks for this great release everyone! Your hard work is much appreciated. The new simplex transform reminded me of the log Dirichlet distribution that @spinkney posted here once. Are there any plans of including that in Stan at some point? I am working on some models where I never have to leave the log scale for the simplex.

Hi!

I have played a bit with the Laplace approximation, which looks like a huge feature. There is one thing I ran into which I find odd: there are specialized functions like

laplace_marginal_poisson_log_lupmf and the like

However, with the general interface (laplace_marginal) it is not possible to pass a likelihood_function that itself calls a _lupmf function (i.e., an unnormalized pmf). The reason is that the very first argument of the likelihood_function must be a real vector representing the parameters we want to integrate out. But if the likelihood_function is itself unnormalized, the parser enforces that it be declared as an _lpmf function… and those functions are only allowed to have an int as their first argument!

In short, the unnormalized densities for discrete distributions cannot be used inside the likelihood_function… but calculating normalization constants is really costly.

Maybe I overlooked something?

Wouldn’t it make more sense for the vector of parameters to integrate out to be the last argument? Then the data could still be the first argument of the likelihood function, as usual…

Best,
Sebastian

This is not really the issue, because you can call a _lupmf from inside an _lpdf, so the vector as the first argument is not what is stopping you. But if you try this, you’ll find that the parser instead gives you an error like

Function 'll_function_lupdf' does not have a valid signature for use in 'laplace_marginal_tol':
Expected a pure function but got a probability density function.

On a technical level, we could relax this restriction (we’ve already worked out how, in the reduce_sum code). On a mathematical level, I’m not sure whether the approximation assumes the density is normalized, which would be a question for @charlesm93.

Release Candidate 2

Please note that earlier this morning we released an rc2, which resolves several compilation issues and a major runtime issue with the embedded Laplace approximation. If you have been trying out rc1, we highly encourage you to update and try rc2!
