Hi Stan community,
Colleagues and I over at Tweag created a web service called Chainsail that can drastically improve sampling of multimodal distributions, which often occur in models with non-identifiable parameters or when the data are ambiguous. Chainsail has flexible support for models and probability distributions defined in PyMC, Stan, or hand-written Python.
The secret sauce in Chainsail is an autotuning Replica Exchange algorithm that uses cloud computing to scale dynamically beyond the computing resources available on single machines. If you want to learn more about Replica Exchange, here’s a shameless plug: I wrote a blog post about it.
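For readers who haven't met Replica Exchange before, here is a minimal, self-contained sketch of the general idea on a toy bimodal target. To be clear, this is not Chainsail's implementation: the target, the inverse temperatures (`betas`), the random-walk Metropolis local moves and all function names are assumptions chosen for illustration, and nothing is autotuned here.

```python
import math
import random


def log_prob(x):
    # Toy bimodal target (unnormalized): equal mixture of two
    # unit-variance Gaussians centered at -4 and +4.
    a = -0.5 * (x + 4.0) ** 2
    b = -0.5 * (x - 4.0) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def accept(log_ratio, rng):
    # Metropolis criterion: accept with probability min(1, exp(log_ratio)).
    return rng.random() < math.exp(min(0.0, log_ratio))


def replica_exchange(n_iters=5000, betas=(1.0, 0.5, 0.25, 0.1), step=1.0, seed=0):
    rng = random.Random(seed)
    # All replicas start in the left mode; replica i targets p(x) ** betas[i].
    states = [-4.0] * len(betas)
    samples = []
    for _ in range(n_iters):
        # 1) Local random-walk Metropolis move within each replica.
        for i, beta in enumerate(betas):
            proposal = states[i] + step * rng.gauss(0.0, 1.0)
            if accept(beta * (log_prob(proposal) - log_prob(states[i])), rng):
                states[i] = proposal
        # 2) Attempt to swap states between neighboring replicas.
        for i in range(len(betas) - 1):
            log_ratio = (betas[i] - betas[i + 1]) * (
                log_prob(states[i + 1]) - log_prob(states[i])
            )
            if accept(log_ratio, rng):
                states[i], states[i + 1] = states[i + 1], states[i]
        # Only the beta = 1 replica samples the actual target.
        samples.append(states[0])
    return samples


samples = replica_exchange()
print("fraction of samples in the right-hand mode:",
      sum(s > 0 for s in samples) / len(samples))
```

The "hot" replicas (small `beta`) see a flattened distribution and cross between modes easily; the swap moves then carry those states down to the `beta = 1` replica. Passing `betas=(1.0,)` turns the swaps off and gives plain Metropolis for comparison, which on a target like this tends to stay near the mode it started in for long stretches.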
We’re currently looking for probabilistic programming practitioners who might be interested in beta-testing it. If you have multimodal distributions to sample, send us an email (address below) so that we can authorize your email address. Don’t hesitate to tell us a bit about your sampling problem or to ask us for a demo: we would be happy to chat and show you around. Currently, Chainsail is deployed on Tweag’s premises and is available to beta testers for free (within reasonable computing time limits).
Future Chainsail development depends mostly on beta tester feedback, but faster Stan support, a better HMC implementation (possibly using BlackJAX) and more choices for the tempering schemes (for example, applying a temperature only to the likelihood) would be among the next things to work on.
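To make the remark about tempering schemes a bit more concrete, here is a small sketch of the difference between tempering the whole posterior and tempering only the likelihood. The toy model (standard normal prior, unit-variance Gaussian likelihood) and all function names are made up for illustration and say nothing about how Chainsail does or will implement this.

```python
def log_prior(theta):
    # Hypothetical toy prior: standard normal (unnormalized).
    return -0.5 * theta ** 2


def log_likelihood(theta, data):
    # Hypothetical toy likelihood: unit-variance Gaussian around theta.
    return sum(-0.5 * (y - theta) ** 2 for y in data)


def log_tempered_full(theta, data, beta):
    # Temperature applied to the whole posterior:
    # beta -> 0 flattens everything towards a uniform distribution.
    return beta * (log_prior(theta) + log_likelihood(theta, data))


def log_tempered_likelihood_only(theta, data, beta):
    # Temperature applied only to the likelihood:
    # beta -> 0 recovers the prior instead of a flat distribution.
    return log_prior(theta) + beta * log_likelihood(theta, data)


data = [1.2, 0.8, 1.1]
for beta in (1.0, 0.5, 0.0):
    print(beta,
          log_tempered_full(0.3, data, beta),
          log_tempered_likelihood_only(0.3, data, beta))
```

Both schemes coincide with the untempered posterior at `beta = 1`; they differ in what the hottest replica samples from, which matters when a flat distribution over the parameters is improper or simply a poor place to explore from.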
Chainsail is currently closed-source, but it is highly likely that we will eventually make at least parts of the service, if not all of it, open-source.
If you’d like to learn more about Chainsail, here are a few additional resources:
- we shot a ~15-minute walkthrough video that demonstrates how to use Chainsail,
- we wrote an announcement blog post that presents Chainsail and the kind of problems it solves,
- we have a repository that provides more detailed documentation and example probability distributions. It also hosts the chainsail-helpers Python package, which provides probability distribution interfaces for PyMC and Stan and a helper script to process the downloaded sampling results: GitHub - tweag/chainsail-resources: Examples, documentation and other additional resources related to Chainsail,
- and, in a couple of days, we will publish another blog post in which we use Chainsail to analyze a soft k-means model, that is, to fit a Gaussian mixture with unequal weights (in one dimension for easy visualization), and show how using Replica Exchange via Chainsail makes a difference.
Chainsail is at an early stage and currently has a few major limitations:
- the big one first: sampling a model defined in Stan works, but is very, very, very slow,
- it uses only a very basic, untuned HMC implementation (no NUTS, fixed and preset number of integration steps, simple heuristic to tune the timestep),
- only the most important parameters can currently be set by the user,
- it’s in its infancy, so expect glitches and rough edges :-)
If you have any questions about Replica Exchange or about what the Chainsail service can and cannot do, or if you’d like to test it, please don’t hesitate to let me know in this thread or email us: firstname.lastname@example.org. Looking forward to hearing your questions, opinions and ideas!