Hello, Stan community!
I’d like to make you aware of a new tool we’re developing here at Flatiron: MCMC Monitor. It allows monitoring of Stan runs, both in progress and after completion. Consider it a ShinyStan that also works in an online context, as chains evolve.
More information at the GitHub README: GitHub - flatironinstitute/mcmc-monitor: Monitor MCMC runs in the browser
Installation is straightforward: all you need to do is set up a lightweight service on the machine that’s running the Stan program (so that it has access to the Stan process’ output directory). Run your model, and you can view the program output via any web browser. The tool offers visualization both locally (from the same machine) and, with appropriate configuration, remotely over the Internet.
The headline display is on-the-fly per-chain plots of variables of interest, like the following:
The tool also offers tabular data (including overall effective sample size and R-hat), plots of autocorrelation and histograms, and more. I would particularly highlight the ability to plot different variables against each other in pairs (2d plots) or triplets (3d plots).
For a quick look at the tool against some small example runs, see here: MCMC Monitor
MCMC Monitor is still in beta and undergoing active development, and we would welcome any feedback you have, either through issues on our Github or by responding to this post.
This is super cool. I want to add that the devs, Jeff Soules (@JSoules, OP) and Jeremy Magland (@magland), are super responsive and just kept implementing features @WardBrian and I asked for until they had a pretty much complete ShinyStan equivalent, but one that works on draws as they are being produced (or after they have already been produced). In other words, they were serious when they said they would welcome feedback either here or on GitHub.
The other thing I want to point out is that there’s nothing at all Stan specific about MCMC Monitor. It only assumes draws are being produced by appending to CSV files.
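To illustrate that CSV assumption, here is a minimal sketch of a toy sampler that appends draws to a Stan-style CSV file one row at a time. The column names, file path, and random-walk "sampler" are all made up for illustration; the point is only that any program writing output this way could be watched, Stan or not.

```python
# Hypothetical sketch: any sampler that appends draws to a Stan-style CSV
# could be monitored this way. Column names and path are illustrative.
import csv
import random

def run_toy_sampler(path, n_draws=100):
    """Append draws one row at a time, the way a Stan chain writes output."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["lp__", "mu", "sigma"])  # Stan-style header row
        state = 0.0
        for _ in range(n_draws):
            state += random.gauss(0.0, 0.1)  # toy random-walk "chain"
            writer.writerow([-0.5 * state**2, state, 1.0])
            f.flush()  # flush so a monitor sees each draw as it lands

run_toy_sampler("chain_1.csv")
```

Flushing after each row matters here: it is what makes the partial output visible to an external reader while the run is still in progress.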
I should also point out that this doesn't have the latest implementations of ESS or R-hat: it's just the traditional ESS computed independently per chain, and the original R-hat definition. We're hoping to update it to the implementation we're building in Bayes Kit, our project to build out simple plug-and-play MCMC tools in Python that are agnostic as to how densities (and optionally gradients and Hessians) are coded.
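For readers unfamiliar with the distinction: the "original R-hat definition" mentioned above is the classic Gelman-Rubin statistic, computed over whole chains with no splitting or rank-normalization. A minimal sketch (not MCMC Monitor's actual code) of that definition:

```python
# A minimal sketch of the original (Gelman-Rubin) R-hat: whole chains,
# no splitting, no rank-normalization. Not MCMC Monitor's actual code.
import math
import random

def rhat(chains):
    """chains: list of equal-length lists of draws for one parameter."""
    m = len(chains)
    n = len(chains[0])
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    # between-chain variance B and mean within-chain variance W
    b = n / (m - 1) * sum((mu - grand) ** 2 for mu in means)
    w = sum(sum((x - mu) ** 2 for x in c) / (n - 1)
            for c, mu in zip(chains, means)) / m
    var_hat = (n - 1) / n * w + b / n  # pooled variance estimate
    return math.sqrt(var_hat / w)

# Two well-mixed chains from the same distribution give R-hat near 1;
# chains stuck in different places give R-hat well above 1.
random.seed(0)
mixed = [[random.gauss(0, 1) for _ in range(1000)] for _ in range(2)]
stuck = [[0.0, 1.0, 2.0], [10.0, 11.0, 12.0]]
print(rhat(mixed), rhat(stuck))
```

The newer split, rank-normalized R-hat variants catch some pathologies (e.g. a chain whose mean is fine but whose variance drifts over the run) that this original form misses.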
Thanks for sharing this. I must say it is not clear to me what the applications are for such a tool. I always try to set things up so that fitting is fast, or I throw things on a cluster with plenty of resources so they run in the background for me. In a modelling exercise I am quite impatient about making progress, and waiting for the sampler is the last thing I like doing (but I have to). Instead of watching my chains work, I'd rather get a coffee or think about the next model.
I do not want to be impolite here at all… there is one application of such an online monitoring tool that I would consider helpful: if the model does not converge sufficiently fast within a given number of iterations, then I would like to stop wasting resources and abort the fit. A model not converging reasonably fast (highly dependent on context) is often a sign of a coding error, of non-identifiability in the model setup, or a combination of the two. So what about an auto-kill feature for the sampler whenever some user-specified criteria are not met?
Another application I see is performance debugging, if that is possible: do certain parameter draws bog down the ODE solver, for example? This would require measuring the time from draw to draw and then filtering out the draws that took long to obtain (still not ideal, since we don't get to "see" all the attempted parameters in the final MCMC output from NUTS).
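The draw-timing idea could be prototyped outside the sampler entirely: record a wall-clock timestamp as each draw lands in the output file, then flag draws whose arrival gap is far above typical. A sketch under those assumptions (the function, threshold, and data are illustrative, not MCMC Monitor features):

```python
# Hypothetical sketch of the draw-timing idea: given wall-clock timestamps
# recorded as each draw appeared, flag draws much slower than typical,
# e.g. ones where a stiff ODE solve bogged down. Illustrative only.

def slow_draws(timestamps, slow_factor=5.0):
    """Return (draw_index, seconds) for draws much slower than the median."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    typical = sorted(gaps)[len(gaps) // 2]  # median draw-to-draw time
    return [(i + 1, g) for i, g in enumerate(gaps)
            if g > slow_factor * typical]

# Draws arriving ~0.1 s apart, except draw 3, which took ~2 s:
stamps = [0.0, 0.1, 0.2, 2.2, 2.3, 2.4, 2.5]
print(slow_draws(stamps))  # flags draw 3 as slow
```

As the post notes, this only sees accepted draws, so the rejected leapfrog trajectories that actually burned the time stay invisible; but it could still localize which region of parameter space is expensive.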
Thanks @wds15. You’ve highlighted some important considerations.
We seem to agree on the first use-case: the ability to terminate a simulation that isn’t converging within a reasonable time frame. This is indeed one of the key applications of MCMC Monitor. It’s designed to help users decide if they should stop the sampler early, particularly if the chains don’t appear to be converging quickly enough or if the iterations are taking longer than expected. This would enable users to save on resources and time.
Another benefit of MCMC Monitor is that it can provide insights about the results before the sampling process is complete. This could potentially give you a head-start on planning your next modeling task or tweaking the current one, letting you make the most of your time while you wait.
Your idea of having a criterion for stopping early is a good one. While the current scope of MCMC Monitor might not extend to initiating the termination itself (since it's primarily a monitoring tool), it could certainly provide the information a larger framework would need to stop the process automatically.
Finally, MCMC Monitor can also be used for exploring the output after the run has completed. This can serve as an alternative or complement to other tools such as ShinyStan, providing flexibility in post-modeling analysis.
(As for the suggestion about debugging performance and its relation to specific parameter draws, I agree that this could be an excellent addition to the tool. Perhaps others have an idea about how to move forward with that.)