Running large numbers of models, configurations, etc. to explore a problem can easily become overwhelming. This problem is shared with building and tuning machine learning systems, and I am aware of at least one tool that attempts to help with it: MLFlow (https://mlflow.org/).
Some questions:
Are there other packages that people particularly like/don't like, etc.?
Does anyone have experience with MLFlow? Any feedback would be appreciated; my initial experiments in a Databricks environment seem OK.
I think @rybern is working on related issues of organizing a family of models for his dissertation.
I can never figure out what these web products do from their home pages. Their doc page is better, and I think their use of “lifecycle” corresponds to our use of “workflow”. It looks like some kind of web app for managing and sharing results. It says it’s application agnostic, so I’m curious how hard it was to integrate Stan into it and what you used it for.
Thanks @Bob_Carpenter, yes, I’m exploring an abstraction that should be helpful for organizing and automating model exploration. Andrew posted a video of it here (it’s a very bad video, I’ll replace it soon!)
You might also check out the infrastructure used by the SBC package. I also made an SBC framework of my own, using targets directly (after finding stantargets too limited at the time).
Funny timing: I just gave a talk on Stan + MLFlow at StanConnect Ecology part 1. We've used it extensively over the past few months to great effect. I find it works well once you get things set up, and it fits neatly into a Bayesian workflow/MLOps pipeline.
Edit: The compelling use case for me is that a Bayesian workflow involves a lot of experiments. Tracking those experiments helps you organize your work and see more systematically whether your development effort on a model is resulting in improvements. That said, diligent experiment tracking is hard when it must be done manually. The value proposition of tools like MLFlow is that they automate this tracking, which makes it easier to navigate the Bayesian workflow. As a nice side effect, MLFlow also provides a way to share results and deploy models more easily.
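To make the "automates this tracking" point concrete, here is a minimal sketch of what a tracked Stan run can look like with the MLFlow Python client and CmdStanPy. The model file, data, and the particular metrics logged are placeholder choices, and the summary column names may differ across CmdStanPy versions:

```python
# Sketch: logging a CmdStanPy run to MLFlow. Assumes cmdstanpy and mlflow are
# installed and a tracking URI (or local ./mlruns directory) is configured.
# The model file, data, and logged quantities are placeholders.
import mlflow
from cmdstanpy import CmdStanModel

model = CmdStanModel(stan_file="bernoulli.stan")
data = {"N": 10, "y": [0, 1, 0, 0, 0, 0, 0, 0, 0, 1]}

with mlflow.start_run(run_name="bernoulli-baseline"):
    # Record the sampler configuration so later runs are comparable
    sampler_args = {"chains": 4, "iter_sampling": 1000, "adapt_delta": 0.9}
    mlflow.log_params(sampler_args)

    fit = model.sample(data=data, **sampler_args)

    # Log a few scalar summaries/diagnostics as metrics
    summary = fit.summary()
    mlflow.log_metric("theta_mean", float(summary.loc["theta", "Mean"]))
    mlflow.log_metric("theta_rhat", float(summary.loc["theta", "R_hat"]))

    # Keep the Stan program and the raw sampler output as artifacts
    mlflow.log_artifact("bernoulli.stan")
    for csv_file in fit.runset.csv_files:
        mlflow.log_artifact(csv_file)
```

Each run then shows up in the tracking UI with its parameters, metrics, and artifacts, so you can compare model versions side by side.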
Gin provides a lightweight configuration framework for Python, based on dependency injection. Functions or classes can be decorated with @gin.configurable , allowing default parameter values to be supplied from a config file (or passed via the command line) using a simple but powerful syntax. This removes the need to define and maintain configuration objects (e.g. protos), or write boilerplate parameter plumbing and factory code, while often dramatically expanding a project’s flexibility and configurability.
Gin is particularly well suited for machine learning experiments (e.g., using TensorFlow), which tend to have many parameters, often nested in complex ways.
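For illustration, here is the basic pattern Gin enables; the function name, its parameters, and the config.gin contents are made up for this sketch:

```python
# Sketch of the Gin pattern described above: default parameter values are
# bound from a config file instead of being hard-coded or plumbed through.
import gin

@gin.configurable
def run_experiment(num_chains=4, iter_sampling=1000, adapt_delta=0.8):
    print(f"chains={num_chains}, draws={iter_sampling}, adapt_delta={adapt_delta}")

# config.gin might contain:
#   run_experiment.num_chains = 8
#   run_experiment.adapt_delta = 0.99
gin.parse_config_file("config.gin")
run_experiment()  # picks up the values bound in config.gin
```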
To add some more thoughts based on our experience so far with MLFlow + Stan:
Pros
MLFlow is lightweight, doesn’t require major changes to existing code
MLFlow has both an R and Python API
Easily share model results by pointing people to a run’s URL
Integration with Azure Databricks is easy, but MLFlow does not lock you into using Azure
MLFlow works with any modeling framework (Stan, brms, lm, pytorch, random forests, scikit-learn, whatever)
The model registry is nice to have, as a way to promote an experimental model to production (see the sketch after this list)
Open source, and actively maintained
MLFlow has been around for a while, is on version 1+, and seems to be in a sweet spot in terms of maturity, community, and active development/maintenance
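To illustrate the model registry item above, here is a sketch (not our exact setup) of promoting a run's model, assuming that run has already logged an MLFlow model artifact (e.g. via mlflow.pyfunc.log_model); the run ID and model name are placeholders:

```python
# Sketch: registering a logged model and promoting it to "Production".
# Assumes the run logged an MLFlow model under the artifact path "model";
# run_id and the registered model name are placeholders.
import mlflow
from mlflow.tracking import MlflowClient

run_id = "abc123"  # ID of the experimental run we want to promote
result = mlflow.register_model(f"runs:/{run_id}/model", "my-stan-model")

client = MlflowClient()
client.transition_model_version_stage(
    name="my-stan-model",
    version=result.version,
    stage="Production",
)
```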
Cons
Installing the mlflow R package can be challenging, depending on your Python environment hygiene (it requires reticulate and an mlflow conda environment)
Because MLFlow is so generic, it lacks some features that would be nice to have in a Bayesian setting (e.g., there's no built-in support for comparing distributions from one experiment to the next; see the sketch after this list for one possible workaround).
I have not found the built-in visualizations in the web interface to be particularly useful
The R and Python APIs are not unified (e.g., it's not like Earth Engine's one-to-one mapping from JavaScript to Python), which can be confusing and sometimes requires reading both languages' docs to figure out how to do things
I initially found the R documentation to be hard to follow
The R API seems to be somewhat of a second-tier priority relative to the Python API (though this is probably a fair prioritization given the composition of MLFlow users)
I would love to see more built-in support for Stan in particular and PPLs more generally. Looking at the examples in their GitHub repo, there's definitely a focus on languages/frameworks that are used more in a machine learning context (but see the Prophet example): mlflow/examples at master · mlflow/mlflow · GitHub
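On the lack of built-in distribution comparison mentioned above, one workaround we could imagine (not an MLFlow feature, just a sketch) is to log posterior quantiles as metrics, which the tracking UI can compare across runs, and keep the full draws as an artifact. The variable name and file name here are placeholders:

```python
# Sketch of a workaround for comparing posterior distributions across runs:
# log quantiles as metrics (comparable in the MLFlow UI) and save the full
# draws as an artifact. "theta" and the file name are placeholders.
import numpy as np
import mlflow

def log_posterior_quantiles(draws, name):
    """Log a handful of quantiles of a 1-D array of draws as metrics."""
    for q in (5, 25, 50, 75, 95):
        mlflow.log_metric(f"{name}_q{q:02d}", float(np.percentile(draws, q)))

with mlflow.start_run(run_name="distribution-comparison-workaround"):
    theta_draws = np.random.beta(2, 8, size=4000)  # stand-in for posterior draws
    log_posterior_quantiles(theta_draws, "theta")
    np.save("theta_draws.npy", theta_draws)
    mlflow.log_artifact("theta_draws.npy")
```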