Cookiecutter-cmdstanpy: a template for cmdstanpy projects

I am writing a template for cmdstanpy analyses called cookiecutter-cmdstanpy and would be really grateful for any feedback on the work done so far.

The overall idea is to avoid repeating work - instead of manually creating a bunch of folders and writing the same code every time you start a new cmdstanpy project, you can start with the template. Ideally, you would only have to make changes that reflect the parts of your analysis that are genuinely unique.

The template is based on cookiecutter, so you should be able to use it with the following commands:

pip install cookiecutter
cookiecutter https://github.com/teddygroves/cookiecutter-cmdstanpy

This should give you a file structure like this:

├── LICENSE
├── Makefile
├── README.md
├── bibliography.bib
├── data
│   ├── fake
│   │   └── readme.md
│   ├── prepared
│   │   └── readme.md
│   └── raw
│       ├── raw_measurements.csv
│       └── readme.md
├── fit_fake_data.py
├── fit_real_data.py
├── prepare_data.py
├── report.md
├── requirements.txt
├── results
│   ├── infd
│   │   └── readme.md
│   ├── input_data_json
│   │   └── readme.md
│   ├── loo
│   │   └── readme.md
│   ├── plots
│   │   └── readme.md
│   └── samples
│       └── readme.md
└── src
    ├── cmdstanpy_to_arviz.py
    ├── data_preparation.py
    ├── fake_data_generation.py
    ├── fitting.py
    ├── model_configuration.py
    ├── model_configurations_to_try.py
    ├── pandas_to_cmdstanpy.py
    ├── readme.md
    ├── stan
    │   ├── custom_functions.stan
    │   ├── model.stan
    │   └── readme.md
    └── util.py

The repository, and in particular the scripts prepare_data.py, fit_fake_data.py and fit_real_data.py, should work straight away, based on the Stan file at src/stan/model.stan and the raw measurements at data/raw/raw_measurements.csv. You should also be able to run make clean_all to get rid of created files and (if pandoc is available) make report.pdf to create a PDF report.

To implement a custom analysis you need to modify or replace src/stan/model.stan and data/raw/raw_measurements.csv and then tweak the Python files in the src directory.

I’d really like to know what some current or potential cmdstanpy users think of this project. In particular:

  • Do you have a similar solution for avoiding repeated work when you start a new project?
  • How well does the overall structure match how you use cmdstanpy?
  • What do you think is the right balance between functionality and flexibility here? I tried to make the template as unopinionated as possible, but I don’t really know how other people tend to use cmdstanpy so I suspect it might be a bit too focused on my typical workflow.
  • Are there specific choices you would have made differently?
  • What is the biggest missing feature? I have listed a few here but I probably missed some.

Any help or opinions are very gratefully received!

10 Likes

hi Teddy,

totally sympathetic to this idea.
two potential problems:
a) every analysis is just different enough to be problematic
b) keeping the template simple enough to be immediately grok-able

regarding arviz - the cmdstanpy to arviz conversion should work better after the recent refactor.

also, I’ve got an example of distributing cmdstanpy jobs over slurm - would that be worth adding to the template?

cheers,
Mitzi

2 Likes

I think JSON might be a better format for the data (especially if there are n-dimensional arrays).
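
For example (just a sketch, with made-up variable names), a 2-dimensional array goes straight into Stan-readable JSON as nested lists:

import json

import numpy as np

# hypothetical example: a 2-d covariate matrix that a flat csv can't hold tidily
x = np.random.default_rng(1234).normal(size=(50, 3))

stan_input = {
    "N": x.shape[0],
    "K": x.shape[1],
    "x": x.tolist(),  # nested lists are valid Stan JSON input for n-dim arrays
}

with open("input_data.json", "w") as f:
    json.dump(stan_input, f)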

2 Likes

FYI, I’ve had some issues, like divergent transitions not being displayed in pair plots, and some issues (which might have been on my side) with array ordering (column- vs row-major). I haven’t gotten around to filing a bug report or anything yet.

1 Like

Yeah, that is something to think about - maybe the wizard should ask the user whether they would rather input CSV or JSON data and then structure the repository accordingly.

1 Like

That’s a big challenge - I was surprised by how abstract the code I ended up writing (and implicitly asking any user to understand) turned out to be, just in order to allow optionally varying the Stan program, priors and covariates. I’m fairly sure there’s room for improvement there, but I guess there are also some things it’s just not worth making a template for. My intuition is that I do enough repeated work that some kind of template must be worthwhile, but it’s hard to tell in advance what would be most useful.

Definitely!

I’ve found the best approach to be to subclass cmdstanpy’s classes. But then again, I wanted more than what cmdstanpy currently provides.

subclass cmdstanpy’s classes

cool! just curious - what did you want and which classes did you subclass?

Hm, only CmdStanModel and CmdStanMCMC and just for a few things, mainly for convenience. Except for the following:

  • I wanted to be able to (semi-)gracefully kill runs/chains. Whatever else I tried always resulted in the chains staying alive, messing up my timings. The solution was to reimplement (copy & paste) CmdStanModel’s sample method, grabbing the ThreadPoolExecutor instance and the subprocesses and passing them to the subclassed CmdStanMCMC instance, which then had all the information needed to terminate rogue chains and to work with the partial results (the termination step is sketched after this list).
  • I wanted to be able to pool all warmup samples (from the last metric window) and compute the covariance matrix. This is definitely a non-portable hack, because it depends on all parameters being unconstrained (the metric works on the unconstrained values, while Stan only gives me the constrained values).
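
For what it’s worth, the termination step at least can be sketched without touching cmdstanpy internals, assuming you have kept references to the chain subprocesses somewhere (chain_procs below is a hypothetical list of subprocess.Popen handles):

import subprocess
from typing import List

def terminate_chains(chain_procs: List[subprocess.Popen], timeout: float = 5.0) -> None:
    # ask each still-running chain to stop, escalating to kill if necessary
    for proc in chain_procs:
        if proc.poll() is None:  # chain is still running
            proc.terminate()     # SIGTERM, so CmdStan gets a chance to flush its csv output
    for proc in chain_procs:
        try:
            proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            proc.kill()          # give up and SIGKILL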

Other than that, I wanted the following things for convenience. I only looked at the source files for the above two classes, so I might have just missed some stuff:

  • Be able to work with fit objects where only some of the chains finished (see above).
  • Get per chain timings. Did I miss some functionality?
  • Collapse the stan summary to a boolean (i.e. everything alright or not).
  • Sometimes the ordering of arrays of vectors didn’t seem to be right. But I’m not sure anymore.
  • Randomly pick a draw to initialize a new run.
  • Inspect the source code and get functions/model blocks.
  • Restart sampling from a fit object.
  • Do some weird incremental warmup.
  • Addendum: I wanted the model to be able to infer some of the data from other parts of the data, like for example the length of an array.
  • Also: being able to sample from the prior.

Final edit:

  • Also, the sample function does not seem to play nicely with numpy’s ndarray for the step size, and I think I could not simply pass multiple metrics or inits.
1 Like

ArviZ might work in this case (tbh, there could be errors too)

I think these can be accessed from the stdout files?

Maybe ArviZ summary might be easier to work with?
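
Something like this sketch, for example (the thresholds are just placeholders):

import arviz as az

def diagnostics_ok(idata: az.InferenceData, rhat_max: float = 1.01, ess_min: float = 400.0) -> bool:
    # collapse the whole summary into a single pass/fail flag
    summary = az.summary(idata)
    rhat_fine = (summary["r_hat"] < rhat_max).all()
    ess_fine = (summary["ess_bulk"] > ess_min).all()
    return bool(rhat_fine and ess_fine)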

Was this after you combined multiple chains? See arviz.InferenceData output, does that look correct?

I think a simple function could work in this case? (It might be easier to get inits from arviz.InferenceData)
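
Roughly like this, for instance (a sketch; note that the draw will also contain transformed parameters and generated quantities, which you may want to drop before using it as inits):

import arviz as az
import numpy as np

def random_draw_inits(idata: az.InferenceData, seed: int = 1234) -> dict:
    # pick one posterior draw at random and return it as an inits dict
    rng = np.random.default_rng(seed)
    posterior = idata.posterior
    chain = int(rng.integers(posterior.sizes["chain"]))
    draw = int(rng.integers(posterior.sizes["draw"]))
    point = posterior.isel(chain=chain, draw=draw)
    return {var: point[var].values.tolist() for var in point.data_vars}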

(source β†’ cpp?)

Were there any problems?

In Stan code?

? (fixed_param + a custom .stan is probably the easiest way to do this)
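
i.e. something like this, where prior_model.stan is a hypothetical copy of the model with no parameters block and the priors drawn via _rng calls in generated quantities (the file paths here are made up):

from cmdstanpy import CmdStanModel

prior_model = CmdStanModel(stan_file="src/stan/prior_model.stan")
# fixed_param skips HMC entirely; the prior draws come from generated quantities
prior_fit = prior_model.sample(data="input_data.json", fixed_param=True, iter_sampling=1000)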

I think I already fixed these issues. But if this is not yet working, then we need to fix it.

no, you didn’t miss that - it’s not scraped out of the Stan csv file; that could be done and added to the RunSet object.

Ari’s correct that it’s also in the stdout files.

what do you mean by β€œrestart sampling”?

  • continue adaptation? in which case, initializing with the step size and mass matrix should work.
  • get more samples? initializing with the step size and mass matrix and skipping warmup will produce more samples. but there’s a problem in that we don’t save the state of the RNG - the seed + chain_id gives the initial offset, but we don’t record how many times the RNG is called during a run. to properly continue, one would need to know both the seed and how far to advance the RNG.
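
the “get more samples” case looks roughly like this in cmdstanpy (a sketch only - the exact form the metric argument wants may differ between cmdstanpy versions, and the RNG caveat above still applies):

from cmdstanpy import CmdStanModel

model = CmdStanModel(stan_file="src/stan/model.stan")
fit = model.sample(data="input_data.json", chains=4)

# continue without re-running warmup, reusing the adapted step sizes and inverse metrics
more_draws = model.sample(
    data="input_data.json",
    chains=4,
    iter_warmup=0,
    adapt_engaged=False,
    step_size=[float(s) for s in fit.step_size],
    metric=[{"inv_metric": m.tolist()} for m in fit.metric],
)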
1 Like

we recommend running multiple chains in order to be able to test convergence: if you don’t have convergence then, whatever your sample is, you have no evidence that you have a set of draws from the posterior and consequently no reason to believe that the samples are valid.

therefore, you need to wait until all chains have finished sampling. granted, this is problematic if you’re not able to run all chains in parallel or otherwise distribute them across a cluster.

1 Like

I thought I’d post an update as I think this project is now more or less usable - at least I’ve found it useful for my own cmdstanpy projects. If anyone is looking for a cmdstanpy template, please try it out and let me know your opinions - I’d really appreciate any feedback.

You can find the project here and fetch it from the command line like this:

pip install cookiecutter
cookiecutter gh:teddygroves/cookiecutter-cmdstanpy

I made quite a few changes with the general aim of being flexible enough to allow arbitrary models, configuration and data processing while keeping things as simple and intuitive as possible. The readme now contains a lot more explanation and some detailed examples.

Here is the new project structure (the writing directory is optional):

.
├── LICENSE
├── Makefile
├── README.md
├── analyse.py
├── data
│   ├── prepared
│   │   └── readme.md
│   └── raw
│       ├── raw_measurements.csv
│       └── readme.md
├── model_configurations
│   ├── interaction.toml
│   ├── interaction_fake_data.toml
│   └── no_interaction.toml
├── prepare_data.py
├── pyproject.toml
├── requirements.txt
├── results
│   └── runs
│       └── readme.md
├── sample.py
├── src
│   ├── data_preparation.py
│   ├── model_configuration.py
│   ├── prepared_data.py
│   ├── readme.md
│   ├── sampling.py
│   ├── stan
│   │   ├── custom_functions.stan
│   │   ├── model.stan
│   │   └── readme.md
│   └── util.py
└── writing
    ├── bibliography.bib
    ├── img
    │   ├── example.png
    │   └── readme.md
    └── report.md

The template comes with a working example model - you should be able to try it out with make analysis straight away. This makes it possible to edit the template incrementally so that it implements your analysis, with the possibility to re-run at any time in order to check your work.

The only abstractions are now ModelConfiguration and PreparedData. A ModelConfiguration is a representation of a toml file with instructions for which modes to run (the choices are currently prior, posterior and k-fold cross-validation), where to find a Stan program and prepared data, and how to run cmdstanpy (including stanc_options, cpp_options and keyword arguments for CmdStanModel.sample). There are a few examples in the folder model_configurations. PreparedData is a definition of what prepared data should look like.
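
To give a flavour of the first abstraction, a model configuration file can be loaded into a small Python object roughly like this (the field names below are illustrative rather than the template’s exact ones):

import tomllib  # Python 3.11+; use the third-party toml package on older versions
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class ModelConfiguration:
    # illustrative stand-in for the template's ModelConfiguration
    name: str
    stan_file: str
    prepared_data_dir: str
    modes: List[str]  # e.g. ["prior", "posterior", "kfold"]
    sample_kwargs: Dict = field(default_factory=dict)  # forwarded to CmdStanModel.sample

def load_model_configuration(path: str) -> ModelConfiguration:
    with open(path, "rb") as f:
        return ModelConfiguration(**tomllib.load(f))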

By default the results are converted to arviz InferenceData and saved in netcdf format to results/runs/<model configuration name>/mode.nc.
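
In code that conversion step is roughly the following, using one of the example configurations and the posterior mode for the file name:

import arviz as az

# fit: a CmdStanMCMC object (e.g. as produced in sample.py)
idata = az.from_cmdstanpy(posterior=fit)
idata.to_netcdf("results/runs/interaction/posterior.nc")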

It’s now possible for the raw data to have any format, with a little editing. The readme gives some examples of how to do this.

I was a little unsure at first whether a cmdstanpy (or more generally statistical analysis) template was a viable project, mainly because, as @mitzimorris says above, the goals of being both flexible and easy to use and understand pull in opposite directions. However, I’m now using the template for any project that threatens to get bigger than a few files, and anecdotally it definitely feels like an improvement on what I was doing before. I also noticed that project templates seem to be very popular in the deep learning world: e.g. this tensorflow_template_application and this pytorch-template have 1.9k and 3.2k GitHub stars respectively.

I’m also curious about whether there are any other statistical analysis templates out there - if anyone knows one please post it!

8 Likes

I’ve done some poking around without finding anything really fleshed out. I’ve been wanting to make a cookiecutter or other template repo for CmdStanPy, but for a slightly different purpose: packages that want to use cmdstanpy/Stan to do pre-baked kinds of analysis (along the lines of Facebook Prophet). There are lots of useful tricks to know about compiling models on install that would be great to spread around.

2 Likes

That sounds like a really nice project. I can also imagine templates for the other Stan interfaces being very useful.

Maybe at some point it might make sense to add a templates section to the Stan documentation page. Before then it probably makes sense to try and form a consensus about whether, and for what kinds of project, it’s a good idea to recommend using templates, and what things a template should and shouldn’t cover. I’m not really sure what the right process is here - maybe write a template specifications design doc, or maybe just encourage experienced people to write and try out templates?

Thanks, @Teddy_Groves1

@mitzimorris: Would it make sense to put links to these cookiecutter templates with CmdStanPy doc? I’d prefer to reduce clutter on our top-level web pages.

As I said in the Stan meeting, I think it’d make sense to have a range of these cookie cutters depending on goal.

Here’s the Molecular Dynamics cookie cutter I mentioned:

GitHub - MolSSI/cookiecutter-cms: Python-centric Cookiecutter for Molecular Computational Chemistry Packages.

It’s a lot more elaborate in that it contains testing, doc, and continuous integration hooks, all of which would be nice for something like a group of people working on a package.

The other question is whether this should all link in Jupyter notebooks. I was under the impression that Jupyter is the way a lot of people interact with Python (somewhere between RStudio and knitr).

1 Like

I think it would make a lot of sense to add a community/downstream section to the CmdStanPy doc which links to things like this and also fbprophet, etc.

2 Likes

and ArviZ!

Thanks a lot for the thoughts and link - I’ve now renamed the template “cookiecutter-cmdstanpy-analysis” and added optional CI, tests and Sphinx docs with working examples.

I think one or more Jupyter notebooks might be a good idea for a model evaluation example, supplementing or replacing the existing analyse.py script, and/or as an alternative documentation option alongside markdown and Sphinx. I personally tend to just use Python scripts for evaluation and markdown for documentation, as I’m always using them already and can’t be bothered to set up Jupyter as well, but maybe that is just because I haven’t found the right template yet!

1 Like