I am writing a template for cmdstanpy analyses called cookiecutter-cmdstanpy and would be really grateful for any feedback on the work done so far.
The overall idea is to avoid repeating work - instead of manually creating a bunch of folders and writing the same code every time you start a new cmdstanpy project, you can start with the template. Ideally, you would only have to make changes that reflect the parts of your analysis that are genuinely unique.
The template is based on cookiecutter, so you should be able to use it with the following commands:
pip install cookiecutter
cookiecutter https://github.com/teddygroves/cookiecutter-cmdstanpy
This should give you a file structure like this:
├── LICENSE
├── Makefile
├── README.md
├── bibliography.bib
├── data
│   ├── fake
│   │   └── readme.md
│   ├── prepared
│   │   └── readme.md
│   └── raw
│       ├── raw_measurements.csv
│       └── readme.md
├── fit_fake_data.py
├── fit_real_data.py
├── prepare_data.py
├── report.md
├── requirements.txt
├── results
│   ├── infd
│   │   └── readme.md
│   ├── input_data_json
│   │   └── readme.md
│   ├── loo
│   │   └── readme.md
│   ├── plots
│   │   └── readme.md
│   └── samples
│       └── readme.md
└── src
    ├── cmdstanpy_to_arviz.py
    ├── data_preparation.py
    ├── fake_data_generation.py
    ├── fitting.py
    ├── model_configuration.py
    ├── model_configurations_to_try.py
    ├── pandas_to_cmdstanpy.py
    ├── readme.md
    ├── stan
    │   ├── custom_functions.stan
    │   ├── model.stan
    │   └── readme.md
    └── util.py
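If you want to confirm that the template generated what you expected, a quick check along these lines might help. This is only a sketch: the function name `missing_paths` and the choice of which paths to check are my own, picked from the tree above, and are not part of the template.

```python
from pathlib import Path

# A handful of paths taken from the tree above; extend the list as needed.
EXPECTED_PATHS = [
    "Makefile",
    "prepare_data.py",
    "fit_fake_data.py",
    "fit_real_data.py",
    "data/raw/raw_measurements.csv",
    "src/stan/model.stan",
]


def missing_paths(project_root):
    """Return the expected paths that do not exist under project_root."""
    root = Path(project_root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]
```

Calling `missing_paths(".")` from the root of a freshly generated project should return an empty list if everything is in place.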
The repository, and in particular the scripts `prepare_data.py`, `fit_fake_data.py` and `fit_real_data.py`, should work straight away, based on the Stan file at `src/stan/model.stan` and the raw measurements at `data/raw/raw_measurements.csv`. You should also be able to run `make clean_all` to get rid of created files and (if pandoc is available) `make report.pdf` to create a PDF report.
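For anyone who prefers to drive the scripts from Python rather than typing them one by one, something like the following could work. It is a rough sketch: the helper names are hypothetical, and the order (prepare the data first, then fit fake and real data) is my assumption about a sensible sequence rather than something the template enforces.

```python
import subprocess
import sys

# Assumed order: prepare the data first, then fit the fake and real data.
SCRIPTS = ["prepare_data.py", "fit_fake_data.py", "fit_real_data.py"]


def workflow_commands(python=sys.executable):
    """Build one command per script, using the given Python interpreter."""
    return [[python, script] for script in SCRIPTS]


def run_workflow(cwd="."):
    """Run each script in turn from the project root, stopping on failure."""
    for command in workflow_commands():
        subprocess.run(command, cwd=cwd, check=True)
```

`run_workflow()` is meant to be called from the root of the generated project; `check=True` makes it raise as soon as one script exits with an error.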
To implement a custom analysis, you need to modify or replace `src/stan/model.stan` and `data/raw/raw_measurements.csv`, and then tweak the Python files in the `src` directory.
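To give a feel for the kind of code that ends up being tweaked, here is a stripped-down sketch of the two steps involved: turning raw measurements into a Stan input dictionary and fitting with cmdstanpy. It is not the code in the template's `src` directory. The column name `y` and the input keys `N` and `y` are hypothetical and would need to match the data block of your own Stan program.

```python
import csv


def build_stan_input(csv_path, value_column="y"):
    """Read one numeric column from a CSV file into a Stan-style input dict.

    The column name "y" and the keys "N" and "y" are hypothetical; they
    must match the data block of your own model.stan.
    """
    with open(csv_path, newline="") as f:
        values = [float(row[value_column]) for row in csv.DictReader(f)]
    return {"N": len(values), "y": values}


def fit(stan_file, stan_input, output_dir):
    """Compile the Stan program and draw posterior samples with cmdstanpy."""
    # Imported lazily so the helper above works without cmdstanpy installed.
    from cmdstanpy import CmdStanModel

    model = CmdStanModel(stan_file=str(stan_file))
    return model.sample(data=stan_input, output_dir=str(output_dir), seed=1234)
```

With a model and data file that agree on those names, `fit("src/stan/model.stan", build_stan_input("data/raw/raw_measurements.csv"), "results/samples")` would return a cmdstanpy fit object whose `summary()` method gives a table of posterior summaries.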
I'd really like to know what some current or potential cmdstanpy users think of this project. In particular:
- Do you have a similar solution for avoiding repeated work when you start a new project?
- How well does the overall structure match how you use cmdstanpy?
- What do you think is the right balance between functionality and flexibility here? I tried to make the template as unopinionated as possible, but I don't really know how other people tend to use cmdstanpy, so I suspect it might be a bit too focused on my typical workflow.
- Are there specific choices you would have made differently?
- What is the biggest missing feature? I have listed a few here but I probably missed some.
Any help or opinions are very gratefully received!