After a lot of people asked me what to read to get started with Stan, I realized there’s a pretty big gap between our simple hello-world examples for the interfaces and our full-blown User’s Guide, which is not at all hands-on.
I’m curious what people do now when someone asks them how to code in Stan. Are there other resources along the lines of what I wrote here? (In retrospect, I should’ve hopped on the forums and asked that before spending a week writing this case study.)
Getting started
To fill the gap, I drafted the following document.
- Bob Carpenter. 2023. Getting started with Stan in Python with cmdstanpy and plotnine.
[edit: Replaced download link with proper web link. ChatGPT even gave me step-by-step instructions on how to configure GitHub pages to serve this as a web page from my GitHub repo. I also spellchecked and was a bit more careful with formatting, though I’m curious how it looks on different OSes, which have different code fonts.]
This is just a draft, so I’d be very happy to get feedback.
100% Python
The case study is indeed pure Python. I’m still learning Python, so any help there would be appreciated. plotnine is a direct call-for-call translation of ggplot based on pandas data frames.
If people like this tutorial, we could easily translate it to R and Julia.
ChatGPT knows pandas and plotnine
The real breakthrough is that I wrote this entire tutorial without looking at a single piece of Python or pandas or plotnine documentation—I just asked ChatGPT (3.5 and 4) how to render all the graphs and do all the data frame manipulations I usually struggle with. I do know the basics of the grammar of graphics (the “gg” in ggplot), but can never remember the function names or arguments. So this is a godsend for working in R, too. For example, I did things like this:
I have a pandas data frame in Python and I would like to pull out the first 10 entries of the column draws for values where the column rho is 0.05, is 0.5, and is 0.95. How do I do that?
And ChatGPT gives me working code in a convenient box that I can cut and paste.
import pandas as pd
# Assuming you have a pandas data frame called 'df'
# Filter the data frame for rho == 0.05, and get the first 10 draws
draws_rho_005 = df.query('rho == 0.05').head(10)['draws']
# Filter the data frame for rho == 0.5, and get the first 10 draws
draws_rho_05 = df.query('rho == 0.5').head(10)['draws']
# Filter the data frame for rho == 0.95, and get the first 10 draws
draws_rho_095 = df.query('rho == 0.95').head(10)['draws']
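To try that kind of query without a data frame on hand, here is a minimal self-contained sketch. The toy data and the helper name first_draws are made up for illustration and are not from the case study. It also swaps the exact == comparison for np.isclose, which is my own substitution: exact equality works when rho was typed in as a literal, but can miss values that were computed.

```python
import numpy as np
import pandas as pd

# Toy data, invented for illustration: 20 draws for each of three rho values.
rng = np.random.default_rng(1234)
rhos = [0.05, 0.5, 0.95]
df = pd.DataFrame({
    "rho": np.repeat(rhos, 20),
    "draws": rng.normal(size=60),
})


def first_draws(df, rho, n=10):
    """Return the first n entries of 'draws' in rows where 'rho' matches rho."""
    # np.isclose tolerates floating-point round-off, unlike an exact ==.
    mask = np.isclose(df["rho"], rho)
    return df.loc[mask, "draws"].head(n)


# One pandas Series of (up to) 10 draws per rho value.
first_by_rho = {rho: first_draws(df, rho) for rho in rhos}
```

Wrapping the filter in a function avoids repeating the same line three times, and a value of rho that isn't in the data just gives back an empty series.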
It’s even useful for Quarto (and I no longer worry about awkward grammar in a query). For example, I asked:
how do I write a bibtex entry for an open source software with a url
To which ChatGPT responds with this example.
@misc{software_key,
author = {Author Name},
title = {Software Title},
year = {Publication Year},
version = {Software Version},
howpublished = {Open Source Software},
url = {https://www.example.com/software-url}
}
ChatGPT knows a bit about Stan, but it’s not nearly as good at Stan as it is at Python.