Including source code + interfaces for Stan User Guide

Hello all,

Andrew G wanted a way to include more complete examples of Stan programs in the User's Guide. I think the idea is that a user can find the example program they want in the User's Guide and then have an R/Python program generate data and run the Stan program, so that Stan examples are easier to run from the most popular interface languages.

I have added the possibility that the Stan programs would also be pulled from disk and inserted into the Rmarkdown docs when the manual is converted to html and pdf. This would then allow the manual code to be unit tested, but I have not gone as far as figuring out how testing would work.
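To make the "pulled from disk" idea concrete, here is a minimal Python sketch. It is not how knitr does it: the `{{include: ...}}` placeholder syntax and the `expand_includes` name are invented purely for illustration, and the real build would use whatever mechanism Rmarkdown provides.

```python
import re
from pathlib import Path

FENCE = "`" * 3  # built this way to avoid a literal fence inside this example

# Hypothetical placeholder syntax, invented for this sketch:
#   {{include: regression/regression_1.stan}}
INCLUDE = re.compile(r"^\{\{include:\s*(?P<path>\S+?)\s*\}\}$")


def expand_includes(doc_text: str, base_dir: Path) -> str:
    """Replace each placeholder line with the named file in a fenced block."""
    out = []
    for line in doc_text.splitlines():
        match = INCLUDE.match(line.strip())
        if match is None:
            out.append(line)  # ordinary line: keep unchanged
            continue
        source = base_dir / match.group("path")
        lang = source.suffix.lstrip(".")  # e.g. "stan", "R", "py"
        out.append(FENCE + lang)
        out.append(source.read_text().rstrip("\n"))
        out.append(FENCE)
    return "\n".join(out) + "\n"
```

Because the document only names the file, the program shown in the manual is always the one on disk.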

The current version is: 1.1 Linear Regression | Stan User’s Guide and https://mc-stan.org/docs/2_22/stan-users-guide-2_22.pdf (page 4)

NOTE: The html and pdf links below are now dead. My suggested rework is at: https://mc-stan.org/test/regression.html and https://mc-stan.org/test/regression.pdf. The rmarkdown source is at: https://github.com/stan-dev/stan-dev.github.io/blob/master/test/regression.Rmd

I have included 3 options in the html/pdf:

  1. link to the files directly,
  2. a footnote giving the hyperlink to the GitHub repo folder,
  3. link to the repo folder.

The structure of the User's Guide source is that each section is a standalone .Rmd file, so I suggest creating a directory of the same name to hold the Stan/Python/R programs. The individual examples have a numeric suffix in case the section has more than one Stan file, and I guess the numbering could start over at one in each directory. So for this example we have:

test/regression.Rmd

There is a directory with the same name as the .Rmd file, containing the Stan/R/Python programs:

test/regression/regression_1.stan
test/regression/regression_1.R
test/regression/regression_1.py
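A rough Python sketch of how a script could walk this layout is below. The function names are made up for illustration, and it assumes the companion directory is the lower-cased stem of the .Rmd file and that files follow the `name_N.ext` pattern shown above.

```python
import re
from collections import defaultdict
from pathlib import Path

# Matches e.g. regression_1.stan, regression_1.R, regression_2.py
EXAMPLE = re.compile(r"^.+_(?P<num>\d+)\.(?P<ext>stan|R|py)$")


def companion_examples(rmd_path: Path) -> dict:
    """Group the files in the section's companion directory by example number."""
    # Assumption: the directory name is the lower-cased .Rmd stem.
    example_dir = rmd_path.parent / rmd_path.stem.lower()
    examples = defaultdict(dict)
    if example_dir.is_dir():
        for path in sorted(example_dir.iterdir()):
            match = EXAMPLE.match(path.name)
            if match:
                examples[int(match["num"])][match["ext"]] = path
    return dict(examples)


def missing_interfaces(examples: dict, required=("stan", "R", "py")) -> dict:
    """Report, per example number, which of the required file types are absent."""
    gaps = {}
    for num, files in examples.items():
        absent = [ext for ext in required if ext not in files]
        if absent:
            gaps[num] = absent
    return gaps
```

A check like `missing_interfaces` would flag an example that has a Stan program but no R or Python driver yet.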

The above .Rmd is knit into

test/regression.html
test/regression.pdf

These files are pushed to the documentation repo; they are not built by Jekyll like the web site, which is built with .Rmd files.

Request for feedback:

A) Using the same Rmarkdown source for pdf and html is awkward.

  • Hyperlinks are not clearly indicated in the .pdf. I think this can be fixed.

  • People might print the pdf, which means that option 2, or something like it, is needed, since the URL of the GitHub repo needs to be explicit. I don’t know how important this is but I wanted to raise the issue.

B) Aesthetically, which style of reference to the source do people prefer in the actual documentation?

C) Is the overall approach feasible for how we write documentation? Copying/pasting source is likely to generate errors. Pulling it from disk makes the example code more likely to be current/correct, especially if we start unit testing the code.

D) It is easy to pull chunks of code out as well. If there is interest I can come up with a proposal for that too, but I’d like to do a chapter or two first to see how workable it is. Are people who write documentation interested in this?

thanks

Breck

I prefer this. Easier to have it all together.

The directory name should reflect its contents, but I don’t think it needs to have the exact chapter name.

I’d prefer one model per directory to avoid confusion.

I’d prefer code in the manual to be separate from that on disk.

Any way we do this, every time we update code on disk we’ll need to double check the code in the manual.

So it’s not really automating the double checking away, and it just seems like extra complexity that might break.
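For what it's worth, part of that double check could be scripted. The sketch below is only one possible approach, not something that exists in the repo: it assumes the manual's Stan fragments sit in code fences tagged `stan`, and it treats a fragment as up to date if it appears in the on-disk program once whitespace is normalized.

```python
import re

FENCE = "`" * 3  # built this way to avoid a literal fence inside this example

# Assumption: Stan fragments in the .Rmd are fenced blocks whose info string starts with "stan".
STAN_BLOCK = re.compile(
    rf"^{FENCE}stan[^\n]*\n(.*?)^{FENCE}[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def normalize(code: str) -> str:
    """Collapse all runs of whitespace so indentation changes do not matter."""
    return " ".join(code.split())


def stale_fragments(rmd_text: str, program_text: str) -> list:
    """Return the manual fragments that no longer appear in the program on disk."""
    program = normalize(program_text)
    return [
        fragment
        for fragment in STAN_BLOCK.findall(rmd_text)
        if normalize(fragment) not in program
    ]
```

An empty list means every fragment still matches the file; anything returned is a fragment someone needs to look at by hand.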

If it takes more than a couple hours to figure out, I don’t think it’s worth worrying about.

Hi, I love this. I agree with Ben that the best is the link to the repo folder.

Just a few other comments:

Ben writes: “I’d prefer one model per directory to avoid confusion.”

My reply: By “model,” I assume you mean “Stan program”? I think that it will usually make sense to have one Stan program per directory. But there will be cases where it could make sense to have more than one, for example if we have multiple variants of a single statistical model. So I think we should be open about this and see how it goes as these things are written.

Regarding code in the manual and code in the directory: I have experience with this in the books that I’ve written. My suggestion is as follows: The code in the directory should run directly. We will try to keep the User’s Guide consistent with the code in the directory, but I think it would be more trouble than it’s worth to try to enforce consistency. The code in the User’s Guide is code fragments; it’s not runnable code. (Even in the cases in the User’s Guide where we have full Stan programs, these are not runnable because they don’t come with data and code to call the programs.) So I think that as authors of the User’s Guide we should do our best with the code fragments, but it’s the code in the directories that will be reliable.

I think these examples should use CmdStanPy and CmdStanR, rather than PyStan and RStan. So far we only have two examples so it should be easy to update them!
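As a sketch of what a CmdStanPy-based driver might look like (not the contents of the actual regression_1.py): the code below assumes the Stan program declares data `N`, `x`, and `y`, as in the User's Guide linear regression, and the function names, true parameter values, and file names are illustrative only.

```python
import json
import random
from pathlib import Path


def simulate_regression(n: int, alpha: float, beta: float, sigma: float, seed: int = 1) -> dict:
    """Simulate y = alpha + beta * x + Normal(0, sigma) noise."""
    rng = random.Random(seed)
    x = [rng.uniform(-2.0, 2.0) for _ in range(n)]
    y = [alpha + beta * xi + rng.gauss(0.0, sigma) for xi in x]
    # Assumption: these keys match the data block of the Stan program.
    return {"N": n, "x": x, "y": y}


def write_data(data: dict, path: Path) -> Path:
    """Write the simulated data as JSON, a format CmdStan can read."""
    path.write_text(json.dumps(data))
    return path


def run_example(stan_file: Path, data_file: Path):
    """Compile the Stan program, sample, and return the posterior summary."""
    # Imported here so the data simulation works without CmdStanPy installed.
    from cmdstanpy import CmdStanModel

    model = CmdStanModel(stan_file=str(stan_file))
    fit = model.sample(data=str(data_file), seed=1)
    return fit.summary()


# Example usage (requires CmdStanPy and a CmdStan installation):
#   data_file = write_data(simulate_regression(100, 1.0, 2.0, 0.5),
#                          Path("regression_1.data.json"))
#   print(run_example(Path("regression_1.stan"), data_file))
```

Keeping the simulation separate from the call to Stan means the same data file can be reused by an R driver.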

Finally, we’ll need a How to Do It page where we have instructions for running the Python or R code in these examples. Or maybe two How to Do It pages, one in Python and one in R.

Good point.