Personal Stan Guide

I’ve been a Stan user for a few years now, and one thing I’ve always struggled with is how to learn Stan. There is the documentation and the forums, and these are pretty great. But the documentation can be daunting and is best suited to those who already know what they are looking for. So what I learn from these sources tends to be scattershot, and it can be hard to get a clear, systematic understanding of the software. I’ve always wished there were an Intro to Stan book that could get somebody pretty far and even serve as a bridge to the more technical Stan written resources. The best I’ve found is Ben Lambert’s Bayesian textbook. So first I ask: are there any resources I am unaware of? Next, I thought I might build just such a guide for myself. Here is a rough outline:

  1. Introduction to Stan
     - Overview of Stan and its applications
     - Installation and setup
     - Integrating Stan with R: rstan and other interfaces
     - Understanding Stan syntax and model structure

  2. Basics of Bayesian Statistics
     - Brief review of Bayesian concepts
     - Priors, likelihoods, and posteriors
     - Introduction to inference and sampling methods

  3. Stan Model Structure
     - Overview of model components
     - functions {} block: Defining custom functions for model
     - data {} block: Defining data inputs and types
     - transformed data {} block: Data preprocessing and transformation
     - parameters {} block: Defining parameters and their types
     - transformed parameters {} block: Calculations involving parameters
     - model {} block: Specifying the model and likelihood
     - generated quantities {} block: Calculating derived quantities

  4. Data Types and Parameter Types
     - Overview of data types: integers, reals, vectors, arrays, etc.
     - Understanding parameter types: real, vector, simplex, etc.
     - Choosing appropriate data and parameter types for models
     - Declaring bounds on data and parameters

  5. Using Functions in Stan
     - General Overview
     - Broad Categories of Functions

  6. Probability Distributions and Related Functions
     - Common probability distributions in Stan
     - Defining distributions: syntax and parameters
     - Functions for probability density, cumulative distribution, and random variates
     - Custom distributions and functions in Stan

  7. Working with Stan Models
     - Preparing data and running models
     - Analyzing and interpreting results
     - Debugging models and addressing common issues
     - Best practices for efficient modeling and sampling

  8. Algorithms in Stan
     - Markov Chain Monte Carlo (MCMC) and No-U-Turn Sampler (NUTS)
       - Basic concepts and use of MCMC/NUTS
       - Advantages and limitations of NUTS
     - Variational Inference (VI)
       - Overview of variational inference
       - Practical use of VI in Stan models
       - Comparing VI with MCMC
     - Optimization
       - Overview of optimization algorithms (MLE and MAP)
       - When to use optimization in Stan
       - Best practices for using optimization in Stan

  9. Common Modeling Scenarios
     - Linear and logistic regression models
     - Hierarchical models and multilevel modeling
     - Time series and spatial models

  10. Troubleshooting Stan Models
      - Common errors in Stan models and how to address them
      - Diagnosing and resolving issues with sampling (e.g., divergences)
      - Debugging variational inference and optimization
      - Tips for improving model convergence and efficiency

  11. Advanced Topics and Model Building
      - Custom functions and distributions
      - Integrating Stan models with other software and languages
      - Advanced modeling strategies and techniques

  12. Case Studies and Applications
      - Real-world examples of Stan models across different fields
      - Step-by-step guides for advanced model building
      - Practical applications of Stan in research and industry

  13. Resources for Further Learning
      - Recommended readings and resources
      - Online tutorials and courses
      - Stan forums and communities for support
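To make the "Stan Model Structure" part of the outline concrete, here is a minimal sketch of a Stan program that touches every block, in the order Stan requires. The Bernoulli model, the `half` function, and the Python helper `blocks_in` are hypothetical illustrations of mine, not taken from the Stan docs; the program is held in a Python string so the block order can be checked without a Stan install.

```python
import re

# Hypothetical example model (an assumption for illustration): coin flips
# with a Beta(1, 1) prior, written so that every program block appears once.
STAN_PROGRAM = """
functions {
  real half(real x) {
    return x / 2;
  }
}
data {
  int<lower=0> N;
  array[N] int<lower=0, upper=1> y;
}
transformed data {
  int<lower=0> successes = sum(y);
}
parameters {
  real<lower=0, upper=1> theta;
}
transformed parameters {
  real log_odds = logit(theta);
}
model {
  theta ~ beta(1, 1);
  successes ~ binomial(N, theta);
}
generated quantities {
  int y_rep = bernoulli_rng(theta);
}
"""

# Every block is optional, but the ones present must follow this order.
BLOCK_ORDER = [
    "functions",
    "data",
    "transformed data",
    "parameters",
    "transformed parameters",
    "model",
    "generated quantities",
]


def blocks_in(program: str) -> list[str]:
    """Return the block names that open at the start of a line, in order."""
    # Try longer names first so "transformed data" is not read as "data".
    names = sorted(BLOCK_ORDER, key=len, reverse=True)
    pattern = r"^(" + "|".join(names) + r")\s*\{"
    return re.findall(pattern, program, flags=re.MULTILINE)


if __name__ == "__main__":
    for name in blocks_in(STAN_PROGRAM):
        print(name)
```

Compiling and sampling from the program would go through an interface such as rstan or CmdStanPy, which this sketch deliberately leaves out.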

Does anybody have any feedback about this outline? What items are missing? Out of order? Unnecessary?


I like the outline and I think it will be very useful to those starting in Stan. As much as I love the User’s Guide, it’s not easy to get started with, and I spent years in BUGS before transitioning to Stan.

Hi, @Nicklaus_Millican:

As your outline hints, there are really two things going on here: teaching Bayesian statistics and teaching Stan (and even teaching programming). There are two “official” resources that might help here:

The Reference Manual is written for programmers and is rather dry and also not quite as precise as a formal specification. The User’s Guide is intended for people who already know Stan and already know Bayesian stats and want to know how to translate pieces of a model into Stan. It should have probably been called a “Programmer’s Guide”. As you point out, we don’t really have an intro and didn’t want to try to add too much tutorial material to either of the above docs. We really should put a short intro to the language into the User’s Guide. If you wanted to write that and submit it to Stan on GitHub (stan-dev/docs, written in Quarto), we could add it to the docs. We’re pretty picky in reviewing, but welcome contributions and will help with revisions.

CRC/Chapman & Hall keep asking us to write The Stan Book along the same lines as their The BUGS Book. So if you really do fill out this overview, you have a clear route to publication! The closest I’ve come is this introduction:

I plan to go back and fill in more material. And maybe have someone translate to R :-).

@andrewgelman, @avehtari, and a host of others are working on a Bayesian workflow book that is essentially unfolding and trying to make sense of the paper:

There are a lot of online resources including videos and replicable case studies on the Stan site (both just listed under case studies and in the StanCon proceedings, which also have videos linked in most cases).

There are also several books that are more tutorial oriented. We have a list here:

Richard McElreath’s book is a great place to start if you’re new to Bayesian statistics and MCMC, and I believe there are matching videos online.

That list of books is woefully out of date, though I occasionally update it. There’s also an intro book by Bruno Nicenboim, Daniel Schad, and Shravan Vasishth. I reviewed a large chunk for the publisher, CRC, and it’s both really solid and tutorial oriented (a rare combination):