All,
We are developing a MOOC (Massive Open Online Course) for Intro to Stan. It may generate revenue for us or it may be free; we don’t know yet. In any case there will always be scholarships.
Since building a MOOC is a lot of work, I want a target that specifies what the MOOC is teaching towards. I’d like that target to be a certification of basic Stan mastery, very likely a paid one. Again, there will always be scholarships.
Below is a draft Stan 1 certification specification that I threw together. It defines what it means to be a basic Stan programmer who can interface with statisticians and carry out Bayesian modeling. It is much more about working in Stan than about statistics.
Feedback appreciated. There is a Google doc with the draft at: (https://docs.google.com/document/d/1RmeoieDUcq0YFIUW_PNDO_5aixXQgfwG0ZLj3Paj64Y/edit?usp=sharing). Edit/comment as you see fit. It is pasted below to save you a click; you can also comment in this thread and I’ll try to integrate.
This document is a publicly shared specification for what a professional Stan programmer should know and be able to do. This is meant to be the lowest level certification.
At this point we are looking for feedback on the overall items to include.
Level 1 Stan Certification Objectives:
The mechanics of Stan programs and how they operate
 Stan installation
 Install CmdStan, the Python interface, the R interface, relevant IDEs
 Know of cloud-based resources
 Know how to handle the 3 most common install problems
 Data ingest
 Fluent in the Stan data{} block with input from R, Python, and JSON, both through the interfaces and through CmdStan. This covers reading data and saving outputs.
 Can debug the 5 most common errors on data ingest.
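
As one illustration of the JSON route, here is a minimal sketch of writing a data file from Python and checking it before it is handed to a Stan program. The variable names `N` and `y`, the file name, and the matching data block shown in the comment are all hypothetical; they are not part of the draft spec.

```python
import json
import os
import tempfile

# Hypothetical data for a Stan program whose data block declares:
#   int<lower=0> N;
#   array[N] int<lower=0, upper=1> y;
# The names N and y are illustrative assumptions only.
data = {"N": 5, "y": [1, 0, 1, 1, 0]}

# A declared size that disagrees with the array length is a typical
# ingest error, so check it before writing the file.
if data["N"] != len(data["y"]):
    raise ValueError("N does not match the length of y")

# Write the JSON file to a temporary directory (hypothetical file name).
path = os.path.join(tempfile.mkdtemp(), "coin.data.json")
with open(path, "w") as f:
    json.dump(data, f)

# Read the file back to confirm it round-trips unchanged.
with open(path) as f:
    loaded = json.load(f)
```

Catching size mismatches on the Python side tends to give a clearer message than waiting for the error at data-read time.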
 Data manipulation
 Demonstrate data munging skills in the transformed data{} block.
 Can debug the 3 most common errors on data transformation.
 Parameter Mastery
 Set up multiple parameters for estimation
 Understand the role of <lower=, upper=> constraints
 Work with vectorized parameters.
 Know execution order and when it happens for all Stan blocks
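
To make the role of lower and upper bounds more concrete: for a scalar with both bounds, Stan samples on an unconstrained scale and maps back with a scaled and translated log-odds transform. The Python sketch below only illustrates that mapping; the function names are made up for this example and it omits the Jacobian adjustment that Stan also applies.

```python
import math

def inv_logit(y):
    """Numerically stable inverse logit."""
    if y >= 0:
        return 1.0 / (1.0 + math.exp(-y))
    e = math.exp(y)
    return e / (1.0 + e)

def to_unconstrained(x, lower, upper):
    """Map x in (lower, upper) to the whole real line via log-odds."""
    if not lower < x < upper:
        raise ValueError("x must lie strictly between lower and upper")
    u = (x - lower) / (upper - lower)
    return math.log(u / (1.0 - u))

def to_constrained(y, lower, upper):
    """Map any real y back into the interval [lower, upper]."""
    return lower + (upper - lower) * inv_logit(y)

# Hypothetical parameter declared with lower=2 and upper=5.
round_trip = to_constrained(to_unconstrained(3.5, 2.0, 5.0), 2.0, 5.0)
```

Because every real value maps back inside the bounds, the sampler can move freely on the unconstrained scale without ever proposing an illegal parameter value.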
 User defined functions
 Reimplement the normal/exponential/uniform distributions as user defined functions
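
The exercise above amounts to writing out each log density by hand. In Stan these would live in the functions block with an `_lpdf` suffix; the Python sketch below shows only the math being reimplemented, with function names chosen for this example, and checks the normal case against the standard library.

```python
import math
from statistics import NormalDist

def normal_lpdf(y, mu, sigma):
    """Log density of Normal(mu, sigma) at y, written out term by term."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (y - mu) / sigma
    return -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * z * z

def exponential_lpdf(y, rate):
    """Log density of Exponential(rate) at y; -inf outside the support."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    if y < 0:
        return -math.inf
    return math.log(rate) - rate * y

def uniform_lpdf(y, lo, hi):
    """Log density of Uniform(lo, hi) at y; -inf outside the support."""
    if not lo < hi:
        raise ValueError("lo must be less than hi")
    if y < lo or y > hi:
        return -math.inf
    return -math.log(hi - lo)

# Reference value from the standard library for comparison.
reference = math.log(NormalDist(mu=1.0, sigma=2.0).pdf(0.3))
```

Working on the log scale from the start mirrors how Stan accumulates the target density.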
 Fluency with various scales used in Stan (log, decimal)
 Familiar with over/underflow.
 Ability to correct over/underflow in common situations.
 Convert to log scale
 Change priors or upper/lower constraints
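
A small sketch of why converting to the log scale helps with underflow. The numbers are made up for illustration. Stan has a built-in `log_sum_exp`; the Python version here is a hand-written stand-in that shows the idea.

```python
import math

def log_sum_exp(log_terms):
    """Compute log(sum(exp(t))) without leaving the log scale."""
    m = max(log_terms)
    if m == -math.inf:
        return -math.inf
    # Subtracting the maximum keeps every exponent at or below zero.
    return m + math.log(sum(math.exp(t - m) for t in log_terms))

# Two hypothetical log likelihood contributions, far too small for exp().
log_liks = [-1000.0, -1001.0]

# Naive approach: exponentiate, then sum. Each term underflows to 0.0,
# so taking the log of this total would fail.
naive_total = sum(math.exp(t) for t in log_liks)

# Stable approach: stay on the log scale throughout.
stable = log_sum_exp(log_liks)
```

The naive total is exactly zero, while the stable result stays finite and close to the larger of the two log terms.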
 Running CmdStan and its invocation options. Knowledge of all algorithms and their appropriate use cases.
 Optimize
 Sample
 method=sample algorithm=fixed_param num_samples=1
 no warmup
 Knowledge of how to use generated quantities {}
 Use in prediction
 Posterior checks
 Debugging Stan
 Print statements
Basic Bayesian Modeling
For all modeling tasks the test taker should show when appropriate:

Prior predictive checks

Motivate choice of prior

Motivate choice of likelihood

Demonstrate knowledge of runtime diagnostics

Demonstrate knowledge of posterior diagnostics

Demonstrate posterior checks

?? more bayesian workflow ??

Show predictive interpretation

Discrete data
 Code coin flip/baseball ab test with
 Total pooling
 No pooling
 Partial pooling
 Code multivariate case
 Code naive Bayes classifier
 Code a regression model
 Hierarchical models
 Code coin flip/baseball ab test with
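
To give a feel for how the three pooling options differ on binary-trial data, here is a rough Python sketch with made-up counts. The partial pooling line is only a stand-in: it uses a fixed Beta(a, b) prior with assumed values, whereas a hierarchical Stan model would estimate the population-level parameters from the data.

```python
# Hypothetical (successes, trials) counts for three units.
data = [(3, 10), (7, 10), (45, 100)]

# Total pooling: one shared success probability for everyone.
total_successes = sum(y for y, _ in data)
total_trials = sum(n for _, n in data)
total_pooling = total_successes / total_trials

# No pooling: each unit gets its own independent estimate.
no_pooling = [y / n for y, n in data]

# Partial pooling stand-in: posterior mean under a fixed Beta(a, b) prior.
# The values of a and b are assumptions for illustration only.
a, b = 5.0, 5.0
prior_mean = a / (a + b)
partial_pooling = [(y + a) / (n + a + b) for y, n in data]
```

Each partially pooled estimate sits between the unit's own rate and the prior mean, and units with fewer trials are pulled further toward the prior.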

Continuous data
 Normal data set e.g. height
 ……

Knowledge of core distributions

Knowledge of how to use ODE solver

Visualising the fit
Communication
Bayesian modeling requires specialized techniques to communicate with nonpractitioners. From David Spiegelhalter’s suggestions:
Motivating Priors
Conveying results to nonBayesians
Common Stan Gotchas
Can’t send an array of length 1 to Stan from RStan