~ and target += notations in Stan Modeling Language document

Someone asked what the difference is between ~ and target +=.

I checked the Modeling Language manual, and I think people have trouble finding the relevant Sections 27.2 and 27.3, which start on page 293 (at first I couldn’t believe that I should look beyond the first 100 pages). It would be useful to have these before any model examples. Currently, model examples with ~ start appearing in Chapter 5 as part of the indexing examples, and Chapter 6 is Regression Models. target += first appears in Section 7.6 (HMMs) on page 102, but there is no explanation of what target += means.

Would it be possible to move Chapter 27, “Statements,” up to become Chapter 4? It would also be useful to have Chapter 29, “Program Blocks,” before the model examples.

Hi Aki,

Yes, if you have suggestions like that, put it under our issues:
https://github.com/stan-dev/stan/issues/2051

We’ll always have a running manual issue.

I assumed there would be discussion first, and that it would be added to the issue once there is agreement.

What are we realistically planning to achieve by introducing the += syntax and not pushing it on users, at least via the documentation and examples?

I thought the point was to emphasize that the statement is not a draw, but the incremental addition of this one data point’s contribution to the overall sum of the log likelihood under the distributional assumption?
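To make the "incremental addition" reading concrete, here is a rough sketch in plain Python rather than Stan. It mimics how `target += normal_lpdf(y | mu, sigma);` adds the full log density for each data point, while `y ~ normal(mu, sigma);` adds the same quantity but may drop additive terms that do not depend on the parameters. The function names and data below are made up for illustration, and sigma is treated as a parameter so that only the 2*pi constant is dropped; this is not how Stan is implemented internally.

```python
import math

def normal_lpdf(y, mu, sigma):
    # Full normal log density, including the constant -0.5 * log(2 * pi).
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - 0.5 * ((y - mu) / sigma) ** 2

def normal_lpdf_no_const(y, mu, sigma):
    # Same log density with the parameter-free constant dropped
    # (illustrative stand-in for what the ~ form accumulates).
    return -math.log(sigma) - 0.5 * ((y - mu) / sigma) ** 2

def accumulate(lpdf, ys, mu, sigma):
    # Neither form draws anything: each data point just adds its
    # log density contribution to a running total.
    target = 0.0
    for y in ys:
        target += lpdf(y, mu, sigma)
    return target

ys = [1.2, -0.3, 0.8]  # made-up data

# Gap between the two accumulators at two different parameter values.
gap_a = accumulate(normal_lpdf, ys, 0.0, 1.0) - accumulate(normal_lpdf_no_const, ys, 0.0, 1.0)
gap_b = accumulate(normal_lpdf, ys, 2.0, 1.5) - accumulate(normal_lpdf_no_const, ys, 2.0, 1.5)

# The gap should be the same constant regardless of mu and sigma.
expected_gap = -0.5 * len(ys) * math.log(2 * math.pi)
```

Because the two totals differ only by a constant that does not involve mu or sigma, they describe the same posterior up to normalization, which is why either form works for sampling.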

The manual is organized into a language reference and a user’s
guide. The statements chapter is in the former, so I don’t want
to move it out of order from the rest of the reference.

The eventual goal is to separate the reference and the user’s guide
and replace the latter with case studies and the Stan book. But that’s
a long way off. So far, I’ve just been lazily adding things to
the manual without worrying too much about narrative order.

But having something earlier that explains how Stan works would
be good, including a description of target += and what the sampling
statement does and what local variables are and how parameters differ
from data and everything else that winds up being asked about frequently
on the Stan lists.

We do take pull requests!

- Bob

Since this would be a big change, I’m showing my proposal for reordering the Stan Modeling Language document below before making an issue and a pull request. If there are no objections, I’ll make a pull request.

In the proposal I’ve kept the current chapter numbers but arranged the chapters in a better order. Chapters marked with + would be merged into the chapter above them, as there is some unnecessary repetition. It would also be possible to move Part II to the end, but I think this order would work as well.

I Introduction
1. Overview
2. Model Building as Software Development

II Stan Modeling Language
27. Statements
25. Data Types and Variable Declarations
+ 3. Data Types
+ 4. Containers: Arrays, Vectors, and Matrices
+ 5. Multiple Indexing and Range Indexing
26. Expressions
29. Program Blocks
28. User-Defined Functions
24. Execution of a Stan Program

III Example Models and Programming Techniques
6. Regression Models
7. Time-Series Models
8. Missing Data & Partially Known Parameters
9. Truncated or Censored Data
10. Finite Mixtures
11. Measurement Error and Meta-Analysis
12. Latent Discrete Parameters
13. Sparse and Ragged Data Structures
14. Clustering Models
15. Gaussian Processes
16. Directions, Rotations, and Hyperspheres
17. Reparameterization & Change of Variables
18. Custom Probability Functions
19. User-Defined Functions
20. Solving Differential Equations
??. New section on LOO and k-fold-CV
58. Transformations of Constrained Variables
21. Problematic Posteriors
22. Optimizing Stan Code for Efficiency
+ 4.3 Efficiency Considerations

VIII Algorithms & Implementations
56. Bayesian Data Analysis
57. Markov Chain Monte Carlo Sampling
60. Hamiltonian Monte Carlo Sampling
55. Point Estimation and Optimization Algorithms
59. Variational Inference (+ 62. Variational Inference)
63. Diagnostic Mode

IV Built-In Functions
31. Void Functions
32. Integer-Valued Basic Functions
33. Real-Valued Basic Functions
34. Array Operations
35. Matrix Operations
36. Sparse Matrix Operations
37. Mixed Operations
38. Ordinary Differential Equation Solvers

V Discrete Distributions
39. Conventions for Probability Functions
40. Binary Distributions
41. Bounded Discrete Distributions
42. Unbounded Discrete Distributions
43. Multivariate Discrete Distributions

VI Continuous Distributions
44. Unbounded Continuous Distributions
45. Positive Continuous Distributions
46. Non-negative Continuous Distributions
47. Positive Lower-Bounded Probabilities
48. Continuous Distributions on [0, 1]
49. Circular Distributions
50. Bounded Continuous Probabilities
51. Distributions over Unbounded Vectors
52. Simplex Distributions
53. Correlation Matrix Distributions
54. Covariance Matrix Distributions

IX Software Process
64. Software Development Lifecycle ???
23. Reproducibility

X Contributed Modules
65. Contributed Modules

Appendices
A. Licensing
B. Stan for Users of BUGS
30. Modeling Language Syntax BNF
C. Stan Program Style Guide
D. Warning and Error Messages
E. Deprecated Features
F. Mathematical Functions
Bibliography

@avehtari Looks good to me!

Wow, this took much more time than I expected. While reordering, I noticed some repetition, and reducing it required a bit more thinking. The final order of the chapters is a bit different from the proposal. The pull request is at https://github.com/stan-dev/stan/pull/2102

Aki

Thanks for tackling this. Part of the problem is that
the current manual is too large and not modularized well
enough.

That’s why I want to break it into pieces as outlined
in this issue:

https://github.com/stan-dev/stan/issues/2103

But let’s get this one merged for 2.13 first.

- Bob