As a new user of Stan, I am trying to understand how the order of statements in the model block affects the model being fitted.
(1) Do you know of a place (site) where I can see examples of correct / incorrect statement orderings, as well as a discussion of why the incorrect order is wrong? I’d like to understand how the statements in the model block translate to C++ code (e.g.) in order to carry out the MCMC, so I’m not simply guessing at a correct ordering of statements.
(2) E.g., in the example in the Stan reference manual for “fitting a Gaussian process”, my guess was that the priors for eta_sq, rho_sq, and sigma_sq would come before the definition of Sigma, but the example shows the defn of Sigma coming first. Why is this? Or does the order even matter in this example?
Statements are executed sequentially (as in R, Python, and almost any language except BUGS).
Using a variable before it is defined is a common mistake that will usually result in NaN (not a number) being propagated through all the calculations.
The model block builds up an expression for the kernel of the log posterior. Since addition is commutative, it does not matter what order you add the log-density terms in.
1 Like
+1 to all of what @bgoodri said. Just adding a little more clarification.
Stan is an imperative programming language. BUGS and JAGS are declarative.
In theory, this is true. In practice, this is usually true. The only time this doesn’t hold is when the floating point arithmetic for the particular model isn’t commutative (see stackoverflow discussion for more details).
On a given platform, the compiled Stan program will run identically with the same seed. If you change the Stan program’s order of adding the log density terms, it should run identically, but if there’s any difference in floating point (down to even 1e-12
absolute), then the draws will be different given the same seed.
1 Like
Re 2,
eta_sq ~ cauchy(0, 5);
Translates to:
target += cauchy_lpdf(eta_sq | 0, 5)
Basically ~
is not an assignment (eta_sq is not being assigned a value by the sampling statement, which is why it’s okay this happens after Sigma is built). Check section 4.3 for details on the ~
vs target +=
syntax (target is a special value that holds the accumulated log probability of the model).
The reason the code works like this is eta_sq/rho_sq/sigma_sq already had values before that block started executing. The values of the parameters define the state of the MCMC trajectory – the model block is just used to evaluate the log probability [and gradient of the log probability] at any particular state.
That make sense?
Chapter 5 (Program blocks) has more details on how all the blocks fit together.
Non-commutativity of floating point is unlikely to be a problem in any of our models. It can lead to different behavior as Daniel says (even switching OS or compiler can lead to different behavior because floating point is a moving target within some precision constraints).
The key thing to understand is that the data is all defined when you hit the transformed data block, then it’s just a regular program. Similarly, parameters are all defined when you hit the transformed parameter block (and everything from previous blocks is defined once you get to the model block). Then because the ~
statements just add to the log density, it doesn’t matter what order they come in. But otherwise, if you’re defining local variables, order matters.