In the sections "Coefficient prior" and "Hyperpriors", I believe the parameter mu should not be indexed. Beta[j] should follow a multivariate normal distribution with location vector mu (not mu[j]).
Ah, I think the issue is that in the LaTeX rendering, u looks a lot like \mu, and to make things worse, the section later includes a version of the model code that defines a variable mu. Plus the final model re-structures u so that it can be directly (without index) multiplied by gamma for the computation of beta.
Technically the docs are correct, but the variable naming is poor and makes misreading very likely.
Mike, thanks for replying on such short notice. The final model is fine, but I believe the first model, with \mu_j as a location parameter, is not formulated well (although it is technically correct). With \mu_j as a location vector with prior \mu_j ~ \mathcal{N}(0,5), there is no pooling at all in the estimation of \beta_j (i.e., each \beta_j can be estimated independently of the other \beta_j's).
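To spell out the independence claim, here is a short derivation sketch. It assumes \Sigma is held fixed, that \mathcal{N}(0,5) uses Stan's standard-deviation parameterization (so each component of \mu_j has variance 25), and that the \mu_j are a priori independent across j. If \Sigma is itself estimated under a shared hyperprior, the \beta_j would still be coupled through \Sigma, so the statement below concerns the location only.

```latex
\begin{align*}
\beta_j \mid \mu_j &\sim \mathrm{MultiNormal}(\mu_j, \Sigma), \\
\mu_j &\sim \mathrm{MultiNormal}(0, 25\, I)
  \quad \text{independently for each } j, \\
\Rightarrow \quad \beta_j &\sim \mathrm{MultiNormal}(0, \Sigma + 25\, I)
  \quad \text{independently for each } j.
\end{align*}
```

Under these assumptions the marginal prior on each \beta_j contains nothing shared with the other groups, so its posterior depends only on that group's own data.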
In my opinion (which could be wrong), a more natural way of arriving at the final model is to first use \mu (without index) as the location parameter, so that we have complete pooling of the betas, and then bring in the final (Gelman and Hill) model with location parameter u_j\gamma. That way, the first part of the document would assume that the betas shrink toward the common location \mu, and the second part (the final model) that the \beta_j's shrink toward the conditional mean u_j\gamma, given the covariates u_j.
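Written out, the proposed two-step presentation might look like the sketch below. The hyperprior on the shared \mu is an assumption carried over from the existing \mathcal{N}(0,5) prior for illustration; the thread does not specify it, and the prior on \gamma is left out for the same reason.

```latex
% Step 1 (proposed): one shared location, no covariates
\begin{align*}
\beta_j &\sim \mathrm{MultiNormal}(\mu, \Sigma), \qquad j = 1, \dots, J, \\
\mu_k &\sim \mathcal{N}(0, 5) \quad \text{(assumed hyperprior on each component of } \mu\text{)}.
\end{align*}

% Step 2 (final model): location depends on group-level covariates u_j
\begin{align*}
\beta_j &\sim \mathrm{MultiNormal}(u_j \gamma, \Sigma), \qquad j = 1, \dots, J.
\end{align*}
```

In both steps every \beta_j shares the same unknown location parameters (\mu in the first, \gamma in the second), which is what produces shrinkage across groups.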
So the overall added value for the Stan docs would be consistency in how the model is presented, i.e., pooling of the betas would be assumed throughout (without covariates in the first part, and with covariates in the second). Currently, the document assumes no pooling (independent \mu_j) and then complete conditional pooling (u_j\gamma).
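The contrast can also be checked numerically. The following is a minimal sketch in a deliberately simplified setting, not the model from the docs: scalar \beta_j, known standard deviations, and, for the shared-location case, the grand mean of the group averages plugged in as a stand-in for \mu. All function names and default values here are hypothetical and chosen only for illustration.

```python
from math import sqrt


def posterior_mean(ybar, n, sigma, prior_mean, prior_sd):
    """Conjugate normal posterior mean of one coefficient (known variances)."""
    data_prec = n / sigma**2        # precision of the group average
    prior_prec = 1.0 / prior_sd**2  # precision of the prior on beta_j
    return (data_prec * ybar + prior_prec * prior_mean) / (data_prec + prior_prec)


def estimate_independent(ybars, n=5, sigma=2.0, tau=1.0, hyper_sd=5.0):
    """Independent mu_j ~ N(0, hyper_sd): marginally beta_j ~ N(0, sqrt(tau^2 + hyper_sd^2)).

    Each estimate uses only that group's own average.
    """
    marginal_sd = sqrt(tau**2 + hyper_sd**2)
    return [posterior_mean(y, n, sigma, 0.0, marginal_sd) for y in ybars]


def estimate_shared(ybars, n=5, sigma=2.0, tau=1.0):
    """Shared mu, approximated by the grand mean (a plug-in shortcut, not full Bayes)."""
    grand_mean = sum(ybars) / len(ybars)
    return [posterior_mean(y, n, sigma, grand_mean, tau) for y in ybars]


if __name__ == "__main__":
    ybars = [-2.0, 0.5, 3.0]  # hypothetical group averages
    print("independent:", estimate_independent(ybars))
    print("shared:     ", estimate_shared(ybars))
```

With the shared location, changing one group's average moves the estimates for the other groups; with independent \mu_j it does not, which is the inconsistency described above.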