Best Practices for Writing Multilevel Model in STAN

fausto.siegmund · October 27, 2017, 4:45pm

I am being introduced to Stan through Richard McElreath’s book, Statistical Rethinking, and am also trying to learn how to write model code directly in .stan files. The example models here https://github.com/stan-dev/example-models/wiki have been very helpful.

Is there a current ‘best practices’ approach to writing multilevel models? For example, is there a best way to write a linear or logistic model with multiple, nested group-level factors that is both (1) fast and (2) legible to someone who is used to writing JAGS or BUGS models? I am asking because I am new to these languages, and would like code I write to be efficient and readable by those who do not use Stan. I’m sure the answer is case-specific, the more so as models become complex.

bgoodri · October 27, 2017, 5:22pm

Not a universal one. In most cases, it is better to write the non-centered version of a multilevel model but in some cases, it is better to use the centered parameterization. But you can’t tell which just from looking at characteristics of the data, so you sometimes have to try both. Start with non-centered though.
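To make the distinction concrete, here is a minimal sketch of a non-centered varying-intercept linear model. The variable names (`group`, `mu_alpha`, `tau`, `z`) and all of the priors are placeholders chosen for illustration, not recommendations from this thread, and the declarations use the newer `array[N] int` syntax (older Stan versions wrote `int group[N]`).

```stan
data {
  int<lower=1> N;                        // number of observations
  int<lower=1> J;                        // number of groups
  array[N] int<lower=1, upper=J> group;  // group index of each observation
  vector[N] x;                           // a single predictor
  vector[N] y;                           // outcome
}
parameters {
  real mu_alpha;        // population-level intercept
  real beta;            // slope shared by all groups
  real<lower=0> tau;    // between-group sd of the intercepts
  real<lower=0> sigma;  // residual sd
  vector[J] z;          // standardized group offsets
}
transformed parameters {
  // Non-centered: the group intercepts are computed from z
  // instead of being sampled directly.
  vector[J] alpha = mu_alpha + tau * z;
}
model {
  // Placeholder priors (assumptions for this sketch only).
  mu_alpha ~ normal(0, 5);
  beta ~ normal(0, 5);
  tau ~ normal(0, 1);    // half-normal because tau is constrained positive
  sigma ~ normal(0, 1);  // half-normal because sigma is constrained positive
  z ~ std_normal();      // implies alpha ~ normal(mu_alpha, tau)
  y ~ normal(alpha[group] + beta * x, sigma);
}
```

The centered version of the same model drops `z` and the transformed parameters block, declares `vector[J] alpha;` in the parameters block, and writes `alpha ~ normal(mu_alpha, tau);` in the model block. Both describe the same posterior for `alpha`; they differ only in the geometry the sampler has to explore.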

Is that really what you want? My strong suspicion is that people who are used to writing JAGS or BUGS models, and are not themselves interested in writing Stan programs, are not going to want to read your basic multilevel Stan programs. In that case, you should probably be using stan_glmer in the rstanarm R package or brm in the brms R package. Either of those is usually going to be faster than the Stan programs most people can write by hand, and the likelihoods are understandable by anyone who is familiar with lme4 syntax, a group that is larger than (and almost a superset of) the people who understand BUGS or Stan syntax.

If you really want BUGS people to be reading your Stan programs, then you have to be prepared for them to object to the increased structure of the Stan language, the inability to declare discrete unknowns in the parameters block, the fact that the customary (i.e. conjugate) BUGS priors typically yield inferior performance in Stan and don’t make much sense in the first place, and the warnings that Stan generates during and after sampling that cannot be triggered with BUGS. In addition, they will often lack sufficient appreciation for effective sample sizes (perhaps divided by runtime). These are all talked about at the end of the Stan Manual, but it is a lot to take on for not much upside.

fausto.siegmund · October 28, 2017, 3:12pm

Thank you for your response! I ask about legibility for BUGS users because I want readers and reviewers who use BUGS to be able to understand (and evaluate) the code I might append to a manuscript or post online. The considerations you list are useful. I've started to explore some of them, but had not read Stan for Users of BUGS (Appendix B in the Stan Manual).

Bob_Carpenter · November 2, 2017, 10:38pm

One of the main design goals for the language was to make it easier to understand Stan programs than BUGS or JAGS programs. Your thoughts about typing and declaring data vs. parameters may vary.

JAGS has added vectorization, so that shouldn’t look so different any more.
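As a rough illustration of that point, the sketch below shows a simple linear regression with the BUGS-style observation loop left in as a comment next to the equivalent vectorized statement. The names `a`, `b`, and `sigma` are placeholders, and the priors are omitted only to keep the example short (Stan then treats them as flat over each parameter's support).

```stan
data {
  int<lower=0> N;  // number of observations
  vector[N] x;     // predictor
  vector[N] y;     // outcome
}
parameters {
  real a;               // intercept
  real b;               // slope
  real<lower=0> sigma;  // residual sd
}
model {
  // Loop form, close to how a BUGS or JAGS model is usually written:
  // for (n in 1:N) {
  //   y[n] ~ normal(a + b * x[n], sigma);
  // }

  // Equivalent vectorized form:
  y ~ normal(a + b * x, sigma);
}
```

Both forms define the same log density. The vectorized statement is generally the faster of the two in Stan, while the loop may be the more familiar one to a reader coming from BUGS.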
