Robust Stan Workflows

For those new to Stan (or perhaps even not!) I’ve put together some case studies on the robust use of Stan with a focus on checking and interpreting the diagnostics that indicate when Hamiltonian Monte Carlo is failing to achieve an accurate fit. I highly encourage everyone to adopt a similar workflow to ensure the robustness of your analyses.


Nice! One small suggestion: It’d be nice if in the “Checking the E-BFMI” section you linked to the theoretical work it is based on…

Do you actually think anyone will read it? People have told me that that one is particularly off-putting, despite the simplicity once you get past the formal derivation. Actually there’s a more introductory description in the conceptual review, although again not many people were motivated to go that far.

I like the guidance that you have in the divergent transition section for locating and fixing the problem. Do you have any tips you could add to the E-BFMI section?

I think that they’ll want to read it, but they’ll need to know that it exists
and which one it is.

Unfortunately not yet. Theoretically it depends on the density of states, which quantifies how the kinetic and potential energy interact, but the density of states is nigh impossible to calculate outside of simple cases and so that interaction is even harder to understand. Empirically, low E-BFMI can manifest in cases where the posterior is extremely heavy-tailed (such as centered parameterizations of hierarchical models, at least in the unconstrained space) but it’s not clear how that would be remedied in general. Lighter-tailed kinetic energies might help, but that would have other adverse effects. In the end it’s easier to switch to the non-centered parameterization, but that fix is only applicable to the hierarchical case.
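For anyone unfamiliar with the non-centered parameterization mentioned above, here is a minimal sketch of the idea in Python rather than Stan. The function name `to_centered` and the numbers are purely illustrative and are not taken from the case study. The sampler would explore a standardized variable, and the hierarchical parameter is then recovered deterministically as a shift and scale of it.

```python
import numpy as np

def to_centered(mu, tau, theta_tilde):
    """Map standardized (non-centered) values back to the original scale.

    Centered form:     theta ~ normal(mu, tau)
    Non-centered form: theta_tilde ~ normal(0, 1), theta = mu + tau * theta_tilde

    Hypothetical helper for illustration only.
    """
    theta_tilde = np.asarray(theta_tilde, dtype=float)
    return mu + tau * theta_tilde

# Illustrative values: population mean 1, population scale 2.
theta = to_centered(1.0, 2.0, [0.0, 1.0, -1.0])
```

Both forms imply the same distribution for theta. The difference is only in which variables the sampler moves through, which is why the switch can change the geometry so much when tau is poorly constrained.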

As I noted in the case study, we’ll have to wait until we get more empirical data to help guide the theory a bit. Hopefully that will come from more users keeping an eye on this diagnostic.
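If it helps anyone keep an eye on the diagnostic outside of R, here is a rough Python sketch of the usual empirical estimate: the sum of squared changes in energy between successive iterations divided by the sum of squared deviations of the energy from its mean. The function name `e_bfmi` is hypothetical, the normalization may differ slightly from what stan_utility.R does, and the 0.2 cutoff in the comment is the commonly quoted rule of thumb rather than something derived here. The input is assumed to be the `energy__` sampler output for a single chain after warmup.

```python
import numpy as np

def e_bfmi(energy):
    """Estimate E-BFMI for one chain from its post-warmup energies.

    Hypothetical helper: ratio of the summed squared successive
    differences to the summed squared deviations from the mean.
    """
    energy = np.asarray(energy, dtype=float)
    if energy.ndim != 1 or energy.size < 2:
        raise ValueError("need a 1-D array with at least two energies")
    numer = np.sum(np.diff(energy) ** 2)
    denom = np.sum((energy - energy.mean()) ** 2)
    if denom == 0.0:
        raise ValueError("energies are constant; E-BFMI is undefined")
    return numer / denom

# Illustrative energies only, not real sampler output.
value = e_bfmi([0.0, 1.0, 2.0, 3.0])
# Values below roughly 0.2 are usually treated as a warning sign (assumed cutoff).
```

Computing this per chain rather than pooling across chains seems safer, since one poorly behaved chain can otherwise be masked by the rest.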

I reference the conceptual paper (which discusses E-BFMI and references the formal paper) in the introduction of the case study for further information about HMC. Would you suggest making this more explicit or reinforcing the reference by repeating it in the diagnostic section?

Maybe reference the paper section?

Sorry to be dense – reference the sections in the introduction where the current reference is, or in with the sections where each diagnostic is utilized? Happy to do either, just want to get a feel for what is preferred. Thanks!

In the section that mentions the diagnostic is based on recent theory,
reference the section of the conceptual introduction that conceptually
introduces it.

Got it – case studies updated. Thanks!

I cannot find the stan_utility.R file.

stan_utility.R, but it might be easier to download the zip file from here.

Links to all materials (for all case studies, not just these two) are included in the case study descriptions on the main case study page.