Some time ago, we had a discussion on updating the text of warnings for divergent transitions, treedepth, etc. (https://discourse.mc-stan.org/t/text-for-warning-message/). This led me, @andrewgelman, @jonah and @avehtari to attempt to update the document that currently sits at https://mc-stan.org/misc/warnings.html (the link currently leads to the old version!). The idea is that this document could then be linked from warning messages in all interfaces and provide a basic overview of what the warnings mean and what one can do about them.
We’d be very happy to get feedback from the broader community before making this part of the official docs.
A potential problem is that, due to our background, a lot of the linked resources are R-centered. If you know of good resources using Python, please link them where relevant!
Beyond hopefully helping some users to figure out the problem with their model on their own, the aim is that a) it will be easier for people to find additional resources and b) even if the user is unable to resolve the problem on their own, it will be easier to help them (e.g. here on Discourse), because they’ll provide more relevant information - that’s a big reason why “Simplify your model” is the first hint provided.
As a user in constant need of help, I am hungry for more advice on how to make it easier to help me. The guide to warnings might not be the primary place to find such advice, but since there’s already a section at the end with some best practices, I wonder if this point could be expanded or maybe include links to more detailed advice on getting help.
Specifically, are there existing examples or other guidance on what it really means to “start simple” and slowly add complexity? I am thinking of a worked-through example of a moderately complicated model that, when implemented in a straightforward/obvious/naïve way, leads to some warnings, together with a step-by-step sequence of simpler models that reveals where the warnings start showing up and how that points to a solution.
You might not be used to seeing so many warnings from other software you use, but that does not mean that Stan has more problems than that other software.
Very true.
As the warning message says, you should call pairs() on the resulting object
This is not really practical if you have, for example, more than 100 parameters.
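One way to keep the pairs plot manageable with that many parameters is to plot only a handful of them. Here is a minimal sketch in Python, assuming you have already extracted the draws and the per-iteration divergence flags from whatever interface you use. The function name, the input layout and the ranking heuristic (how far the divergent draws sit from the rest, in units of posterior standard deviation) are all my own assumptions for illustration, not something the guide or any Stan interface prescribes:

```python
from statistics import mean, pstdev


def rank_parameters_by_divergence_shift(draws, divergent, top_k=5):
    """Return up to top_k parameter names, most 'suspicious' first.

    draws: dict mapping parameter name -> list of posterior draws
    divergent: list of booleans, one per draw, True where the sampler
        flagged that iteration as divergent
    """
    scores = {}
    for name, values in draws.items():
        div = [v for v, d in zip(values, divergent) if d]
        ok = [v for v, d in zip(values, divergent) if not d]
        spread = pstdev(values)
        # Skip parameters where the comparison is undefined.
        if not div or not ok or spread == 0:
            continue
        # Standardized distance between divergent and non-divergent draws.
        scores[name] = abs(mean(div) - mean(ok)) / spread
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_k]


# Toy data (made up): divergences cluster at small values of tau.
example_draws = {
    "tau": [0.1, 0.2, 2.0, 3.0, 2.5, 0.15],
    "mu": [1.0, 1.2, 0.9, 1.1, 1.0, 1.05],
}
example_divergent = [True, True, False, False, False, True]
shortlist = rank_parameters_by_divergence_shift(example_draws, example_divergent)
```

The resulting shortlist can then be passed to the plotting function of your interface (in RStan, via the `pars` argument of `pairs()`). This is only a heuristic for choosing where to look first; a parameter that scores low can still be involved in the problem.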
Red points indicate divergent transitions.
One point is not a transition, so what do these points really indicate? If I remember correctly someone said here on the forums that it is a point that is sampled from a trajectory which at some point diverged. I.e. the red point is not the point where the trajectory diverged, and it could have actually been anywhere. So this could be clarified.
In our experience, divergent transitions that occur above the diagonal of the pairs() plot — meaning that the amount of numerical error was above the median over the iterations
I don’t understand this. What diagonal is meant here and how is it connected to numerical error of the trajectories?
iter argument.
There’s no iter in some interfaces, only iter_warmup and iter_sampling. At some point the text seemed to implicitly start assuming that RStan is used.
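For anyone translating between the two conventions: as far as I know, in RStan `iter` counts warmup and sampling iterations together (with `warmup` defaulting to half of `iter`), while the CmdStan-based interfaces take the two counts separately as `iter_warmup` and `iter_sampling`. A minimal sketch of the conversion in Python; the function and its argument names are made up for illustration and are not part of any interface:

```python
def rstan_to_cmdstan_iters(iter_total=2000, warmup=None):
    """Convert RStan-style (iter, warmup) to CmdStan-style counts.

    iter_total plays the role of RStan's `iter` (warmup + sampling).
    If warmup is None, half of iter_total (rounded down) is used,
    which is my understanding of the RStan default.
    """
    if warmup is None:
        warmup = iter_total // 2
    if warmup < 0 or warmup > iter_total:
        raise ValueError("warmup must be between 0 and iter_total")
    return {
        "iter_warmup": warmup,
        "iter_sampling": iter_total - warmup,
    }


default_counts = rstan_to_cmdstan_iters()
custom_counts = rstan_to_cmdstan_iters(iter_total=3000, warmup=500)
```

So advice such as "increase `iter`" would translate to increasing `iter_sampling` (and possibly `iter_warmup`) in the other interfaces.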
it is essential that you follow these recommendations:
I would say that following the practices in this “getting help” section may help you but they are not really essential. Starting to use version control for example can be a big hurdle for some more applied users.
In fact, between the last draft of this document and now, I’ve written one such example at Small model implementation workflow • SBC. It relies on the SBC package for some functionality (it uses simulation-based calibration to check for bugs/problems), but the core ideas are IMHO accessible even without understanding SBC. Just looking at the sequence of Stan models built there demonstrates the core principles quite well.
I added the case study as another reference in the document.
This phrase only exists in the current version (at Runtime warnings and convergence problems), but is not found in the proposed new version (at Runtime warnings and convergence problems - HackMD), so I fear you’ve been reviewing the old version - sorry for the confusion. In the new version, we removed most of the discussion of the details of the pairs plot as it is a bit interface specific. Instead we link to relevant documentation in the packages (which hopefully contains enough info for users to find this).
That’s a good point (this wording survived into the new proposed version). I adjusted it.
The old version of the document still references the warnings output by stanc2, which at this point only applies to RStan (and hopefully not for too much longer). I noticed the new version covers runtime warnings exclusively, which is probably a good distinction.