Hey, I just fit a model and got the following warning message:
1: There were 18 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
2: Examine the pairs() plot to diagnose sampling problems
I have no problem with the warning message–if my chains have divergent transitions, I’d like to know. I’m not really sure if 18 out of 4000 is a little or a lot, so it could be useful to have guidance on that. But that’s not my main concern here.
Also I like the pointer to the documentation–that’s great–and it makes sense to suggest the pairs() plot.
The thing that’s bothering me is the suggestion, “Increasing adapt_delta above 0.8 may help.” I’m kinda worried that this encourages people to sweep problems under the rug, and also that it pushes people toward a slower version of Stan where, to be safe, they set adapt_delta to 0.999 or whatever.
Maybe the warning message could say something like, “Your model may be poorly identified. Consider using stronger priors.”
Or something like that? I’m not sure of the exact wording. I just know that in practice it can help to use stronger priors, and typically this prior info is available.
Not always–I recognize that sometimes you’re trying to fit the model you want to fit, and it’s just weakly identified, and you want Stan to explore the damn posterior distribution–but often your computational problems can be fixed with just a bit of regularization.
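To make the identification point a bit more concrete, here is a minimal sketch in Python. The model is a made-up toy, not the model that produced the warning above: data y_i ~ normal(a + b, sigma), so only the sum a + b is identified by the data, with independent normal(0, tau) priors on a and b. The function name posterior_sds and the defaults n = 100 and sigma = 1 are assumptions chosen for illustration. The posterior is Gaussian here, so the posterior sds can be written in closed form without running Stan at all.

```python
import math

def posterior_sds(tau, n=100, sigma=1.0):
    """Posterior sds of a and of a + b in the toy model
    y_i ~ normal(a + b, sigma), with priors a, b ~ normal(0, tau).

    The posterior precision matrix is [[p + q, q], [q, p + q]], where
    p = 1 / tau^2 is the prior precision and q = n / sigma^2 is the
    data precision. Inverting that 2x2 matrix gives the variances below.
    The sds do not depend on the observed y values.
    """
    p = 1.0 / tau**2
    q = n / sigma**2
    det = p * (p + 2.0 * q)          # determinant of the precision matrix
    var_a = (p + q) / det            # marginal posterior variance of a
    var_sum = 2.0 / (p + 2.0 * q)    # posterior variance of a + b
    return math.sqrt(var_a), math.sqrt(var_sum)

# Compare a very weak prior (tau = 100) with a stronger one (tau = 1).
for tau in (100.0, 1.0):
    sd_a, sd_sum = posterior_sds(tau)
    print(f"prior sd {tau:>5}: sd(a) = {sd_a:.2f}, sd(a + b) = {sd_sum:.3f}")
```

Under these assumptions the posterior sd of a is about 71 with the weak prior and about 0.7 with the stronger one, while the sd of a + b stays near 0.1 either way. This doesn't reproduce divergent transitions; it only shows the long, thin ridge that a weak prior can leave in the posterior, which is the kind of geometry that can make sampling harder, and how a bit of regularization shortens it.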
It would be good if the warning message were to say this. Alternatively, the warning message could leave this out and not mention adapt_delta at all either. It could just point to the documentation page, which would have all this discussion. But I don’t think it’s good that right now we privilege the adapt_delta suggestion.