Is it safe to ignore max_treedepth warnings if other diagnostics (ESS, Rhat, loo) are acceptable?

I’m attempting to productionize a model. In a typical run, between 3 and 4000 iterations require a treedepth between 13 and 15. Unfortunately, at a max_treedepth of 15, the model fits too slowly to be usable in production.

Efforts to resolve the treedepth warnings by reparameterizing the model have been unsuccessful.

However, ESS, Rhat, and loo diagnostics are acceptable. On visual inspection, the posteriors of the key parameters do not appear to be different when the model is run to a max_treedepth of 10 vs. 15.
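One way to make the visual comparison a little more concrete is to express the difference between the two runs' posterior means in units of their combined Monte Carlo standard error (MCSE). The sketch below is only an illustration: the function name and all numbers are made up, and it assumes the two runs are independent and that an MCSE for each posterior mean is available from your summary output.

```python
import math


def mean_difference_in_mcse_units(mean_a, mcse_a, mean_b, mcse_b):
    """Difference between two posterior-mean estimates, scaled by the
    combined MCSE of the two (assumed independent) runs."""
    combined = math.sqrt(mcse_a ** 2 + mcse_b ** 2)
    return (mean_a - mean_b) / combined


# Hypothetical summaries for one parameter (made-up numbers):
# run A with max_treedepth = 10, run B with max_treedepth = 15.
z = mean_difference_in_mcse_units(mean_a=1.52, mcse_a=0.03,
                                  mean_b=1.49, mcse_b=0.02)
print(f"difference = {z:.2f} combined MCSEs")
```

A difference of well under two combined MCSEs is consistent with the two runs agreeing up to Monte Carlo noise. This only checks the posterior mean, so quantiles or tail probabilities that matter to the application would need the same treatment.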

Is it safe to disregard the max_treedepth warnings in this case?

To quote from Stan’s guide on warnings:

Warnings about hitting the maximum treedepth are not as serious as warnings about divergent transitions. While divergent transitions are a validity concern, hitting the maximum treedepth is an efficiency concern. Configuring the No-U-Turn-Sampler (the variant of HMC used by Stan) involves putting a cap on the depth of the trees that it evaluates during each iteration (for details on this see the Hamiltonian Monte Carlo Sampling chapter in the Stan manual). This is controlled through a maximum depth parameter max_treedepth. When the maximum allowed tree depth is reached it indicates that NUTS is terminating prematurely to avoid excessively long execution time.
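To give a rough sense of why the cap matters for run time: a tree of depth d corresponds to roughly 2^d - 1 leapfrog steps, so each extra level about doubles the worst-case cost of an iteration. The sketch below is a minimal illustration of that arithmetic and of counting how often the cap is hit. The function names and the list of treedepths are hypothetical; in practice the per-iteration depths would come from the sampler's treedepth__ output.

```python
def max_leapfrog_steps(treedepth):
    """Approximate worst-case number of leapfrog steps for one iteration
    whose tree reaches the given depth (2**depth - 1)."""
    return 2 ** treedepth - 1


def fraction_at_cap(treedepths, max_treedepth):
    """Fraction of iterations whose tree reached the configured cap."""
    hits = sum(1 for d in treedepths if d >= max_treedepth)
    return hits / len(treedepths)


# Made-up per-iteration treedepths, for illustration only.
depths = [9, 10, 10, 8, 10, 7, 10, 10]

steps_at_10 = max_leapfrog_steps(10)
steps_at_15 = max_leapfrog_steps(15)
print(steps_at_10, steps_at_15, steps_at_15 / steps_at_10)
print(fraction_at_cap(depths, max_treedepth=10))
```

Under this approximation, an iteration that needs depth 15 costs about 32 times as many gradient evaluations as one stopped at depth 10, which is consistent with the slowdown described in the question.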

So if you are happy with how all the other diagnostics look, then I would say you are safe.


The max_treedepth warning occurs when Stan truncates the Hamiltonian trajectories it uses to explore. This truncation does not necessarily affect the accuracy of the sampling, but it can strongly influence the precision. Running with truncated trajectories will decrease the performance of each Hamiltonian transition. This results in much smaller effective sample sizes, and less precise Markov chain Monte Carlo estimators, per unit time.

Assuming that no other diagnostics are indicating problems, running with a smaller max_treedepth is okay if the effective sample sizes you get with truncated trajectories are large enough for your application, but you have to explicitly verify that you’re getting precise enough estimators.
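As a minimal sketch of what such a check could look like, the Monte Carlo standard error of a posterior mean is approximately the posterior standard deviation divided by the square root of the effective sample size. The function names, the tolerance, and the numbers below are hypothetical; the acceptable tolerance depends entirely on the application.

```python
import math


def mcse_of_mean(posterior_sd, ess):
    """Approximate Monte Carlo standard error of a posterior mean."""
    return posterior_sd / math.sqrt(ess)


def precise_enough(posterior_sd, ess, tolerance):
    """True if the MCSE is within an application-specific tolerance."""
    return mcse_of_mean(posterior_sd, ess) <= tolerance


def ess_per_second(ess, seconds):
    """Sampling efficiency: effective samples obtained per unit run time."""
    return ess / seconds


# Made-up numbers for a single parameter under a truncated run.
sd, ess, runtime_s = 2.0, 400.0, 80.0
print(mcse_of_mean(sd, ess))
print(precise_enough(sd, ess, tolerance=0.15))
print(ess_per_second(ess, runtime_s))
```

Comparing ESS per second between max_treedepth settings, rather than raw ESS, reflects the "per unit time" framing above: a lower cap is only a win if it yields the required precision in less wall-clock time.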

At the same time, long trajectories usually indicate that your posterior is highly degenerate, and if you can identify and moderate the source of the degeneracy then you will be able to resolve the treedepth problems without compromising the performance. See for example Identity Crisis.


At the same time, long trajectories usually indicate that your posterior is highly degenerate

You’re right, of course. The underlying problems are that the model doesn’t fit the data very well and is only weakly identified. Not my model…
