I’ve been fitting a large model using NUTS. I find that the maximum treedepth is often exceeded during the warmup stage, with transitions reaching treedepths of 17 and 18. This makes running the model exceedingly slow.
Surprisingly, I notice the model doesn’t need a large treedepth when sampling - the treedepth never exceeds 11 during sampling. This suggests to me that the posterior geometry is ‘well behaved’, but it seems that NUTS is having some trouble adapting.
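To put rough numbers on the slowdown, here is a quick sketch. It assumes the usual NUTS doubling scheme, in which a transition that reaches treedepth d takes up to 2^d - 1 leapfrog steps (one gradient evaluation each). The function name `max_leapfrog_steps` is just my own helper, not part of any library.

```python
def max_leapfrog_steps(treedepth: int) -> int:
    """Upper bound on leapfrog steps for one NUTS transition.

    Assumes the standard doubling scheme: a full binary tree of
    depth `treedepth` contains 2**treedepth - 1 leapfrog steps.
    """
    if treedepth < 0:
        raise ValueError("treedepth must be non-negative")
    return 2 ** treedepth - 1


if __name__ == "__main__":
    baseline = max_leapfrog_steps(11)  # deepest tree seen during sampling
    for depth in (11, 17, 18):         # depths mentioned in this post
        steps = max_leapfrog_steps(depth)
        print(f"treedepth {depth}: up to {steps} leapfrog steps "
              f"({steps / baseline:.0f}x the depth-11 cost)")
```

So, under that assumption, a single warmup transition at treedepth 18 can cost roughly 128 times as many gradient evaluations as the deepest transition I see during sampling.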
What would be best practice in this case? Perhaps limiting the treedepth to a reasonable value during warmup, but not during sampling? Though this could lead to worse adaptation. I’m also not sure which reparameterisations, if any, would be helpful here.