Huge contrast in sampling time among chains for the same model?

I had that happen with a large Gaussian process model and the default adapt_delta=0.8 parameter. The biggest problem was that the chains did not mix properly, and individual chains would take anywhere from 1 to 10 days.
From the discussion here I gathered that the value was too low and therefore the chains were not equally tuned after the burn-in (warmup) period (or something along those lines, I’m not familiar with all the details of the NUTS tuning). Maybe you can check whether the different chains are mixing as expected, so that the only difference between them is the time they take.
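In case it helps, here is a rough sketch of one way to check mixing by hand: the classic split R-hat for a single scalar parameter, in plain Python. This is only an illustration under my own assumptions. The function name split_rhat and the simulated draws are made up for the example, and it is the older non-rank-normalized formula, not necessarily what your Stan interface reports in its summary.

```python
import random
from statistics import mean, variance


def split_rhat(chains):
    """Classic split R-hat for one scalar parameter.

    `chains` is a list of equal-length lists of post-warmup draws.
    (Hypothetical helper for illustration, not part of any Stan interface.)
    """
    # Split each chain in half so drift within a chain also shows up.
    half = len(chains[0]) // 2
    halves = []
    for draws in chains:
        halves.append(draws[:half])
        halves.append(draws[half:2 * half])

    n = half
    within = mean(variance(h) for h in halves)        # W: mean within-chain variance
    between = n * variance([mean(h) for h in halves])  # B: between-chain variance
    var_plus = (n - 1) / n * within + between / n      # pooled variance estimate
    return (var_plus / within) ** 0.5


# Simulated draws standing in for real output (assumption: 4 chains, 1000 draws each).
rng = random.Random(0)
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
# Same draws, but with one chain stuck in a different region.
stuck = mixed[:3] + [[x + 3.0 for x in mixed[3]]]

print("well mixed:", round(split_rhat(mixed), 3))  # should be close to 1
print("one stuck chain:", round(split_rhat(stuck), 3))  # clearly above 1
```

Values close to 1 suggest the chains agree with each other; values well above 1 (a common rule of thumb is above about 1.01 to 1.1, depending on whom you ask) suggest they have not mixed, whatever their run times were.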

I increased the value to adapt_delta=0.9 and mixing got better (I may still need to increase it to 0.99 and/or run longer chains), and the between-chain variation in elapsed times decreased to a range of something like 2-3 days (although I ended up changing other parts of the model, so I can’t compare directly).
Maybe try that first; if the elapsed times end up closer together across chains, you’ll know in a few hours.