I have a 64-core server, so I ran a fit with 60 chains. One chain behaved very badly (it was off by two orders of magnitude, which may be a hint that we should have been fitting a log-scaled parameter), and as a result my R-hat values were pretty bad. However, the other 59 chains converged.
When I compared the model predictions to the data, they actually did a pretty good job.
So my question is: is there ever a case where it is OK to ignore one bad chain? Or is the best practice to fix the problem (e.g. run a longer warmup or rescale the parameters) and re-fit?
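
To make the situation concrete, here is a minimal sketch of how a single stray chain can dominate R-hat. It uses synthetic draws (not my actual fit) and the classic Gelman-Rubin formula rather than the split or rank-normalized R-hat that current tools report, so the numbers are only illustrative. The names `basic_rhat`, `good`, and `bad` are made up for this example, and the chain locations (1 vs. 100) are just an assumption standing in for "off by two orders of magnitude".

```python
import numpy as np

def basic_rhat(chains):
    """Classic (non-split) Gelman-Rubin R-hat for one parameter.

    chains: array of shape (n_chains, n_draws).
    """
    chains = np.asarray(chains, dtype=float)
    n_draws = chains.shape[1]
    # W: average within-chain variance
    within = chains.var(axis=1, ddof=1).mean()
    # B / n: variance of the per-chain means
    between_over_n = chains.mean(axis=1).var(ddof=1)
    # Pooled variance estimate
    var_plus = (n_draws - 1) / n_draws * within + between_over_n
    return np.sqrt(var_plus / within)

rng = np.random.default_rng(0)

# Synthetic stand-in: 59 chains near 1.0, one chain about 100x larger.
good = rng.normal(loc=1.0, scale=0.1, size=(59, 1000))
bad = rng.normal(loc=100.0, scale=0.1, size=(1, 1000))
all_chains = np.concatenate([good, bad], axis=0)

rhat_all = basic_rhat(all_chains)   # includes the stray chain
rhat_good = basic_rhat(good)        # stray chain dropped

print(f"R-hat with all 60 chains: {rhat_all:.2f}")
print(f"R-hat with 59 chains:     {rhat_good:.3f}")
```

In this toy setup the R-hat with all 60 chains is far above any usual threshold, while the 59-chain value is close to 1. That only shows that one chain can account for the bad diagnostic; it does not by itself say whether dropping that chain is justified.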