Non-identification, local maxima and optimization

Unfortunately, there are no general methods: every problematic model is problematic in its own way, and in the end you just need to gain a good understanding of your model and its implications. Some of the suggestions at Divergent transitions - a primer apply to this situation as well. In particular, I would try to find the smallest model that is still problematic and the most complex model that doesn’t have the problems (on data simulated to match the model). Then think hard about what changed between those two models; the difference could provide some hints.
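
To make the "smallest problematic model vs. most complex working model" comparison concrete, here is a minimal sketch on a toy example that is not from any real project. It assumes a deliberately non-identified model, y ~ Normal(a + b, sigma), and its smaller, well-behaved counterpart, y ~ Normal(m, sigma). The function names (`simulate`, `fit_one_intercept`, `fit_two_intercepts`) are hypothetical, and plain gradient descent on least squares stands in for whatever optimizer or sampler you actually use.

```python
import random


def simulate(n, mu, sigma, seed):
    """Draw n observations from Normal(mu, sigma) with a fixed seed."""
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) for _ in range(n)]


def fit_one_intercept(y, m0, steps=200, lr=0.05):
    """Smaller model: y ~ Normal(m, sigma). Least squares by gradient descent."""
    ybar = sum(y) / len(y)
    m = m0
    for _ in range(steps):
        m += lr * 2.0 * (ybar - m)  # step along the negative gradient of the MSE
    return m


def fit_two_intercepts(y, a0, b0, steps=200, lr=0.05):
    """Larger model: y ~ Normal(a + b, sigma). Only the sum a + b is identified."""
    ybar = sum(y) / len(y)
    a, b = a0, b0
    for _ in range(steps):
        step = lr * 2.0 * (ybar - (a + b))  # a and b share the same gradient
        a += step
        b += step
    return a, b


if __name__ == "__main__":
    # Data simulated to match the model, so any problem is the model's fault.
    y = simulate(n=200, mu=1.5, sigma=1.0, seed=1)

    # Larger model: each start lands on a different (a, b) with the same a + b.
    for start in [(0.0, 0.0), (5.0, -3.0), (-10.0, 2.0)]:
        a, b = fit_two_intercepts(y, *start)
        print("start", start, "-> a =", round(a, 3), "b =", round(b, 3),
              "a + b =", round(a + b, 3))

    # Smaller model: every start converges to the same estimate.
    for m0 in [0.0, 5.0, -10.0]:
        print("start", m0, "-> m =", round(fit_one_intercept(y, m0), 3))
```

In this toy case the only thing that changed between the two models is the extra intercept, so the start-dependent estimates point straight at it. Real models are rarely this obvious, but running the same multi-start comparison on simulated data is a cheap way to narrow down where the trouble enters.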

Unfortunately, for some of the ODE models people want to fit, we couldn’t find a parametrization that would work :-/