Throwing out another idea: this stuff is sensitive to starting points, and if it mostly works but isn’t very reliable, it might be a good target for multi-threading. Start a few solvers (using futures) and use the first one that terminates correctly.
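A minimal sketch of the idea, assuming Python with `concurrent.futures` and a toy 1-D Newton solver (the names `newton` and `multistart` are hypothetical, not from any existing codebase): launch one solver per starting point and take the first one that finishes successfully.

```python
import concurrent.futures

def newton(grad, hess, x0, tol=1e-10, max_iter=50):
    """Newton's method on grad(x) = 0 from one starting point."""
    x = float(x0)
    for _ in range(max_iter):
        step = grad(x) / hess(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("did not converge")

def multistart(grad, hess, starts):
    """Race one solver per starting point; return the first successful result."""
    with concurrent.futures.ThreadPoolExecutor() as pool:
        futures = [pool.submit(newton, grad, hess, s) for s in starts]
        for fut in concurrent.futures.as_completed(futures):
            try:
                return fut.result()
            except RuntimeError:
                continue  # this starting point failed; wait for the others
    raise RuntimeError("all starting points failed")

# Example: minimize f(x) = x**4 - 3*x**2, whose minima sit at x = ±sqrt(1.5).
grad_f = lambda x: 4 * x**3 - 6 * x
hess_f = lambda x: 12 * x**2 - 6
```

With `starts=[2.0, -2.0]` this converges to one of the two minima; which one wins the race depends on scheduling, which is exactly the "first correct terminator" behavior described above.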
This is what we do in INLA.
Sounds like a pretty neat concept.
Awesome! It’s probably even a thing with a name.
The initial models we should be examining now should be very stable, and there should be no need for multiple starting points. If Newton is unstable, then I guess you are using some other model than the one I’m thinking of, Newton means something different to you than to me, or there is an error in your code…
The models I have are pretty much what is already there, with some minor changes plus some code to build a testing framework. I’m also working on gradient descent and accelerated gradient descent. It’s currently in a private repo, but if there is enough interest, I can make it public.
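For reference, a hedged sketch of what an accelerated (Nesterov-style) gradient descent step looks like in 1-D, not the actual implementation in the repo (the function name `nesterov` and all parameter defaults here are illustrative assumptions):

```python
def nesterov(grad, x0, lr=0.1, momentum=0.9, max_iter=500, tol=1e-8):
    """Nesterov accelerated gradient descent on a 1-D objective."""
    x = float(x0)
    v = 0.0
    for _ in range(max_iter):
        # Key difference from plain momentum: evaluate the gradient
        # at the "look-ahead" point x + momentum * v, not at x itself.
        g = grad(x + momentum * v)
        v = momentum * v - lr * g
        x = x + v
        if abs(v) < tol:
            break
    return x
```

On a smooth convex objective such as f(x) = (x - 3)², this drives x toward the minimizer at 3; the look-ahead gradient is what distinguishes Nesterov acceleration from classical heavy-ball momentum.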