 # Global Optimization via the Convex Conjugate

Our current optimizer doesn’t converge to a global optimum. This is easy to verify: take a model whose posterior is non-convex (or non-unimodal, or non-quasi-convex), for example a Gaussian process regression, run the optimizer from different initializations, and observe that it converges to different solutions.
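To make that concrete, here is a minimal sketch (plain gradient descent on a toy one-dimensional objective, not the library’s optimizer) showing two starting points landing in two different local optima:

```python
# Toy non-convex objective f(x) = (x^2 - 1)^2, standing in for a multimodal
# negative log posterior (illustrative only; not a real GP posterior).
def grad(x):
    return 4.0 * x * (x * x - 1.0)

def gradient_descent(x0, lr=0.01, steps=500):
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

# Different initializations converge to different local optima.
sol_a = gradient_descent(-2.0)  # ends near -1
sol_b = gradient_descent(+2.0)  # ends near +1
```

Any deterministic local method (L-BFGS included) has the same failure mode on this objective.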

I’m wondering if it would be useful to have a convex conjugate function. The definition I’m using is from Boyd and Vandenberghe, Convex Optimization. If f(\cdot|\theta) is the posterior as a function of the parameters \theta, the convex conjugate f_*(\cdot|\theta_*) is:

f_*(\cdot | \theta_*) = sup_{\theta \in dom(f)}(\theta_*^T \theta - f(\cdot | \theta))

Intuitively/geometrically, this gives the convex hull of the joint posterior over the parameters \theta.
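As a sanity check of the definition, here is a brute-force grid version of the conjugate (the grid and the example f are illustrative assumptions, not library code):

```python
import numpy as np

# Brute-force conjugate on a grid: f*(s) = sup_theta (s * theta - f(theta)),
# with f restricted to [-3, 3]. The example f is non-convex.
theta = np.linspace(-3.0, 3.0, 2401)
f_vals = (theta**2 - 1.0) ** 2

def conjugate(s):
    return np.max(s * theta - f_vals)
```

Whatever f is, the conjugate is convex in s (it is a pointwise supremum of affine functions of s), which is the property that makes it attractive here.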

Thoughts? It could be used elsewhere in the library, for example in any deterministic algorithm that uses optimization.

You mean the L-BFGS used in the service modes for optimization or something else?

That wasn’t enough of a description for me to understand it; I don’t have a strong background in geometry or optimization. Is there a reference to this technique somewhere that I could read up on?

Thanks for the comments. I’ll draft a more detailed proposal. A good reference is Convex Optimization by Boyd and Vandenberghe. Section 3.3 introduces the conjugate, and it’s referenced elsewhere in the textbook; the theory is in the earlier chapters.

I was incorrect in saying it’s the convex hull. It’s an approximation.
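For the record, it’s the biconjugate f** (the conjugate applied twice) that recovers the convex envelope, so f** lower-bounds f and flattens its non-convex region. A grid sketch (the grids and the example f are illustrative assumptions):

```python
import numpy as np

# Biconjugate on a grid: f**(theta) = sup_s (theta * s - f*(s)) is the
# convex envelope of f (here restricted to [-3, 3]).
theta = np.linspace(-3.0, 3.0, 1201)
s = np.linspace(-60.0, 60.0, 1201)

f_vals = (theta**2 - 1.0) ** 2                                # non-convex example
f_star = np.max(s[:, None] * theta[None, :] - f_vals[None, :], axis=1)
f_bistar = np.max(theta[:, None] * s[None, :] - f_star[None, :], axis=1)

# The envelope never exceeds f, and it fills in the non-convex dip between
# the two minima at theta = -1 and theta = +1 (f(0) = 1, but f**(0) = 0).
```

So an optimizer run on f** would find the flat region rather than the original minima, which is why this is at best an approximation scheme, not a drop-in replacement for the posterior.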

Thanks for the reference—there’s an open-access version from the author.

Forget this. It will be irrelevant once Charles finishes the Laplace approximation. If there’s still a need for it afterwards, I’ll hop back in once my optimization chops are better, after my master’s. Thanks for the input and cooperation, as always, bob. I have a better idea that doesn’t require any math skills, which I’ll post soon.