Global Optimization via the Convex Conjugate

Our current optimizer doesn’t converge to a global optimum. This is easily verified by taking a model whose posterior is non-convex (or non-unimodal, or non-quasi-convex), for example a Gaussian process regression, running the optimizer from different initializations, and observing that it converges to different solutions.
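
To make that concrete, here is a minimal sketch (plain NumPy/SciPy on a toy bimodal density, not our service code) showing that the mode a local optimizer like L-BFGS reaches depends entirely on the starting point:

```python
# Toy illustration only: multistart local optimization on a bimodal
# negative log density ends up in different modes for different starts.
import numpy as np
from scipy.optimize import minimize

def neg_log_posterior(theta):
    # 1-D mixture of two Gaussians centered at -2 and +2 (unnormalized).
    x = theta[0]
    p = 0.5 * np.exp(-0.5 * (x + 2.0) ** 2) + 0.5 * np.exp(-0.5 * (x - 2.0) ** 2)
    return -np.log(p)

for start in (-3.0, 0.5, 3.0):
    result = minimize(neg_log_posterior, x0=np.array([start]), method="L-BFGS-B")
    print(f"start = {start:+.1f}  ->  converged to {result.x[0]:+.3f}")
```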

I’m wondering if it would be useful to have a convex conjugate function. The definition I’m using is from Boyd and Vandenberghe, Convex Optimization. If f(\cdot | \theta) is the posterior with respect to, or a function of, the parameters \theta, the convex conjugate f_*(\cdot | \theta_*) is:

f_*(\cdot | \theta_*) = \sup_{\theta \in \operatorname{dom} f} \left( \theta_*^T \theta - f(\cdot | \theta) \right)
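
As a quick worked example (a purely illustrative assumption, since real posteriors are not this nice): if f(\cdot | \theta) = \frac{1}{2} \theta^T A \theta for a positive definite matrix A, the supremum is attained at \theta = A^{-1} \theta_*, so

f_*(\cdot | \theta_*) = \frac{1}{2} \theta_*^T A^{-1} \theta_*

i.e. the conjugate of a convex quadratic is another convex quadratic in \theta_*.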

Intuitively/geometrically, the resulting shape is the convex hull of the joint posterior over the parameters \theta.
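
For anyone who wants to poke at this numerically, here is a minimal grid-based sketch (toy double-well function, nothing library-specific). It computes the conjugate as a pointwise supremum over a grid; applying the transform twice gives the biconjugate, which is convex and never exceeds the original function, which is the convex-hull picture I have in mind:

```python
# Toy illustration only: grid-based convex conjugate and biconjugate.
import numpy as np

theta = np.linspace(-3.0, 3.0, 601)   # parameter grid
f = (theta ** 2 - 1.0) ** 2           # a nonconvex (double-well) "negative log posterior"

def conjugate(values, grid, slopes):
    # f_*(s) = sup over the grid of (s * theta - f(theta)), for each slope s.
    return np.max(slopes[:, None] * grid[None, :] - values[None, :], axis=1)

slopes = np.linspace(-20.0, 20.0, 801)
f_star = conjugate(f, theta, slopes)         # conjugate, evaluated on the slope grid
f_biconj = conjugate(f_star, slopes, theta)  # biconjugate, back on the theta grid

# The biconjugate is convex and lies below f everywhere (its convex envelope).
assert np.all(f_biconj <= f + 1e-9)
```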

Thoughts? This could be used elsewhere in the library, for example in any deterministic algorithm that relies on optimization.

Do you mean the L-BFGS used in the service modes for optimization, or something else?

That wasn’t enough of a description for me to understand it; I don’t have a strong background in geometry or optimization. Is there a reference for this technique somewhere that I could read up on?

Thanks for the comments. I’ll draft a more detailed proposal. A good reference is Convex Optimization by Boyd and Vandenberghe. Section 3.3 introduces the conjugate, but it is used elsewhere in the textbook as well; the underlying theory is in the earlier chapters.

I was incorrect in saying it’s the convex hull. It’s an approximation.

Thanks for the reference—there’s an open-access version from the author.

Forget this. It will be irrelevant once Charles finishes the Laplace approximation. If there’s still a need for it afterwards, I’ll pick it back up once my optimization chops are better, after my masters. Thanks for the input and cooperation, as always, Bob. I have a better idea that doesn’t require any math skills, which I’ll post soon.