Our current optimizer doesn't converge to a global optimum. This is easily verified by taking a model whose posterior is non-convex (or non-unimodal, or non-quasi-convex), for example a Gaussian process regression, running the optimizer from different starting points, and observing that it converges to different solutions.
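To make this concrete, here is a minimal sketch using a hypothetical non-convex objective as a stand-in for a negative log posterior (the toy function is my assumption, not the library's actual model); a local optimizer started from different points lands in different basins:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical stand-in for a non-convex negative log posterior
# (the real objective would come from the model, e.g. GP regression).
def neg_log_post(theta):
    return np.sin(3 * theta[0]) + 0.1 * theta[0] ** 2

# Running a local optimizer from different starting points yields
# different local minima, not one global optimum.
starts = (-3.0, 0.0, 3.0)
solutions = [minimize(neg_log_post, x0=[x0]).x[0] for x0 in starts]
print(solutions)  # distinct local minima
```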

I’m wondering if it would be useful to have a convex conjugate function. The definition I’m using is from Boyd and Vandenberghe, Convex Optimization. If f(\cdot|\theta) is the posterior with respect to, or a function of, parameters \theta, its convex conjugate f_*(\cdot|\theta_*) is:

f_*(\cdot | \theta_*) = \sup_{\theta \in \mathrm{dom}\, f} \left( \theta_*^T \theta - f(\cdot | \theta) \right)

Intuitively/geometrically, the conjugate encodes the supporting hyperplanes of f as a function of \theta; applying it twice, the biconjugate f_{**} is the convex envelope of the posterior (the greatest convex function lying below it, i.e. the lower boundary of the convex hull of its epigraph).
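If it helps to see the construction, here is a minimal discretized sketch (the grids and the toy objective are my assumptions, not anything in the library). It also illustrates two useful properties: f_* is always convex, and f_{**} lower-bounds f while preserving its infimum:

```python
import numpy as np

# Discretized convex (Fenchel) conjugate on a grid -- illustrative only.
theta = np.linspace(-3.0, 3.0, 1201)       # stands in for dom(f)
f = np.sin(3 * theta) + 0.1 * theta ** 2   # toy non-convex "posterior" surrogate
slopes = np.linspace(-5.0, 5.0, 1201)      # dual variables theta_*

# f*(s) = sup_theta (s * theta - f(theta)), taken over the grid.
f_star = np.max(slopes[:, None] * theta[None, :] - f[None, :], axis=1)

# f**(theta) = sup_s (theta * s - f*(s)): the convex envelope of f.
f_biconj = np.max(theta[:, None] * slopes[None, :] - f_star[None, :], axis=1)

# f* is convex (a pointwise max of affine functions); f** <= f everywhere,
# and the envelope shares f's minimum value.
print(f.min(), f_biconj.min())  # envelope preserves the infimum
```

One caveat worth noting: minimizing the envelope recovers the global minimum *value*, but where f is non-convex the envelope is flat over the "filled-in" regions, so the minimizer itself need not be unique.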

Thoughts? This could also be useful elsewhere in the library, for example in any deterministic algorithm that relies on optimization.