Understanding non-bijective variable transformations

Generally there is no immediate practical solution.

Firstly we need to be more precise. Manifolds are very general spaces and cover every possible application of Hamiltonian Monte Carlo; in particular all real spaces are manifolds.

Consequently there are a few ways to interpret the problem:

  1. A user wants to define a model over a manifold not isomorphic to some real space.
  2. A user wants to define a model over a space implicitly defined by some real space \mathbb{R}^{N} and a sufficiently nice constraint function c: \mathbb{R}^{N} \rightarrow \mathbb{R}. For example the implicit space is defined by the kernel of the constraint function, c^{-1}(0). That implicit space may or may not be Euclidean.

(1) is not possible directly.

Recall that Stan provides an implementation of Hamiltonian Monte Carlo for real spaces. In particular Stan requires that the target distribution is defined by a probability density function, which requires global coordinates that span the entire ambient space. This immediately restricts the application of Stan to certain topological spaces: spheres, tori, Stiefel manifolds, and the like are all inadmissible because of their topological properties (none of these spaces admit global coordinates, i.e. they are not isomorphic to \mathbb{R}^{N} for any N).

Because of its geometric foundations Hamiltonian Monte Carlo can be implemented on any smooth manifold, including those not isomorphic to a real space. The implementations are more complicated (building symplectic integrators over these spaces requires being able to calculate evolution along geodesic curves, states have to be saved in local charts, etc.), but perhaps a bigger limitation is that valid expectation values, and hence natural posterior summaries, become much more subtle (no natural moments, cumulants, or quantiles). For these reasons and more Stan has kept its scope limited to real ambient spaces.

What is possible?

Well, if the target distribution strongly concentrates in some subset of the ambient space then it might be well-approximated by a target distribution that is exactly zero outside of that subset. If that patch of the ambient space is isomorphic to some real space then the approximate target distribution can be implemented in Stan. In differential geometric language this means that the target distribution concentrates within a single chart, so that the model can be implemented within that one chart (the chart being the formal name for those “almost bijections” mentioned in a few places in the thread).
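As a rough illustration of the single-chart idea, here is a minimal sketch that assumes a circle with the angle chart \theta \in (-\pi, \pi) and a von Mises target centered at zero. The function name, the concentration kappa = 50, and the window half-width are all hypothetical choices made for the example, not anything from Stan. It estimates how much mass sits far from the center of the chart; if that number is negligible, truncating the target to the chart changes almost nothing.

```python
import math

def mass_outside_window(kappa, half_width, n_grid=20000):
    """Fraction of von Mises(0, kappa) mass with |theta| > half_width.

    Uses a midpoint rule on the angle chart (-pi, pi). The density is
    evaluated as exp(kappa * (cos(theta) - 1)) so the largest term is 1.
    """
    h = 2.0 * math.pi / n_grid
    total = 0.0
    outside = 0.0
    for i in range(n_grid):
        theta = -math.pi + (i + 0.5) * h
        w = math.exp(kappa * (math.cos(theta) - 1.0))
        total += w
        if abs(theta) > half_width:
            outside += w
    return outside / total

# Hypothetical example: a strongly concentrated target on the circle.
leak = mass_outside_window(kappa=50.0, half_width=math.pi / 2)
print(leak)
```

When the leaked mass is this small the wrapped-around part of the circle is effectively never visited, so a model written in the angle coordinate alone should behave like the original. With a small kappa the same check reports a large leak, which is the case where a single chart is not enough.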

Alternatively it might be possible to embed the ambient space into a higher-dimensional space that is isomorphic to a real space (mathematically this is always possible, but it might not be obvious how to do it in practice). If the model can be lifted to that higher-dimensional real space then it can be implemented there directly. This is how the unit vector type is implemented in Stan.
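The embedding idea can be sketched for the sphere as follows. This is only an illustration under stated assumptions, not Stan's actual code: the helper names are made up, and putting an independent standard normal on the ambient point (so that the otherwise unidentified radius has a proper distribution) is an assumption of this sketch. The point lives in \mathbb{R}^{N}, where global coordinates exist, and the quantity of interest is its direction.

```python
import math
import random

def unit_vector_from_unconstrained(y):
    """Map a nonzero point y in R^N to its direction on the unit sphere."""
    r = math.sqrt(sum(v * v for v in y))
    if r == 0.0:
        raise ValueError("the origin has no direction")
    return [v / r for v in y]

def sample_uniform_on_sphere(n, rng):
    """Draw y ~ Normal(0, I) in R^n and normalize it.

    An isotropic Gaussian is rotation invariant, so the direction of y
    is uniformly distributed on the sphere.
    """
    y = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return unit_vector_from_unconstrained(y)

rng = random.Random(0)
x = sample_uniform_on_sphere(3, rng)
print(x, sum(v * v for v in x))
```

The sampler only ever sees the real space \mathbb{R}^{N}; the sphere appears as a derived quantity. The price is one extra dimension (the radius) and a singular point at the origin that the map cannot handle.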

(2) is almost the opposite problem, where we start in a high-dimensional space.

While not implemented in Stan, there are Hamiltonian Monte Carlo implementations that can be defined for embedded spaces implicitly defined by a constraint function; see for example “Markov Chain Monte Carlo on Constrained Spaces”, which uses the RATTLE symplectic integrator to generate transitions confined to the embedded space. The practical performance of these methods, however, will depend on the complexity of the embedded space.
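To give a feel for what a constrained integrator does, here is a minimal sketch of a RATTLE-style step for one especially simple case: the unit sphere, written as c(q) = (|q|^2 - 1) / 2 = 0, with unit mass. It is not the implementation from the reference above. The function names, the potential U(q) = -kappa * q[0], and the step size are hypothetical, and the sphere is chosen because its Lagrange multipliers can be solved in closed form; a general constraint would need an iterative solve. Only the deterministic integrator is shown, with no momentum resampling or Metropolis correction.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def rattle_step_sphere(q, p, grad_u, eps):
    """One RATTLE-style step confined to the unit sphere.

    Assumes |q| = 1 and q . p = 0 on entry; returns (q_new, p_new)
    with the same two properties, up to rounding.
    """
    # Half-step momentum kick from the potential, before any constraint force.
    g = grad_u(q)
    a = [pi - 0.5 * eps * gi for pi, gi in zip(p, g)]
    # Position update q_new = s * q + eps * a, where s = 1 - eps * lambda
    # absorbs the multiplier. Pick the root near 1 that gives |q_new| = 1.
    qa = dot(q, a)
    disc = 1.0 - eps * eps * (dot(a, a) - qa * qa)
    if disc < 0.0:
        raise ValueError("step size too large for the constraint solve")
    s = -eps * qa + math.sqrt(disc)
    q_new = [s * qi + eps * ai for qi, ai in zip(q, a)]
    # Constrained half-step momentum implied by the position update.
    p_half = [(qn - qi) / eps for qn, qi in zip(q_new, q)]
    # Second kick, then project onto the tangent space at q_new.
    g_new = grad_u(q_new)
    b = [ph - 0.5 * eps * gi for ph, gi in zip(p_half, g_new)]
    mu = dot(q_new, b) / dot(q_new, q_new)
    p_new = [bi - mu * qi for bi, qi in zip(b, q_new)]
    return q_new, p_new

# Hypothetical example: potential U(q) = -kappa * q[0] on the 2-sphere.
kappa = 1.0
grad_u = lambda q: [-kappa, 0.0, 0.0]
q, p = [0.0, 0.0, 1.0], [0.3, -0.2, 0.0]
for _ in range(200):
    q, p = rattle_step_sphere(q, p, grad_u, 0.05)
print(dot(q, q), dot(q, p))
```

Every step ends exactly on the constraint surface with a tangent momentum, which is what "transitions confined to the embedded space" means in practice. For a more complicated constraint function the two solves become nonlinear, which is one reason performance depends on the complexity of the embedded space.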

If the implicit space is isomorphic to a real space then it will be possible to parameterize the embedded space directly. If the model is defined over the (higher-dimensional) embedding space, however, then one has to figure out how to project the higher-dimensional density function down to a lower-dimensional density function, which brings us back to those nasty integrals I mentioned above.

If the implicit space is not isomorphic to a real space then we’re back to the problems introduced in (1).
