Simplex transform and hyperspherical coordinates

Dear all,

I have a very simple and possibly dumb question. In the Stan documentation you provide a stick breaking construction for unconstraining the simplex. This solution is based on the logit function.

In the footnote you cite @betanalpha's paper on “cruising the simplex”. In that paper everything is in hyperspherical coordinates, but it is claimed to be equivalent to the logit construction.

I am sad to admit I cannot see how the two constructions are equivalent. Can someone please help me figure this out?

Thank you and best regards.

Equivalent in the sense that they both lead to uniformly distributed simplexes.

@mjhajharia is working on a paper that shows a bunch of different simplex transforms. In fact, there are infinitely many.


We should make that clearer in the doc. I created a doc issue to fix it:

Cool, I am interested. @mjhajharia, please let me know when this is out.

The key result connecting the two is the first equation on page 2, second column. That is literally stick breaking for the intermediate variables z_i = \sin^{2} \theta_i. Technically, though, the z_i are the complementary proportions used in the stick breaking: 1 - z_i is the proportion allocated to the i-th element of the simplex and z_i is the proportion that remains. One can equivalently work with v_i = 1 - z_i.

The z_i are constrained to the unit interval but taking a logit completely unconstrains them,

w_i = \text{logit} z_i = \text{logit} ( \sin^{2} (\theta_i) ) \in (-\infty, \infty).

Working the other way one can start with the I - 1 unconstrained, real-valued variables w_i and then construct the z_i,

z_i = \text{logistic} \, w_i

and then use those as incremental complementary proportions,

x_i = (1 - z_i) \prod_{i' = 1}^{i - 1} z_{i'}.
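As a rough illustration of the construction above, here is a minimal NumPy sketch. The function names are made up for this example, and it assumes that the last simplex element simply takes whatever proportion remains after the I - 1 breaks, which the formula above leaves implicit.

```python
import numpy as np

def logistic(w):
    # Inverse of logit; maps the real line to (0, 1).
    return 1.0 / (1.0 + np.exp(-w))

def simplex_from_complementary(w):
    """Map I - 1 unconstrained reals w to a point x on the simplex with I elements."""
    z = logistic(np.asarray(w, dtype=float))  # z_i: proportion that remains after break i
    # remaining[i] holds the product of z over all earlier breaks (1 before the first).
    remaining = np.concatenate(([1.0], np.cumprod(z)))
    x = np.empty(z.size + 1)
    x[:-1] = (1.0 - z) * remaining[:-1]  # x_i = (1 - z_i) * prod_{i' < i} z_{i'}
    x[-1] = remaining[-1]                # assumption: the last element gets the leftover
    return x

x = simplex_from_complementary([0.3, -1.2, 2.0])
print(x, x.sum())
```

The elements are positive and sum to one because the products telescope.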

Equivalently, one can use incremental proportions directly with

v_i = 1 - \text{logistic} \, w_i

and then use those as incremental proportions,

x_i = v_i \prod_{i' = 1}^{i - 1} (1 - v_{i'}).
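To check the link to the hyperspherical angles numerically, one possible sketch is below. It builds x once straight from angles \theta_i (using 1 - \sin^{2} \theta_i = \cos^{2} \theta_i) and once from w_i = \text{logit}(\sin^{2} \theta_i) through the v_i. The function names and the test angles are invented for illustration, and the last element is again assumed to take the leftover proportion.

```python
import numpy as np

def simplex_from_angles(theta):
    # z_i = sin^2(theta_i) is the proportion that remains, cos^2(theta_i) the one allocated.
    theta = np.asarray(theta, dtype=float)
    z = np.sin(theta) ** 2
    remaining = np.concatenate(([1.0], np.cumprod(z)))
    return np.append(np.cos(theta) ** 2 * remaining[:-1], remaining[-1])

def simplex_from_direct(w):
    # v_i = 1 - logistic(w_i) is the proportion allocated to element i.
    w = np.asarray(w, dtype=float)
    v = 1.0 - 1.0 / (1.0 + np.exp(-w))
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v)))
    return np.append(v * remaining[:-1], remaining[-1])

theta = np.array([0.4, 1.1, 0.7])  # arbitrary angles in (0, pi / 2)
z = np.sin(theta) ** 2
w = np.log(z) - np.log1p(-z)       # w_i = logit(sin^2(theta_i))

x_angles = simplex_from_angles(theta)
x_direct = simplex_from_direct(w)
print(np.allclose(x_angles, x_direct))
```

Both routes give the same point on the simplex, which is the sense in which the angle-based and logit-based constructions agree.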

Note that the “Cruising the Simplex” paper works with the hyperspherical parameters directly and doesn’t actually unconstrain them. The freedom in how to unconstrain them can be used to generate a transformation exactly equivalent to what is used in Stan or something complementary to what is used in Stan. This also assumes that the ordering of the variables is fixed, which is another degree of freedom in both methods.

I’ve mentioned this a few times in other threads but let me mention again that the freedom in constraining/unconstraining transformations is a bit overblown. Any two bijective transformations between a constrained space X and an unconstrained space Y are related by another bijection on Y (i.e. a reparameterization of Y). More formally, if \phi_1 : X \rightarrow Y and \phi_2 : X \rightarrow Y are bijections then \phi_1 = \psi \circ \phi_2, where \psi = \phi_1 \circ \phi_2^{-1} : Y \rightarrow Y.
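For a concrete instance, take the two logit-based choices discussed in this thread, applied to a single stick breaking proportion so that X is the unit interval and Y is the real line (a simplification made for this sketch). Unconstraining through z or through v = 1 - z differs only by the reparameterization \psi(w) = -w. The Python names below are hypothetical and just mirror the symbols above.

```python
import numpy as np

def logit(p):
    return np.log(p) - np.log1p(-p)

def logistic(w):
    return 1.0 / (1.0 + np.exp(-w))

def phi_1(z):
    # Unconstrain through the proportion that remains.
    return logit(z)

def phi_2(z):
    # Unconstrain through the allocated proportion v = 1 - z.
    return logit(1.0 - z)

def phi_2_inv(w):
    return 1.0 - logistic(w)

def psi(w):
    # psi = phi_1 composed with the inverse of phi_2, a bijection on the real line.
    return phi_1(phi_2_inv(w))

w = np.linspace(-3.0, 3.0, 7)
z = np.array([0.1, 0.5, 0.9])
print(np.allclose(psi(w), -w))                 # psi is just a sign flip here
print(np.allclose(phi_1(z), psi(phi_2(z))))    # phi_1 = psi composed with phi_2
```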

The question of which transformation is “best” is equivalent to asking which parameterization of the unconstrained space is “best”, which depends on the target distribution and is practically intractable.
