New Transform for Orthonormal Matrices in Stan


Also, P has a determinant of 1 by construction, so this excludes the orthogonal matrices that have a determinant of -1. If you include those, then you have to deal with the 2^p possible reflections.


Hm, did you flip a sign somewhere? Shouldn’t it be


with \kappa having positive sign?

edit: also odd you get difficult geometry for \kappa=0 since that should be the uniform case, no?


I was trying to do it the way we would do it in Stan if this were implemented as a type, based on the equation before equation (11) in the paper that is a PDF for the skew-symmetric matrix. Since it is in the denominator, the whole exponent gets negated when you take the logarithm.

I think the reason it is weird when kappa = 0 is because there is a multivariate t underlying it with only p degrees of freedom.


Ah, right, I was looking at the density over the orthogonal matrices, but I agree that the density over s is a more suitable target. But annoying that it is weak at sampling from the uniform distribution in particular. Does the student-t relationship indicate that it is sampling from a degenerate space, or is it only near-degenerate?


I don’t think it is degenerate; it is just heavy tailed. It would probably be okay if the data were even slightly cooperative.


Got around to trying it in a factor model - seems to mix well enough, even with \kappa=0. I do get quite low BFMI, but I suspect this is due to rotational invariance inherent to factor analysis models.

edit:There is also the issue of only needing the first few vectors of the full rotation matrix - this likely introduces additional ridge geometries.

edit: I was wondering whether we could just sample orthogonal vectors sequentially. Sampling one direction x_1, we can construct the orthogonal projection matrix \bar{P}_1 and apply it to the next sampled direction \bar{P}_1\tilde{x}_2. As above, there is the issue that the effective distribution over \tilde{x}_2 will be degenerate, as the part parallel to x_1 will be cancelled out. Any way to reparameterize a degenerate Gaussian?


You can just do it directly in terms of the semi-definite covariance matrix (though we don’t support that in multi-normal directly, you can code it yourself) if you know you will only be feeding it things meeting the constraints. But ideally, you’d work out the marginals on the free dimensions and then make the rest transformed parameters.