I’ve been digging into ways to parameterize an orthonormal matrix. Here’s a simple approach based on the QR decomposition in which the variable \beta is distributed uniformly over K \times N orthonormal matrices (the N columns are pairwise orthogonal and unit length (i.e., \beta^\top \cdot \beta = \textrm{I}).
The approach is straightforward. Generate a K \times N matrix of independent standard normal variates, QR factor it, and the resulting Q matrix is your uniformly distributed orthonormal matrix.
ortho-qr.stan
data {
int<lower=1> K; // dim of orthogonal vecs
int<lower=1, upper=K> N; // # orthogonal vecs
}
parameters {
matrix[K, N] alpha;
}
transformed parameters {
matrix[K, N] beta = qr_thin_Q(alpha); // uniform orthonormal
}
model {
to_vector(alpha) ~ normal(0, 1);
}
generated quantities {
matrix[N, N] I = beta' * beta; // identity
}
The generated quantities block is just a check on orthornormality, which we can see passes by running in CmdStanPy:
import cmdstanpy as csp
model = csp.CmdStanModel(stan_file = 'ortho-qr.stan')
fit = model.sample(data = {'N': 3, 'K': 5})
fit.summary()
The output verifies the orthonormality:
Mean MCSE StdDev 5% 50% 95% N_Eff N_Eff/s R_hat
lp__ -7.530620 0.072056 2.790200e+00 -1.272850e+01 -7.157700 -3.645840e+00 1499.44 6021.85 1.001490
alpha[1,1] 0.001198 0.011320 9.979420e-01 -1.625390e+00 -0.022932 1.667450e+00 7771.42 31210.50 0.999248
alpha[1,2] -0.013111 0.010082 1.022400e+00 -1.696020e+00 -0.022433 1.638130e+00 10284.20 41302.00 0.999890
alpha[1,3] 0.008322 0.010008 9.599100e-01 -1.543770e+00 0.019685 1.580010e+00 9200.09 36948.10 0.999080
alpha[2,1] 0.006395 0.011304 9.983870e-01 -1.629180e+00 0.002607 1.669930e+00 7800.70 31328.10 0.999640
alpha[2,2] -0.000308 0.011192 1.042710e+00 -1.737230e+00 -0.010869 1.695580e+00 8679.69 34858.20 0.999508
alpha[2,3] 0.000019 0.010190 1.007820e+00 -1.663040e+00 0.010653 1.630600e+00 9782.29 39286.30 0.999385
alpha[3,1] -0.007449 0.010033 1.002010e+00 -1.656830e+00 -0.029778 1.664730e+00 9974.33 40057.60 0.999275
alpha[3,2] 0.008401 0.010177 9.883600e-01 -1.615680e+00 0.003338 1.640160e+00 9431.65 37878.10 0.999213
alpha[3,3] -0.005065 0.009752 9.922030e-01 -1.642460e+00 0.003125 1.620050e+00 10351.10 41570.50 0.999351
alpha[4,1] -0.004184 0.010482 9.996200e-01 -1.645750e+00 -0.007468 1.662770e+00 9095.02 36526.20 0.999319
alpha[4,2] 0.000769 0.009982 1.000090e+00 -1.690680e+00 -0.003663 1.641520e+00 10037.90 40313.00 0.999094
alpha[4,3] 0.000399 0.010426 9.956700e-01 -1.655840e+00 -0.006336 1.657750e+00 9120.13 36627.00 0.999352
alpha[5,1] -0.007562 0.010794 1.025450e+00 -1.699320e+00 -0.007088 1.698940e+00 9025.03 36245.10 0.999235
alpha[5,2] 0.006172 0.010159 1.007470e+00 -1.659660e+00 0.012777 1.659950e+00 9834.58 39496.30 0.999584
alpha[5,3] 0.003730 0.010721 9.896660e-01 -1.637200e+00 0.002523 1.617940e+00 8521.89 34224.40 0.999434
beta[1,1] -0.003341 0.005048 4.434700e-01 -7.390780e-01 -0.011343 7.318480e-01 7718.13 30996.50 0.999193
beta[1,2] -0.006083 0.004530 4.551550e-01 -7.330580e-01 -0.015444 7.420130e-01 10093.40 40535.90 1.000340
beta[1,3] 0.001120 0.005151 4.476690e-01 -7.311930e-01 0.000670 7.415760e-01 7553.58 30335.70 0.999357
beta[2,1] 0.006547 0.004872 4.449230e-01 -7.086050e-01 0.001480 7.245850e-01 8338.67 33488.60 0.999627
beta[2,2] 0.004741 0.005034 4.523190e-01 -7.313870e-01 0.013392 7.362360e-01 8073.45 32423.50 0.999309
beta[2,3] 0.000025 0.005196 4.499990e-01 -7.325200e-01 0.000652 7.365650e-01 7501.13 30125.00 0.999172
beta[3,1] -0.001154 0.004481 4.483140e-01 -7.332880e-01 -0.015290 7.465450e-01 10009.50 40198.80 0.999285
beta[3,2] 0.004010 0.004769 4.413500e-01 -7.192240e-01 0.003180 7.160750e-01 8566.30 34402.80 0.999191
beta[3,3] 0.004710 0.005236 4.452560e-01 -7.361150e-01 0.014167 7.171020e-01 7231.05 29040.40 0.999591
beta[4,1] -0.003018 0.004727 4.440650e-01 -7.174590e-01 -0.003262 7.354320e-01 8825.57 35444.10 0.999422
beta[4,2] 0.000092 0.004798 4.421800e-01 -7.366790e-01 -0.002647 7.219420e-01 8493.95 34112.30 0.999272
beta[4,3] 0.001037 0.005345 4.522030e-01 -7.334000e-01 0.003401 7.324580e-01 7156.91 28742.60 0.999687
beta[5,1] -0.003491 0.004682 4.553810e-01 -7.452820e-01 -0.002727 7.430380e-01 9459.52 37990.00 0.999317
beta[5,2] -0.002081 0.004678 4.450820e-01 -7.287510e-01 0.003087 7.175080e-01 9052.62 36355.90 0.999960
beta[5,3] -0.000743 0.005355 4.411090e-01 -7.207330e-01 -0.002116 7.162440e-01 6785.59 27251.40 0.999485
I[1,1] 1.000000 NaN 6.000000e-16 1.000000e+00 1.000000 1.000000e+00 NaN NaN NaN
I[1,2] -0.000000 0.000000 0.000000e+00 -1.000000e-16 0.000000 1.000000e-16 3865.72 15525.00 0.999975
I[1,3] 0.000000 0.000000 0.000000e+00 -1.000000e-16 0.000000 1.000000e-16 3832.17 15390.20 1.000350
I[2,1] -0.000000 0.000000 0.000000e+00 -1.000000e-16 0.000000 1.000000e-16 3865.72 15525.00 0.999975
I[2,2] 1.000000 NaN 6.000000e-16 1.000000e+00 1.000000 1.000000e+00 NaN NaN NaN
I[2,3] -0.000000 0.000000 1.000000e-16 -1.000000e-16 0.000000 1.000000e-16 3792.03 15229.00 1.000110
I[3,1] 0.000000 0.000000 0.000000e+00 -1.000000e-16 0.000000 1.000000e-16 3832.17 15390.20 1.000350
I[3,2] -0.000000 0.000000 1.000000e-16 -1.000000e-16 0.000000 1.000000e-16 3792.03 15229.00 1.000110
I[3,3] 1.000000 NaN 6.000000e-16 1.000000e+00 1.000000 1.000000e+00 NaN NaN NaN
I’m just getting started on this, so I’d be happy to get suggestions on alternative or better ways to generate uniform distributions over the Stiefel manifold (the set of orthonormal matrices). The QR is supposedly more stable than just applying Gram-Schmidt directly. There’s also the option to use Givens rotations as @Arya et al. did.
Arya A. Pourzanjani, Richard M. Jiang, Brian Mitchell, Paul J. Atzberger, and Linda R. Petzold. 2020. Bayesian Inference over the Stiefel Manifold via the Givens Representation. Bayesian Analysis.