# QR decomposition inconsistent with QR decomposition in R

I’ve been having some problems with QR decomposition in Stan, and was trying to debug them by comparing the fat QR decomposition in Stan with the thin version in R.

I have a 16 \times 16 matrix of rank 14. Computing the decomposition of this matrix with the algorithm in Stan and with the one in R gives different results for the Q matrix. I want to get sparse vectors that span the null space of the matrix. The final two columns of the Q matrix should span this null space, and I assume this should hold for both the fat and the thin version of the algorithm. But if I combine the final two columns from both algorithms, the resulting matrix has rank 3, whereas if the results were equivalent I would expect rank 2.

What am I missing here? Apologies in advance if this is a stupid question or posted in the wrong place.

```r
library(rstan)
expose_stan_functions('debug.stan')
# Construct a matrix B to illustrate the behaviour; B should be a 16 x 16 matrix with rank 14
B <- t(alter_matrix(construct_matrix(0), c(1, 9)))
Q <- qr.Q(qr(B))        # QR decomposition with R
Q_stan <- qr_Q_stan(B)  # QR decomposition with Stan
# Several of the final columns are different:
apply(Q - Q_stan, 2, function(x) sum(abs(x)) < 1e-14)
# The final two columns of the Q matrix should span the null space of B.
# If both algorithms span the same space, the rank of X should be 2, but it gives 3:
X <- cbind(Q[, 15:16], Q_stan[, 15:16])
print(qr(X)$rank)
```


debug.stan (1.5 KB)

If you get the \mathbf{R} matrix from Stan, its diagonal is

```
1.732051
1.994785
1.466913
1.524785
0.9063695
0.9795861
0.9450767
0.9547696
0
0.3301462
0.2759354
0.2750403
0.1516811
0.1388168
0.0723701
0
```

My guess is that these are just numerical differences from decomposing an ill-conditioned matrix. Neither Stan (Eigen) nor R (LAPACK) uses pivoting, so the rank calculations are not too reliable. The only intentional difference is that Stan makes the diagonal elements of \mathbf{R} non-negative.
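To illustrate what pivoting changes (a sketch with a small synthetic matrix, not the `debug.stan` example), R conveniently exposes both behaviours: the default `qr()` (LINPACK) pivots columns, while `qr(..., LAPACK = TRUE)` does not. Without pivoting, the near-zero diagonal entry of \mathbf{R} sits wherever the dependent column happens to be; with pivoting, it is pushed to the end, which is what makes the rank estimate usable.

```r
set.seed(1)
M <- matrix(rnorm(5 * 3), 5, 3)
# 5 x 4 matrix of rank 3; the third column is a linear combination of the first two
B <- cbind(M[, 1], M[, 2], M[, 1] + M[, 2], M[, 3])

# Without pivoting (LAPACK), the near-zero diagonal entry of R appears at
# position 3, where the dependent column sits:
diag(qr.R(qr(B, LAPACK = TRUE)))

# With column pivoting (R's default LINPACK routine), the dependent column is
# moved to the end, so the near-zero entry appears last and the rank is detected:
diag(qr.R(qr(B)))
qr(B)$rank
```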

Thanks, that does help. I should be able to identify the columns corresponding to the zeros and use those. I think the R LINPACK implementation does use pivoting, which explains why, when I look at the equivalent diagonal of \mathbf{R} from R, the zeros appear at the end of the vector.
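A sketch of that idea, using a small symmetric rank-deficient matrix rather than the one from `debug.stan`: with R's pivoted `qr()`, the zero diagonal entries of \mathbf{R} land at the end, and the columns of the complete Q matching those zeros span the orthogonal complement of the column space (which, for a symmetric matrix, is the null space itself).

```r
set.seed(42)
M <- matrix(rnorm(6 * 4), 6, 4)
A <- M %*% t(M)                  # 6 x 6 symmetric matrix of rank 4
qr_A <- qr(A)                    # R's default (LINPACK) QR with column pivoting
Qfull <- qr.Q(qr_A, complete = TRUE)

# Identify the columns corresponding to the (near-)zero diagonal entries of R;
# pivoting has pushed them to the end:
zero_cols <- which(abs(diag(qr.R(qr_A))) < 1e-10)
N <- Qfull[, zero_cols, drop = FALSE]

# The columns of N are orthogonal to the range of A; since A is symmetric,
# they span its null space, so A %*% N is (numerically) zero:
max(abs(A %*% N))
```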