Who wrote the QR section in the manual? I’m guessing either Ben or Jonah?
The manual suggests scaling the Q and R matrices by sqrt(N - 1). Q is already orthonormal, so this scaling gives its columns approximately unit standard deviation, but for unit scaling don’t we want to scale by the full N? This also seems to be the case empirically, as demonstrated in the case study.
Any thoughts on what’s causing the correlations in the transformed slopes? I thought it was the weakly informative prior on the nominal slopes, but I can’t seem to recover an isotropic posterior even with a uniform prior on the slopes. This problem should be simple enough that an isotropic posterior is achievable, no?
(1) Me
(2) My thinking was that if Q* = Q * sqrt(N - 1), then the covariance matrix of Q* is the identity matrix (assuming X was centered first, so the columns of Q have mean zero), meaning each column of Q* has unit standard deviation. So, the units of the coefficients on Q* would be standard deviations. I don’t think that helps all that much in terms of formulating a prior on the coefficients with respect to Q* or X, though. Scaling by N is another option. Not scaling at all seems to be a bad idea for large N.
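A minimal NumPy sketch of the point above (not from the manual; the variable names are mine): after centering X, the columns of Q* = Q * sqrt(N - 1) have mean zero, unit sample standard deviation, and zero correlation, while R* = R / sqrt(N - 1) compensates so the product still reproduces the centered design matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 500, 3
X = rng.normal(size=(N, K))
Xc = X - X.mean(axis=0)          # center each column first

Q, R = np.linalg.qr(Xc)          # thin QR: Q is N x K with orthonormal columns
Q_star = Q * np.sqrt(N - 1)      # scaled Q: columns get unit sample sd
R_star = R / np.sqrt(N - 1)      # compensating scale so Q_star @ R_star == Xc

# columns of Q_star: mean ~0, sd ~1, zero correlation
print(np.allclose(Q_star.std(axis=0, ddof=1), 1.0))
print(np.allclose(np.corrcoef(Q_star, rowvar=False), np.eye(K)))
print(np.allclose(Q_star @ R_star, Xc))
```

The centering step matters: if the columns of Q are not mean zero, the sample covariance of Q* is no longer the identity, which ties in with point (3) below.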
(3) Did you center both columns of X before decomposing it?
One thing that needs to be said, which I didn’t say in the manual, is that under the QR decomposition the last coefficient on Q (or Q*) is proportional to the last coefficient on X. So, if you only care about, or have informative prior information about, one coefficient, it should be put last in X, and then you can rescale your prior accordingly.
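A quick sketch of why that holds (my own illustration, not from the manual): if beta are the slopes on X and theta = R beta are the slopes on Q, then because R is upper triangular, only the bottom-right entry of R touches the last slope, so theta_K = R[K, K] * beta_K.

```python
import numpy as np

rng = np.random.default_rng(1)
N, K = 200, 4
X = rng.normal(size=(N, K))
X -= X.mean(axis=0)

Q, R = np.linalg.qr(X)
beta = rng.normal(size=K)        # nominal slopes on X
theta = R @ beta                 # slopes on Q: Q @ theta == X @ beta

# R is upper triangular, so only R[-1, -1] multiplies the last slope
print(np.isclose(theta[-1], R[-1, -1] * beta[-1]))
print(np.allclose(Q @ theta, X @ beta))
```

So a prior on beta_K translates into a prior on theta_K by scaling by R[K, K] (or R*[K, K] under the scaled decomposition).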
Also, I have since developed the hunch that a polar decomposition would be better than a QR decomposition.
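For reference, a polar decomposition X = U P (U with orthonormal columns, P symmetric positive semi-definite) can be computed from the SVD; this sketch is just the standard construction, not anything endorsed by the manual:

```python
import numpy as np

rng = np.random.default_rng(2)
N, K = 100, 3
X = rng.normal(size=(N, K))
X -= X.mean(axis=0)

# polar decomposition via SVD: X = W diag(s) Vt  =>  X = U P
W, s, Vt = np.linalg.svd(X, full_matrices=False)
U = W @ Vt                       # orthonormal-column factor (N x K)
P = Vt.T @ np.diag(s) @ Vt       # symmetric positive semi-definite factor (K x K)

print(np.allclose(U @ P, X))
print(np.allclose(U.T @ U, np.eye(K)))
```

Unlike the Q factor of a QR decomposition, U is the closest orthonormal-column matrix to X in the Frobenius norm, which is one reason it might be an appealing alternative.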