I was experimenting with the QR-reparameterized linear regression from the case study, increasing the size of the synthesized dataset, and got a bad_alloc on the line qr_Q(X')[, 1:M] while the problem was still small compared to my RAM (N = 30,000 samples, M = 150 parameters). What is going on? Is this line allocating an N x N matrix?
Yeah, qr_Q and qr_R in Stan Math do a fat QR decomposition (for an N x M input, Q comes back as the full N x N matrix), which can cause memory exhaustion for large matrices. It is better to do the QR decomposition in R, which does a thin QR decomposition, and pass in the Q and R matrices.
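To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes dense storage with 8-byte doubles and ignores any workspace the decomposition itself needs, so treat the figures as a lower bound rather than an exact accounting. The helper name dense_matrix_bytes is made up for illustration.

```python
def dense_matrix_bytes(rows, cols, bytes_per_entry=8):
    """Bytes needed to store a dense rows x cols matrix (assumes 8-byte doubles)."""
    return rows * cols * bytes_per_entry


N, M = 30_000, 150  # sizes from the original post

fat_q = dense_matrix_bytes(N, N)   # full N x N Q from a fat QR
thin_q = dense_matrix_bytes(N, M)  # N x M Q from a thin QR

print(f"fat Q:  {fat_q / 1e9:.1f} GB")   # fat Q:  7.2 GB
print(f"thin Q: {thin_q / 1e6:.1f} MB")  # thin Q: 36.0 MB
```

So the full Q alone is on the order of 7 GB before slicing down to the first M columns, while the thin Q that is actually used is a few tens of megabytes.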
Ok thanks! I will try that.
Any chance we could add the thin QR to Stan?
Or add a note to the documentation/case studies about doing it in the surrounding code?
It is in the Stan User Manual; just search for "fat QR". Unless Eigen implements a thin QR decomposition, there is not much chance that Stan will have one.
OK, that makes sense. As for documentation, I just rechecked stan-reference-2.17.0.pdf and it does not mention computing the QR outside of Stan. Perhaps a line in “9.2. The QR Reparameterization” could explain that the “Stan implementation is memory hungry, so consider precomputing Q and R and then using them as data.” I think that would have made things clearer to me.
Or, if it doesn’t belong in the reference, then perhaps another example at the end of the wonderful case study showing how it can be done. It already goes through 3 versions of this model, walking through the advantages of each; maybe one more to show how to scale it to real-sized problems?
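For illustration, precomputing the thin factors outside Stan might look roughly like the sketch below. It uses Python with NumPy (numpy.linalg.qr with mode="reduced") as a stand-in for the surrounding code. In R, the analogous calls would be qr.Q(qr(X)) and qr.R(qr(X)), which return the thin factors by default. The sizes are scaled down so it runs quickly, the design matrix is random placeholder data, and the stan_data keys are hypothetical names that would have to match whatever the Stan program's data block declares.

```python
import numpy as np

rng = np.random.default_rng(0)

# Small stand-in sizes; the thread's problem was N = 30,000, M = 150.
N, M = 2_000, 15
X = rng.normal(size=(N, M))  # placeholder N x M design matrix

# Thin (reduced) QR: Q is N x M and R is M x M, so no N x N matrix is formed.
Q, R = np.linalg.qr(X, mode="reduced")

print(Q.shape, R.shape)

# Hypothetical data dictionary: Q and R are passed in as data instead of
# being computed with qr_Q / qr_R inside the Stan program.
stan_data = {"N": N, "M": M, "Q": Q, "R": R}
```

Any rescaling of Q and R that the model applies would still need to be done, either here or in the Stan program.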