Sorry for the confusion.
First, I was just verifying that we'd need external GPU libraries, too. Those external libs can be a huge pain in R, which won't let us bundle them with RStan unless they're very small. Having multiple such external libs makes installation painful, and installation is already a big enough pain point with Stan.
For 32-bit vs. 64-bit: TensorFlow concentrates primarily on 32-bit arithmetic, from what I can see of Edward and elsewhere and from asking around. MXNet concentrates primarily on 64-bit arithmetic, which is why it's relevant to us: we want to do things like sparse Cholesky factorization efficiently, and that's challenging if not impossible in single precision.
We're only talking to the MXNet folks so far about adding sparse matrix functionality. It sounds like it'll be orthogonal to whatever we do with you guys. But if they use CUDA and you use OpenCL, it'll add yet another dependency, and it'll probably restrict us to using either sparse or dense operations without mixing them, if mixing is even possible.
Everyone keeps telling us all these dependencies are simple, but they've proven to be a huge pain for us to manage through R and Python. I don't know that we'll even try to get GPUs working through anything other than our CmdStan interface on Linux.
Hopefully everyone like you who knows more about this than me will be in on any decision to consolidate efforts, but that’s a long way off.
No promises on any long-term support. We just don't have the staff to make those kinds of commitments. You're going to be the expert in Stan math and GPU code, so I don't know what you're expecting from the other Stan devs here. We will continue to answer questions about the math lib for everyone.