I used to be in academia, then did a 15-year stint in industry, then landed back in academia. I do not want to favor academia and/or discriminate against industry. For one thing, academia is a big business! We charge $50K/year tuition for our “product” (I’m guessing U. Indiana charges less!). How about charging private universities, too? They already get roughly 80% of the grant money we raise (we raise $2.06 for every $1.00 in salary we can pay, but that includes fringe, where fringe costs are less than we raise). So we know they have a few $M floating around to pay for licenses.
I don’t think the Stan project should be releasing Python interfaces under GPL. I think it should be licensed as permissively as possible to garner the widest possible usage.
If Allen insists on keeping the existing PyStan 3 under GPL of some form, I will lobby for writing a new Python interface with a BSD license. We can’t force Allen to relicense PyStan as it is, for two reasons. First, he (or more likely Dartmouth and U. Indiana) owns the copyright to his code. Second, we don’t even have an appropriate governance structure in place to hold something like a vote on whether to allow PyStan 3 to go forward with a BSD license.
The individual licensing isn’t a problem with a more permissive license, as the license gives a user all the rights they need. But GPL and copyleft are far stickier: once code is GPL-ed, you can’t then build something from it that’s not GPL-ed (that’s the whole point).
You’re right: I/O is not going to be the bottleneck except in large-scale linear regression MLEs, where we can fit faster than we can do our current slow I/O. Your approach has the advantage of not requiring the draws to be stored in memory as they are being produced. Storing the draws turns out to be a substantial bottleneck for RStan, which I believe requires twice the eventual amount of memory needed for the draws. I don’t know about PyStan.
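To make the memory point concrete, here is a minimal sketch of writing draws out as they are produced rather than accumulating them first. It is not how PyStan, RStan, or Stan itself handles output; `fake_sampler` and `stream_draws` are hypothetical names made up for illustration, and the "sampler" just emits random numbers.

```python
import csv
import io
import random
from typing import Iterator, List, TextIO


def fake_sampler(num_draws: int, num_params: int, seed: int = 0) -> Iterator[List[float]]:
    """Hypothetical stand-in for a sampler: yields one draw at a time."""
    rng = random.Random(seed)
    for _ in range(num_draws):
        yield [rng.gauss(0.0, 1.0) for _ in range(num_params)]


def stream_draws(draws: Iterator[List[float]], out: TextIO) -> int:
    """Write each draw as soon as it arrives and return the number written.

    Only the current draw is held in memory, so peak memory does not
    grow with the number of draws.
    """
    writer = csv.writer(out)
    count = 0
    for draw in draws:
        writer.writerow(draw)
        count += 1
    return count


if __name__ == "__main__":
    # An in-memory buffer stands in for an output file in this sketch.
    buffer = io.StringIO()
    written = stream_draws(fake_sampler(num_draws=1000, num_params=3), buffer)
    print(f"wrote {written} draws")
```

The trade-off is the one discussed above: streaming keeps memory flat but makes every draw pay the I/O cost, which only matters when fitting is faster than writing.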
I’m afraid our hands are tied with RStan in the sense that it links directly to R and thus is required to be GPL-ed (and even if we could BSD the RStan code, the combined product of R plus RStan is going to be copyleft anyway due to R’s copyleft).