PyStan license


What’s the specific problem with GPLv3 for you? Would LGPLv3 or MPLv2 be
less of a problem?


I ask about academia vs. industry because I’m curious if there are any
people working in academia who need PyStan to be permissively licensed.


I also work in industry.

(At this point we’ve gone ahead with the original plan: wrapper code around CmdStan. It requires saving and reading files inside a Docker container; I’m given to understand that’s slow, but I don’t think it’s a bottleneck right now. I’d still be happy to have a permissive Python interface, though; it could very well come in handy at some unpredictable point in the future.)
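For anyone curious what the file-reading half of such a wrapper involves, here is a minimal sketch rather than the actual wrapper code. It assumes the output follows CmdStan's CSV convention of `#`-prefixed comment lines, a header row, and then one numeric row per draw. The names `iter_draws`, `running_mean`, and `SAMPLE` are hypothetical, as is the sample data.

```python
import csv
import io
from typing import Dict, Iterable, Iterator


def iter_draws(lines: Iterable[str]) -> Iterator[Dict[str, float]]:
    """Yield one draw at a time from CmdStan-style CSV text.

    Assumption: '#'-prefixed lines are comments, the first remaining
    line is the header, and every later line is a row of numbers.
    """
    # Filter comments and blank lines lazily so the whole file is
    # never held in memory at once.
    data_lines = (ln for ln in lines if ln.strip() and not ln.startswith("#"))
    reader = csv.reader(data_lines)
    header = next(reader, None)
    if header is None:
        return  # no header row, so there are no draws to yield
    for row in reader:
        yield {name: float(value) for name, value in zip(header, row)}


def running_mean(lines: Iterable[str], name: str) -> float:
    """Average one column while keeping only a sum and a count."""
    total, count = 0.0, 0
    for draw in iter_draws(lines):
        total += draw[name]
        count += 1
    return total / count if count else float("nan")


# Hypothetical sample output, made up for illustration only.
SAMPLE = """# model = example (hypothetical)
lp__,accept_stat__,theta
-7.5,0.9,0.25
# Adaptation terminated
-6.5,1.0,0.75
"""

if __name__ == "__main__":
    print(running_mean(io.StringIO(SAMPLE), "theta"))
```

Because `iter_draws` is a generator, the same code works on an open file handle in place of `io.StringIO`, and summaries like the running mean never need the full set of draws in memory.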


I used to be in academia, then did a 15-year stint in industry, then landed back in academia. I do not want to favor academia and/or discriminate against industry. For one thing, academia is a big business! We charge $50K/year tuition for our “product” (I’m guessing U. Indiana charges less!). How about charging private universities, too? They already get roughly 80% of the grant money we raise (we raise $2.06 for every $1.00 in salary we can pay, but that includes fringe, where fringe costs are less than we raise). So we know they have a few $M floating around to pay for licenses.

I don’t think the Stan project should be releasing Python interfaces under GPL. I think it should be licensed as permissively as possible to garner the widest possible usage.

If Allen insists on keeping the existing PyStan 3 under GPL of some form, I will lobby for writing a new Python interface with a BSD license. We can’t force Allen to relicense PyStan as it is, for two reasons. First, he (or more likely Dartmouth and U. Indiana) owns the copyright to his code. Second, we don’t even have an appropriate governance structure in place to hold something like a vote on whether to allow PyStan 3 to go forward with a BSD license.

The individual licensing isn’t a problem with a more permissive license, as the license gives a user all the rights they need. But GPL and copyleft are far stickier: you can’t then make something that’s not GPL-ed (that’s the whole point).

You’re right: I/O is not going to be the bottleneck except in large-scale linear regression MLEs, where we can fit faster than we can do our current slow I/O. Your approach has the advantage of not requiring storing the draws in memory as they are being produced. Storing the draws turns out to be a substantial bottleneck for RStan, which I believe requires twice the eventual amount of memory needed for the draws. I don’t know about PyStan.

I’m afraid our hands are tied with RStan in the sense that it links directly to R and thus is required to be GPL-ed (and even if we could BSD the RStan code, the combined product of R plus RStan is going to be copyleft anyway due to R’s copyleft).


P.S. I have talked to Allen about this off list. He’s OK with there being more than one Python interface to Stan.


I’m definitely on board with switching away from the GPL. Initially I
had planned to use a “partial” copyleft license such as MPL (or LGPL).
You can combine MPL-licensed code with a commercial product. ZeroMQ
(used by Jupyter and many others) uses such a license.

I’ll likely use BSD2 (or ISC) for PyStan 3 because Bob and some people
outside the academic and non-profit sector seem to care deeply about this.

It would be nice to have a broader survey of people’s views.

Also – I think having a competitor to PyStan would be great since that
would mean at least one (maybe more) new developers working in the Stan


These licenses are not at all about commercial vs. non-commercial. I think they’re often assumed to be that because commercial software houses tend to shy away from GPL-ed code because they don’t want to release their own code under the GPL.

Nothing prevents linking to GPL code in commercial software products that redistribute the GPL-ed code. It just requires the commercial product to be released under copyleft. My Blu-ray player runs on GPL code and includes a copy of the GPL!

If users are not distributing code, then they can do whatever they want with GPL-ed code in-house. Furthermore, there’s nothing preventing someone from building a server on top of GPL-ed code and selling access to the server without distributing their source. That doesn’t violate the GPL either as long as they’re not redistributing the GPL-ed code. This use case motivated the even-stronger copyleft of the AGPL.

Eigen’s released under MPLv2. It was originally LGPL but they went through the laborious process of changing it. I’m not sure why.

So whenever you get Stan, you’re getting something with an MPL component.

My concern isn’t so much avoiding any particular license, but making it easy for everyone to use Stan.

What do you see as the advantage of something like MPL over BSD?


That would be awesome!

While it’d be great to have more developers working on Python plus Stan, I think it’d be more productive if they worked together, unless there were two radically different designs that were arguably both useful given the existence of the other.


Bob’s use case for just releasing under BSD is clear: there are people in industry stuck with industry lawyers who don’t want to deal with the complexity of using GPL code, so they just ban GPL code (even if there aren’t any real implications to using the GPL code internally). We clearly can do this as Bob suggests, so what are your reasons for not wanting to release under BSD (let’s say in parallel to GPL)?

I’m all for rebuilding PyStan/rstan from scratch once we move more things into core services code and have a shared binary format for interfaces and maybe some shared config code and shared writers. Practically I don’t know what sort of timeframe this can happen on so I’m going to continue working on IO and the math lib till then. I’d rather have it be a joint effort because we don’t have the people power to go off making multiple interfaces and having them come out as solid maintainable code.


Hi Allen, Bob, and others. I was wondering, have there been any updates on this since September? I can’t use or distribute any GPL dependencies in the projects I’m working on, but obviously I would love to use and support this project.


I haven’t been following progress with PyStan 3, so I don’t know. @ariddell will know.


So you’re asking when PyStan 3 will be released? I think it’ll get done
by the end of summer 2018.