PyStan license

Any word on when the PyStan rewrite will be done? I sure could use a BSD-compatible Python interface for Stan around now…

A significant amount of work has been done. There may be an alpha
version before the end of the month.

Thanks Allen! I really appreciate that.

I strongly second the sentiment. I would not spend time using PyStan under GPLv3 restrictions.

Is there still a reason you don’t want to release the current PyStan under BSD-3?

Because several files in PyStan are (transparently) derived from RStan
(e.g., stan_fit.hpp), which were and still are available under the GPL.

I suppose it all could be relicensed under BSD-3 if the version of RStan
from which PyStan took the files were publicly released under a BSD-3
license.

I’ll have a (minimal) working example of PyStan 3 out soon. I’m just
finishing writing the documentation and bringing the tests over. I had
hoped to get this done before the semester started.

If you’re really going to release PyStan 3 ahead of the rest of Stan 3, then that’s a fine way to go and will render this discussion moot.

Did I not forward you the message from the NumFOCUS IP lawyer? I’ll do that now.

This is completely irrelevant. A copyright holder may release their IP under multiple licenses without conflict. For example, we did a GPL release of Stan for JStatSoft before they changed their mind about IP requirements and allowed BSD submissions.

According to the lawyer, that’s not required, as translated code is not considered a derived product. I think the reasoning here is that code is only copyrighted, whereas algorithms require patents.

Hi Bruce,

Would you mind elaborating a bit? I would also be interested in knowing if you work in industry or academia.

Thanks

Corey, could you elaborate a bit on your need for a permissive license? I’d also ask the same question I asked @bhomass: are you in academia or industry?

I used to be in academia. I am now in industry. Why does that matter?

What’s the specific problem with GPLv3 for you? Would LGPLv3 or MPLv2 be
less of a problem?

I ask about academia vs. industry because I’m curious if there are any
people working in academia who need PyStan to be permissively licensed.

I also work in industry.

(At this point we’ve gone ahead with the original plan: wrapper code around CmdStan. It requires saving and reading files inside a Docker container; I’m given to understand that’s slow but I don’t think it’s a bottleneck right now. I’d still be happy to have a permissive python interface though; it could very well come in handy at some unpredictable point in the future.)

I used to be in academia, then did a 15-year stint in industry, then landed back in academia. I do not want to favor academia and/or discriminate against industry. For one thing, academia is a big business! We charge $50K/year tuition for our “product” (I’m guessing U. Indiana charges less!). How about charging private universities, too? They already get roughly 80% of the grant money we raise (we raise $2.06 for every $1.00 in salary we can pay, but that includes fringe, where fringe costs are less than we raise). So we know they have a few $M floating around to pay for licenses.

I don’t think the Stan project should be releasing Python interfaces under GPL. I think it should be licensed as permissively as possible to garner the widest possible usage.

If Allen insists on keeping the existing PyStan 3 under GPL of some form, I will lobby for writing a new Python interface with a BSD license. We can’t force Allen to relicense PyStan as it is, for two reasons. First, he (or most likely Dartmouth and U. Indiana) owns the copyright to his code. Second, we don’t even have an appropriate governance structure in place to hold something like a vote on whether to allow a PyStan 3 going forward with a BSD license.

The individual licensing isn’t a problem with a more permissive license, as the license gives a user all the rights they need. But GPL and copyleft is far more sticky—you can’t then make something that’s not GPL-ed (that’s the whole point).

You’re right: I/O is not going to be the bottleneck in all but large-scale linear regression MLEs, where we can fit faster than we can do our current slow I/O. Your approach has the advantage of not requiring storing the draws in memory as they are being produced. Storing the draws turns out to be a substantial bottleneck for RStan, which I believe requires twice the eventual amount of memory needed for the draws. I don’t know about PyStan.
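To make the memory point concrete, here is a minimal sketch of reading draws from a CmdStan output CSV one row at a time, so that only running sums are held in memory rather than the full set of draws. It assumes the usual CmdStan CSV layout (comment lines starting with `#`, one header row, then one row per draw). The function name `running_means` is made up for illustration and is not part of any Stan interface.

```python
import csv
import io


def running_means(lines):
    """Return per-column means from CmdStan-style CSV lines without storing draws.

    `lines` is any iterable of text lines, e.g. an open file object.
    This is a hypothetical helper, not part of CmdStan or PyStan.
    """
    # Drop blank lines and '#' comment lines, which can appear anywhere in the file.
    data_lines = (ln for ln in lines if ln.strip() and not ln.startswith("#"))
    rows = csv.reader(data_lines)
    header = next(rows)  # first non-comment line holds the column names
    sums = [0.0] * len(header)
    count = 0
    for row in rows:
        # Accumulate each column; the row itself is discarded afterwards.
        for i, value in enumerate(row):
            sums[i] += float(value)
        count += 1
    if count == 0:
        raise ValueError("no draws found")
    return {name: total / count for name, total in zip(header, sums)}


# Usage with an in-memory stand-in for an output file:
sample = io.StringIO("# model = example\nlp__,theta\n-1.0,0.5\n-3.0,1.5\n")
print(running_means(sample))
```

In a real wrapper the same function could be handed an open file object for the output file CmdStan wrote. The trade-off is the one mentioned above: file I/O is slower than passing draws in memory, but memory use stays flat regardless of how many draws there are.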

I’m afraid our hands are tied with RStan in the sense that it links directly to R and thus is required to be GPL-ed (and even if we could BSD the RStan code, the combined product of R plus RStan is going to be copyleft anyway due to R’s copyleft).

P.S. I have talked to Allen about this off list. He’s OK with there being more than one Python interface to Stan.

I’m definitely on board with switching away from the GPL. Initially I
had planned to use a “partial” copyleft license such as MPL (or LGPL).
You can combine MPL licensed code with a commercial product. ZeroMQ
(used by Jupyter and many others) uses such a license.

I’ll likely use BSD2 (or ISC) for PyStan 3 because Bob and some people
outside the academic and non-profit sector seem to care deeply about this.

It would be nice to have a broader survey of people’s views.

Also – I think having a competitor to PyStan would be great since that
would mean at least one (maybe more) new developers working in the Stan
ecosystem.

These licenses are not at all about commercial vs. non-commercial. I think they’re often assumed to be that because commercial software houses tend to shy away from GPL-ed code because they don’t want to release their own code under the GPL.

Nothing prevents linking to GPL code in commercial software products that redistribute the GPL-ed code. It just requires the commercial product to be released under copyleft. My Blu-ray player runs on GPL code and includes a copy of the GPL!

If users are not distributing code, then they can do whatever they want with GPL-ed code in-house. Furthermore, there’s nothing preventing someone from building a server on top of GPL-ed code and selling access to the server without distributing their source. That doesn’t violate the GPL either as long as they’re not redistributing the GPL-ed code. This use case motivated the even-stronger copyleft of the AGPL.

Eigen’s released under MPLv2. It was originally LGPL but they went through the laborious process of changing it. I’m not sure why.

So wherever you get Stan, you’re getting something with an MPL component.

My concern isn’t so much avoiding any particular license, but in making it easy for everyone to use Stan.

What do you see as the advantage of something like MPL over BSD?

That would be awesome!

While it’d be great to have more developers working on Python plus Stan, I think it’d be more productive if they worked together, unless there were two radically different designs that were arguably both useful given the existence of the other.

Bob’s use case for just releasing under BSD is clear: there are people in industry stuck with industry lawyers who don’t want to deal with the complexity of using GPL code, so they just ban GPL code (even if there aren’t any real implications to using the GPL code internally). We clearly can do this as Bob suggests, so what are your reasons for not wanting to release under BSD (let’s say in parallel to GPL)?

I’m all for rebuilding PyStan/rstan from scratch once we move more things into core services code and have a shared binary format for interfaces and maybe some shared config code and shared writers. Practically I don’t know what sort of timeframe this can happen on so I’m going to continue working on IO and the math lib till then. I’d rather have it be a joint effort because we don’t have the people power to go off making multiple interfaces and having them come out as solid maintainable code.

Hi Allen, Bob, and others. I was wondering, have there been any updates on this since September? I can’t use or distribute any GPL dependencies in the projects I’m working on, but obviously I would love to use and support this project.