RStan & StanHeaders merger? A heads-up!

Hi everyone!

As outlined previously, updating RStan on CRAN is difficult because StanHeaders (which includes Stan-math) and RStan (which includes the parser) live in separate packages. In the long run it would be much preferable to have everything in a single R package, RStan. However, this would leave StanHeaders essentially defunct, which we can’t afford: all depending packages would have to change, and that is unlikely given the inertia in all of this.

Instead, we are considering making StanHeaders a “proxy” package which essentially forwards the includes of Stan-math to the respective includes contained in RStan. This would mean that in the future we have

  1. StanHeaders with a dependency on RStan
  2. RStan without a dependency on StanHeaders

A package that includes a given file xyz.hpp from StanHeaders would then get a forwarded xyz.hpp file living inside RStan.
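
To make the forwarding idea concrete, here is a minimal sketch of what a proxy header could contain, written as a small C++ helper that generates the header text. The function name make_proxy_header and the directory name rstan_include are hypothetical and only for illustration. The sketch also assumes that RStan's include directory ends up on the compiler's include path, which is not settled in this proposal.

```cpp
// Sketch only: generates the text of a forwarding ("proxy") header.
// make_proxy_header and "rstan_include" are hypothetical names,
// not part of StanHeaders or RStan.
#include <cassert>
#include <string>

// Returns the contents of a proxy header that StanHeaders could ship
// at `relative_path`. The header does nothing except include the file
// with the same relative path under `target_root` (assumed to be
// provided by RStan and reachable via the include path).
std::string make_proxy_header(const std::string& relative_path,
                              const std::string& target_root = "rstan_include") {
    assert(!relative_path.empty());  // a proxy needs a file to forward to
    return "#pragma once\n"
           "#include <" + target_root + "/" + relative_path + ">\n";
}
```

Calling make_proxy_header("xyz.hpp") yields a two-line header: an include guard followed by a single #include of rstan_include/xyz.hpp. A dependent package would keep writing #include <xyz.hpp> as before and would not see the indirection.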

Now… the difficult thing coming into play here is the license: right now StanHeaders is BSD licensed while RStan is GPL licensed. In this new setup it’s not clear if StanHeaders can stay BSD. Any program which compiles against Stan-math (contained in StanHeaders) will only contain BSD-licensed source code, yet the actual Stan-math sources would be pulled from within RStan, which itself is GPLed.

In short: StanHeaders might have to be re-licensed under the GPL. What exactly the situation is from a legal perspective is unclear at this stage, but we would need to find this out, possibly with the support of NumFOCUS. The question is whether a proxy StanHeaders can stay BSD or not. The “easy” solution is to re-license the proxy StanHeaders under the GPL (in this case anyone requiring BSD-style Stan-math could just copy it from our GitHub and include a copy in their own package).

Tagging @SGB and @Stan_Development_Team for attention.

Best,
Sebastian, Ben, Hamada, Steve, Andrew J

Hi!

A short question: why do we not release a new rstan R package (say, called rstan2 or so) that solves these issues (i.e. includes the Stan headers) and then in practice deprecate the old rstan and old StanHeaders, pointing to the new package? This would most likely not break anyone’s current code (except for raising a deprecation warning for packages using rstan).

Then the R package developers can switch to the new R package as a dependency at their own pace. We could probably keep the old rstan package around for quite a while.

Just a quick thought as an R package developer, although I have missed some Stan meetings, so I might have missed some ongoing discussions. Sorry if that is the case.

/Måns

We discussed that, but did not follow it up, as @bgoodri opted against it as I recall… I think this would essentially double the maintenance burden. A prominent example which did this was ggplot → ggplot2, and after that ggplot2 was refactored internally a lot. It’s more of a ggplot4 by now, but it’s still ggplot2.

(but yeah, this idea of an RStan3 was on the table)

Hi,

I was actually thinking about ggplot2 as a good example. I guess we would have a similar situation. If we had an rstan2 that includes StanHeaders, we would not need to do an rstan3 and so forth.

If it is a lot of extra maintenance, then that would be a problem. My guess, though, is that most users would move quite rapidly and hence the additional burden would go down quite quickly. Hence the long-term gain might be larger than the short-term loss. However, @bgoodri can judge this much better than me.

With kind regards
Måns

I suppose that there are packages out there that are BSD (or other non-GPL licence) and use StanHeaders: what’s the implication for them if StanHeaders were to become GPL?

This is an interesting question… I am not an expert, but my take is that these packages would have to be GPLed as a consequence of this. To avoid that, they could make a copy of Stan-math from GitHub and then use that instead. The GitHub copy is BSD.

In case there is broad interest, one could consider releasing a StanHeadersCPP R package, which is only there to be BSD licensed (and it would not be used by RStan). I am not sure if CRAN would appreciate that… and upgrading this R package will become a headache quickly, given that Stan is NOT meant to be stable at the C++ level. Stan is meant to maintain compatibility at the Stan language level, not more.

We just discussed this at SGB and will touch base with NumFOCUS to get their input on the legal/licensing dimensions.

Great. Let us know if anything is unclear!

I see that there are packages that import and link to rstan and are still MIT licensed. Example: CRAN - Package AovBay

Not everyone is handling licence matters correctly. I don’t know the details of the MIT license, though. One thing which makes StanHeaders special is that Stan model binaries are created on the user’s machine. In turn, everything stays source code only until one actually compiles and runs a model. That is a subtle difference which can become key.

RStan itself must be GPL since, for example, it uses Rcpp (and rstan cannot be used in R without Rcpp baked in).

Would it be possible to switch RStan from Rcpp to the newer cpp11 package? Unlike Rcpp, it uses the MIT license.

Is this true for packages like rstanarm that don’t require compiling on the user’s machine for Windows/Mac? At least for Windows and Mac, these are built on the CRAN machines.

The fact that a binary must be created which munges in GPL code is the worst and most obvious sign of the GPL taking over. In this case nothing will work without the GPLed stuff and hence GPL will claim its ownership over the whole.

cpp11 sounds nice. I think that lib was too young at the time of writing rstan. Skimming over it, it does not support modules which are used in rstan as I recall. So maybe it’s still possible, but it is certainly a question of resources. If that would work and someone finds the means to do it, that sounds in principle great…but please - if anyone wants to do it - align with the broader team.

Hi all,

NumFOCUS is currently running the question in this thread past their legal team. I’ll keep you updated as we learn more.

This is probably a dumb question and if not, it has probably already been discussed somewhere and I just missed it:

Why not simply abandon rstan and StanHeaders and only continue development of cmdstanr?

That does not work for R packages which ship their own precompiled models.

And there’s no convenient way around that, given recent improvements in compilation times? For example, requiring users to compile the models they need only once per installation of such an R package? Integrating this into the installation itself might be hard and might not always make sense, so users would probably have to do so after installation (usually before they need a specific model for the first time). The rstantools package could provide a function for compiling only a subset of the models from a package. And the package itself could offer a wrapper for that rstantools function, but setting an appropriate default for the subset of models to be compiled. Of course, this wouldn’t be as convenient as rstan, but given the development resources that rstan and StanHeaders seem to take, it could be an acceptable compromise.

The compiler toolchain is a huge hurdle for many users. So it’s of huge value to users to have ready-to-install binaries for their platform.

The lack of functionality in cmdstanr to evaluate the log probability at specific parameter values makes the two packages I have that depend on rstan presently impossible in cmdstanr.
