Let me preface this by saying I’ve never personally had any problems installing RStan on Mac OS X. I’m also skipping the quoting because Discourse ate my first draft, so I’m writing this offline.
My main goal is to make life easier for our devs and for our users. So I really do want comments.
Changing the language for the parser and AST from C++ to OCaml is motivated by entirely different reasons than install problems with C++. It would be nice to have a language in which we can develop new language features, decent error messages, and semantics much more quickly than we can in C++. It would also be nice to be able to transform intermediate representations for efficiency; that has been on our to-do list for years, but it is very painful in C++. This is going to be critical if anyone wants new features in the language.
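To make concrete what I mean by transforming intermediate representations, here is a toy sketch of a constant-folding pass over a tiny expression IR in C++. This is hypothetical and nothing like Stan’s actual AST (the node kinds and names are made up for illustration); even at toy scale it needs manual tag checks and node rebuilding, and the real AST has dozens of node types.

```cpp
#include <memory>
#include <string>

// Hypothetical toy expression IR -- not Stan's actual AST.
struct Expr {
  enum Kind { LIT, VAR, ADD } kind;
  double value;                    // used when kind == LIT
  std::string name;                // used when kind == VAR
  std::shared_ptr<Expr> lhs, rhs;  // used when kind == ADD
};

// Constant folding: rewrite (literal + literal) into a single literal,
// leaving everything else untouched.
std::shared_ptr<Expr> fold(const std::shared_ptr<Expr>& e) {
  if (e->kind != Expr::ADD)
    return e;
  auto l = fold(e->lhs);
  auto r = fold(e->rhs);
  if (l->kind == Expr::LIT && r->kind == Expr::LIT) {
    auto folded = std::make_shared<Expr>();
    folded->kind = Expr::LIT;
    folded->value = l->value + r->value;
    return folded;
  }
  auto rebuilt = std::make_shared<Expr>();
  rebuilt->kind = Expr::ADD;
  rebuilt->lhs = l;
  rebuilt->rhs = r;
  return rebuilt;
}
```

The same pass in OCaml is a few lines of pattern matching over a variant type, which is exactly why we want to do this kind of work there.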
I’m trying to come up with a plan to address what I see as issues around RStan and PyStan development, release, and install that will be cost effective and robust going forward.
Specifically, I would like to:
- cut down on the time between Stan being done and a workable R solution for accessing Stan being released,
- remove the possibility of CRAN being in an inconsistent state,
- make it easier for users to install on all platforms,
- take measures to ensure we don’t break important downstream packages like Prophet,
- allow us to move forward with C++ compilers, Boost, and Eigen independently of CRAN/R requirements, and
- reduce installation issues and the resulting forum traffic so that Stan doesn’t look so wobbly around installs.
This may not all be possible.
The installation issues for RStan 2.18 have come up across platforms and at different stages.
I had thought RStudio was going to release devtools as part of their new releases as soon as Rtools was up to a workable version. I thought JJ said that’d be about a year from when we met at Columbia. I probably misunderstood, though, if that’s not @bgoodri’s impression. Should I follow up to check, or won’t that matter?
Is there any way to check whether adding things like pkgbuild will have negative effects on downstream packages like Prophet?
I wasn’t clear on the C++14 recommendations. Were you recommending we stick to C++11? We can recommend C++11 until we move to C++1y features. There are a lot of things we’d like to use beyond C++11, especially polymorphic closures. @bgoodri, if you know about these things ahead of time, please bring them up so we can address them. This is the first I’ve heard (that I recall, at least) about C++1y being an issue.
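For anyone not following the compiler standards, “polymorphic closures” are what C++14 calls generic lambdas: a lambda whose parameter types are deduced at each call site. A minimal example of my own (just for illustration) that compiles under -std=c++14 but not under -std=c++11:

```cpp
#include <iostream>

int main() {
  // C++14 generic lambda (polymorphic closure): the `auto` parameter
  // type is deduced separately for each call.
  auto square = [](auto x) { return x * x; };
  std::cout << square(3) << "\n";    // instantiates with int
  std::cout << square(2.5) << "\n";  // instantiates with double
  return 0;
}
```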
I didn’t realize that whoever maintained Rtools was asking for donations. If you want us to make donations, you’ll have to request donations. I’d rather not be called a deadbeat over some debt I didn’t even know I had. Are there other things like this?
Please don’t call consultants mercenaries. But yes, the plan would be to get help on installers for all platforms, not just Windows, so that we could (a) be up to date with recent compilers, (b) be up to date with dependent packages, and (c) include tools not part of Rtools. The other thing we could do with our own installer is build a better wizard that wouldn’t confuse all of our users and would do the right thing for Stan in terms of modifying Makevars.
Those “Big Challenges in 2019” are the main reason we’d like to decouple the C++ compiler from R and ideally run out of process. I know it’s a huge change, but it seems like we don’t have the person power to keep up with what we need to do for R now.
I disagree that the rate of new Stan developers coming online is constant. I think it’s growing with the Stan population, which isn’t exactly linear growth.
Figuring out how to support things like rstanarm going forward is why I’ve been emailing @bgoodri and @jonah to try to set up some meetings and why there are no entries under those things. Do you guys have something like a roadmap for the R ecosystem around Stan somewhere?
Before this draft was written, Sean and Matthijs built proof-of-concept examples and made sure we could launch executables on all of our major platforms. There are two ways I know of that systems do this now. RStan gives people instructions they have to follow outside of CRAN to install system tools before RStan will work. The tensorflow package builds a script into its CRAN package that downloads what it needs from the web. I like the latter approach, as it worked really well for me when I installed the tensorflow package.
Why do we need packages like brms to run unit tests on CRAN? I think we should just assume that the external install works and provide tests for it the way we test everything else.
I intentionally broke out the Stan 3 language here because it’s not going to break backward compatibility. If (and it’s a big if) we move to something blockless, it still won’t break backward compatibility. But the first plan is to move the implementation so we can nail down all the other things we need to do, like adding tuples, ragged structures, closures, etc. We’ll never get those done in C++. We’ll be able to rebuild the entire parser and code generator in OCaml and add one or two of those features before we’d have been able to finish even one of them in C++.
We don’t need everything to be in RAM to access transforms, etc., but it will be necessary to do it efficiently. I know we have all that in there now, but I have no idea who’s using it and for what. I’ve never gotten any examples of anyone using RStan or PyStan to develop algorithms through those exposures. I do know some people like to write Stan functions to make faster R functions, but that’s not our core use case, so I don’t think it’d be terrible to drop support for that or to move it to something like an RcppStan package.
I guess I wasn’t clear enough that I wasn’t proposing getting rid of the existing RStan or PyStan interfaces. I’m proposing adding out-of-process versions that address problems with the current installations. These would be simple to write and would provide all the current RStan functionality, but some of it would be a lot less efficient.
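To make the out-of-process idea concrete, here’s a minimal sketch (hypothetical, not an actual design) of what the interface side does: it never compiles anything itself; it just points at a pre-built model executable that the installer shipped, launches it, and reads the draws back from the output file. The command-line arguments below are illustrative only.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical sketch: run a pre-built Stan model executable out of
// process and leave its draws in a CSV file for the interface to read.
int run_out_of_process(const std::string& model_exe,
                       const std::string& data_file,
                       const std::string& output_file) {
  // Argument syntax is illustrative; the point is that no C++ toolchain
  // is needed on the user's machine at run time.
  std::string cmd = model_exe
                    + " sample"
                    + " data file=" + data_file
                    + " output file=" + output_file;
  return std::system(cmd.c_str());
}
```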
Please don’t characterize other people’s proposals as making good April Fool’s Day fodder, no matter how appropriate you think the analogy is. We want to keep these forums polite.
If you want to call having a single easy-to-use, completely bundled installer “corporate”, go right ahead. I think corporations are doing things right in some ways, which is why people will continue to pay them for things. This is why Windows still exists: it’s the only platform that’s serious about backward compatibility.
I’m not actually proposing that we get rid of the existing capability of building everything from source with a custom C++ compiler; I’m just proposing that we also encapsulate a bunch of stuff that we know our users need. Our users are for the most part not like our devs: they’re not managing multiple C++ environments; they usually don’t even have one.
Indeed, I’m lumping the library dependencies (Rcpp, BH, RcppEigen, StanHeaders, etc.) and the compiler dependencies together. From the core Stan C++ developer’s (i.e., my) perspective, they are all the same: restrictions on what we can use in our code. BH, RcppEigen, and StanHeaders all provide dependencies in RStan releases.
I thought the inability to synchronize releases and version dependencies on CRAN is what led to StanHeaders and RStan getting out of sync. To avoid getting out of sync with Boost and Eigen, we have to support both their current and their next versions until BH/RcppEigen switch over, and only then can we drop the old support. This is more of an issue for developers than for users. I’ll try to keep these issues more separate in the future, since I seem to be confusing people here, which was not my intent.
The FOSS standard of everything working with everything else is nice, but I don’t see how to make it work in practice given the resources and tools we have. As an aside, do FOSS purists shun Docker containers for the same reason, namely that they’re overly corporate in their approach to bundling?
I can’t quite reconcile R and RStan with how well they live up to those FOSS principles. At the very least, we should be testing more. Doesn’t RStan require the C++ to stick to the latest BH and RcppEigen and to something ABI-compatible with whatever C++ compiler R was compiled with? If Python did the same thing, we might be in a place where we had to support two entirely different versions of Boost, Eigen, and C++ (actually, I think we are there). (I think Python may do the same thing and we’re just letting PyStan break in most places; I think that’s part of the motivation for PyStan 3, but I don’t see how having a standalone HTTP server is going to help with that.)
The problem for us isn’t so much that we don’t want to do the right thing but that we don’t have the support staff to pull it off.
I think it’s very unfair to say that we’re taking from the efforts of FOSS but not giving back. We’re giving back Stan! For core Stan, we have been filing issues with Boost and with Eigen when they come up and are clearly bugs rather than design decisions that make our life hard. Should we be donating dollars to all the tools we use? From the top down, that’s R, Python and Julia, g++ and clang++ and whatever’s going on in Windows, Boost, Eigen, Sundials, Rcpp, knitr, ggplot2, …? How much?