I think I remember @sakrejda mentioning in a Stan meeting that he was working on a cmake version of Stan (not sure which repo this was, maybe math?). Is that still going on? I’ve been wondering how difficult it would be to switch to a more modern system in the hopes of speeding builds up and doing incremental builds safely more often (esp. for testing). I’ve heard Bazel highly recommended by people, though they typically have had some Google affiliation in their past. I have no personal experience with any of the C++ options outside of make so curious what others think.
The issue here is user-friendliness – the current make system has been specialized sufficiently well that it works out of the box for most systems we encounter, but we’ve had no end of trouble with tools like cmake. If we can find something that works better than make, works out of the box on Linux, Mac, and Windows, and we’re willing to translate the current build configuration over then I think most of us would be receptive.
I did get cmake working in a few different contexts, but once I was done it turned out I needed some command-line invocations in there to get the protocol buffers and Stan models to build. The error messages are terrible (it’s an auto-generated makefile, so… :/ )
I will give Bazel a shot at some point, but I tried make, cmake, R’s remake, and one of the Python-based tools, and all of them have weird stuff embedded. For straight C++ I’d go with Bazel or cmake, but Stan isn’t straight C++ (we sometimes auto-generate C++). Maybe going by the Bazel protobuf scripts we could do it right in that system, but I haven’t had time to push it.
Others have also tried to port our makefiles to cmake and eventually gave up.
My own experience installing third party software is that cmake almost always fails in cryptic ways. Whereas makefiles usually work for me.
I’m pretty sure the bottleneck in our testing is because of dependencies, not because make is slow or bad at figuring them out.
Because the tests bring in high-level headers (like prim/mat.hpp), a change to any file included by mat.hpp requires us to rebuild the world. I’d be surprised (but very happy!) if another system could get around this by being cleverer: rebuilding only if the things you use in a header change, not if the header itself changes.
Another reason things are so slow is that Stan’s largely header only and the template metaprograms are very slow to compile.
A third reason is the combinatorics in the probability functions. We’re not comfortable with fewer tests because the underlying code can branch on any of the distinctions in input type to have different behavior. They’ve saved us on many occasions.
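As a rough illustration of how quickly those combinations grow, here is a small sketch that enumerates argument-type signatures for a function. The four argument kinds below are an assumption chosen for illustration, not the actual set of types the tests cover.

```python
from itertools import product

# Assumed argument kinds, for illustration only.
ARGUMENT_KINDS = ["double", "var", "vector<double>", "vector<var>"]


def signature_combinations(num_arguments, kinds=ARGUMENT_KINDS):
    """Return every ordered assignment of a kind to each argument."""
    return list(product(kinds, repeat=num_arguments))


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, "arguments:", len(signature_combinations(n)), "signatures")
```

With k kinds and n arguments the count is k to the power n, so each extra argument or extra kind multiplies the number of test instantiations the compiler has to build.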
The appealing feature of cmake is that you can specify C++ builds really cleanly, but the moment you have to do something more complex (e.g., our testing) the scripts turn into a nightmare of custom_command/target thrown all over the place.
I’d be happy if we could move from make. I’ve thought about this for a while. Here’s what we need to do:
Math. For developers only:
Build CVODES library
Build GTEST
Be able to generate tests
Build individual unit tests
Build doxygen doc
Stan. For developers only:
Build CVODES library
Build GTEST
Build a test version of the Stan compiler
Build individual unit tests
Build doxygen doc
Build latex manual
CmdStan. For users:
Build the Stan compiler
Build stansummary
Build CVODES library
Pass Stan program through the compiler, then compile the generated C++
CmdStan. For developers:
All of the above
Build doxygen documentation
Build manual
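For the user-facing step in the list above (pass a Stan program through the compiler, then compile the generated C++), here is a minimal sketch of the make-style staleness check any replacement would need to reproduce. The file naming (foo.stan to foo.hpp to foo) and the stage names are assumptions for illustration. The sketch only decides which stages are out of date and does not invoke any real compiler.

```python
import os


def needs_rebuild(target, prerequisites):
    """Make-style check: rebuild when the target is missing or older
    than any of its prerequisites."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(p) > target_mtime for p in prerequisites)


def plan_build(stan_file):
    """Return the stages that would have to run for one Stan program."""
    base, _ = os.path.splitext(stan_file)
    generated_cpp = base + ".hpp"  # assumed name of the generated C++
    executable = base              # assumed name of the final binary
    stages = []
    if needs_rebuild(generated_cpp, [stan_file]):
        stages.append("translate")
    # Regenerating the C++ always invalidates the executable.
    if stages or needs_rebuild(executable, [generated_cpp]):
        stages.append("compile")
    return stages


if __name__ == "__main__":
    print(plan_build("foo.stan"))
```

A real rule for the compile stage would also list the Stan and Math headers as prerequisites. They are left out here to keep the two-stage structure visible.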
It’s easy to require developers to install more tools. For the users, we should think about dependencies. The way we currently have our CmdStan build system, the dependencies are:
Oh, I think maybe I misinterpreted what you were saying - for the CmdStan users you meant users who are compiling CmdStan, not users who are using CmdStan. Are there a lot of people who do that?
CmdStan users need to have make to run CmdStan. If they have a Stan program, say foo.stan, they need to be able to create an executable from foo.stan. This requires:
Ah, right, sorry. So using CmdStan implies compiling the rest of Stan and the math library? Hm, yeah it would suck a bit to force users to install something like Bazel to use CmdStan…
In broad strokes, yes, that’s what it implies. We could add a shell script and a Windows batch script to compile if we wanted even fewer dependencies. But I find that make is typically installed with whatever toolchain is available.
Most of Stan is still header-only, so we just need to include Stan and Math libraries. CVODES is the exception, and in fact, we can compile without the library if necessary (and if we don’t use the CVODE ODE solver). We tend to just build it because we’ve figured out that process reliably.
@seantalts @danluu If the cmake thing has legs, I’m still cheering for it. I am pretty intrigued by the idea that CLion can make it super easy for me to navigate the math library if it has a CMake file to go from.
Ofc. changing Stan’s build system so I can try a commercial IDE to make up for my lack of emacs-fu is pretty terrible reasoning, but whatevs…
It’d be easy replacing CmdStan’s build process. We could start there.
Stan’s a little complicated due to building the compiler (needing to instantiate templates), but should be relatively straightforward with any build system that handles C++.
Math is complicated due to our testing, but it’s otherwise straightforward.
I’m happy to talk through the build requirements for any of those three libraries. If someone knows how to use a build tool really well, we can switch.
Unfortunately, I’ve been doing non-Stan stuff lately, but the last time I tried, the cmake branch of math seemed fine (I was using it for my own development, CLion worked fine with it, etc.) with the caveat that it didn’t build the distribution tests. It’s possible that it needs to be updated to get it back to its previously working but not finished state.
The next time I have time to do Stan stuff, I think I’m going to try to finish the benchmarking I started (unless someone else already did it), and then I may take a look at this afterwards, but there’s a lot of potential work, so no promises :-).
@syclik do you think it could be worth merging the cmake stuff we have since that provides IDE support? Or do we want to wait for it to be feature complete with make so we aren’t maintaining two systems? I can see both sides here, though am a little prone to think the cmake branch might have more impetus if we check it in…