I want to stress that I have nothing against supporting pinned versions of dependencies. Just that I want to make it easy for people to use their own dependencies without “hard-coding” them in by including the sources in the repository.
(In fact, we can go even further and try for fully reproducible builds - by pinning versions of our compilers and build tools as well.)
This is not a big deal when it comes to OpenCL headers. It does become an issue when people want to use their own version of Boost with Boost MPI, or Eigen/Sundials with acceleration support, etc. So these headers are not a hill I want to die on, mostly just a consistency issue with how I want to treat other Stan dependencies. In general different users of Stan will want to use different versions of dependencies, and I would like to make it easy. Another example: for CRAN reasons we patch Sundials’ fprintf function. This affects all users of Stan. In the future, we allow the use of any Sundials; and only the people who package Stan for CRAN worry about stripping all print statements.
(This will not be more inconvenient for them. Conan makes it easy for pinned versions of arbitrary dependencies to be installed.)
So, my preference on how to resolve this would be to write configuration logic that checks that the OpenCL header has the required wrapper functionality and spit out a helpful error, rather than include the headers in Stan.
These OpenCL headers are weird in that the version numbering is somewhat useless in this case. You can support OpenCL 1.2.0 and not implement these helpful wrapper functions, which means compile errors. Argh. Still, I would rather write a few lines of CMake code that does a compilation check against std::string kernels as opposed to shipping the headers in the source tree. What do you think?
We have run into problems in the past with incompatibilities with system-installed boost versions vs ones we distribute. After dealing with that a bit, we really stopped relying on that as a good way to handle dependencies.
As much as we’d like to be optimistic and think that libraries behave well across versions, when that fails, it’s really difficult to debug. Especially by users that may have had stuff installed for them by other packages.
I am sympathetic to the idea that nobody wants to debug dependency issues. The default build of the monorepo uses pinned versions of dependencies that are downloaded via Conan and compiled from source. However if somebody wants to override these dependencies it is one CMake compile-time definition away.
The issue with the OpenCL headers here is that the include prefix has been erased and everything merged into one mega-header file, so that even if somebody did want to compile against their own vendor OpenCL headers they can’t without patching the Stan sources. I can’t use Conan to manage this dependency because the file structure is completely different from the official Khronos header files.
There’s already a Conan package that tracks the official Khronos repo, and I’d like to use a pinned version of it.
We had the discussion in the past if we want to use libraries which are packaged with stan or the system libraries. From my memory reasons for what we have (including libraries) is:
no problems with debugging
no ambiguity whenever users report trouble
ease of installation for users and developers
ease of modifying the libraries as needed
These are all good reasons to go with distirbuted libraries and the default should still be to have the pinned libraries being used as before. Now, having the option to use system libraries would be really good - I totally agree. In that case one needs a bit of care though how our edits to the libraries are handled. Any suggestions on that?
Boost lgamma: afaik the fix has been upstreamed so theoretically all I need to do is point Conan at Boost’s develop branch and run all the unit tests to make sure it works (still need to check this, though, working on it today).
GTest: I got rid of the MPI #ifdefs and wrote a MPI-specific test runner executable.
Sundials: I forked Sundials and patched the printfs out, and made a Conan package out of it. RStan packagers can use this, everyone else can use the regular 4.1.0 release.
OpenCL: being discussed in this thread.
Everything else: I’m not aware of any special edits we have.
For the sake of argument let us suppose Boost develop branch in 1) is no good. Then I would prefer to handle it like Sundials: I write a little Conan script that downloads the sources, patches them, and uploads them to Bintray. Basically a Conan version of our current update-xxx.sh. Users add my personal package archive to Conan and conan install automatically installs the dependency. Eventually when monorepo is transferred to the main Stan repo Stan maintainers create their own official “Stan” organization on Bintray and associated PPA. There’s also the possibility of upstreaming Sundials to the Conan package maintainers.
Specifically the cl.hpp we use is actually the cl2.hpp file from this repository. We can move it in a CL/folder and rename it back. I think that should not be problem. Would you oppose that move @stevebronder? Is the link to the repository enough for Conan? Or do we need to make a separate repository with a Conan package? If that is the case I can do it on our Univ. of Ljubljana Bayesian Stats Github group. It does feel redundant to make a repo for a single file to be honest. And to make Stan Math dependent on some groups repository, idk.
I’m fine with that but I’m not sure it solves the issue
@alashworth The link you sent is for the C headers, maybe we could talk to those folks about also hosting the C++ headers. It feels like that would make sense
As far as I understand we would use the supplied Conan package except if the user manually specifies to use the system headers. In that case we use the system headers. Is my assumption correct?
It would help make this case easier, if my assumption is correct.
@alashworth just wanna clarify some stuff so we are on the same page (the OpenCL stuff can be weird and confusing because you have to use drivers and stuff to touch the device and we use a higher level C++ API to touch the C API that touches the device). Rok is much better at explaining this stuff than me but I figured I’ll give it a go
So there’s a couple things people need for OpenCL and we have all those listed here for various systems. We expect users to have an Installable Client Driver (ICD)[1] which is the driver that defines how to make calls to execute instructions on data to an OpenCL compatible device (whether it be a cpu, gpu, fpga, apu, yada yada). It’s just a .so file, idk why the OpenCL folks need to make it sound so special but they call it an ICD for whatever reason. So when you see -lOpenCL that ICD is the shared object file (like opencl.so.1 or whatever) we are linking to.
That ICD exposes a C API from the host (user) to the device. Some people want as little between them and the device as possible so they use the OpenCL C API to do their stuff. But imo working in C can kind of suck for things that are complicated. So Khronos has a C++ API, which is really just a wrapper around those C API calls. So you can think of the header file in the opencl_2.1.0 as a nice wrapper around those C API calls.
Right now, we expect users to supply the ICD, which we expect to be defined in their path right so we can do that -lOpenCL thing and link to the actual device API provided by the ICD. idk how automated build systems work with that, if we can check they have that installed or install it for them then awesome! The OpenCL C++ header files we need to include to compile the code, before any linking to the .so is done, is at the link I posted above (and here)
Does that all make sense? it’s a weird thing for sure relative to the rest of our dependencies. Looking at the link you posted earlier about FindOpenCL it looks like we want OpenCL_LIBRARIES to come from that function, but the OpenCL_INCLUDE_DIRS should come from the conan package I linked above.
(Also Rok we should update that page with the functions that have OpenCL support because I’m sure people land there)
[1] What I said about the ICD is not totally 100% fact but is essentially close enough, you can get more info on that here
That clears up things, yep. My issue before actually was misunderstanding the clhpp includes. I guess everything was working fine since my system nvidia install provides cl2.hpp, and the CMake build system was adding the include directory automatically. At the same time I had a distro OpenCL package installed. When it got renamed to cl.hpp the autoconfigure scripts were picking up my ancient distro opencl-only headers first (whatever is first on the user’s path is what is picked up), and then - compile issues.
So renaming things to be like the standard khronos headers would solve my problems! And on top of that I need to write a little CMake module that searches for cl2.hpp. Thanks!
Yep, someone just opened a cmdstan issue asking something along these lines (not sure if this comment is due to that issue or just a coincidence). Also some researchers were asking on twitter, etc. I keep referring them to the arxiv paper, but even that is outdated now. I will add it to my TODO after I finally finish the GPU GLM compiler support.
Fantastic. I will make an issue. So if we put it back to CL/cl2.hpp its done?