I agree wholeheartedly! :) Some thought should go into which packages should be included by default, e.g., rstan, rstanarm, brms, bayesplot, etc. But also, should we include Julia, Python, etc., or should we have many different flavors of Dockerfiles instead?
I don’t know. We need to work out what’s best. I would like images that are as slim as possible per interface (while still including the useful utilities for that interface). We should take advantage of the fact that you can derive one image from another.
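To make the "derive one image from another" idea concrete, here is a minimal sketch that renders a chain of Dockerfiles in which each flavor starts `FROM` its parent. The flavor names (`stan-base`, `stan-r`, `stan-rstan`, `stan-brms`), the `ubuntu:22.04` base, and the install commands are all hypothetical placeholders, not an agreed layout.

```python
from typing import Dict, List, NamedTuple


class Flavor(NamedTuple):
    parent: str       # image this flavor derives from
    steps: List[str]  # shell commands, one RUN line each


# All names and commands below are hypothetical placeholders for illustration.
FLAVORS: Dict[str, Flavor] = {
    "stan-base": Flavor(
        "ubuntu:22.04",
        ["apt-get update && apt-get install -y build-essential"],
    ),
    "stan-r": Flavor(
        "stan-base",
        ["apt-get update && apt-get install -y r-base"],
    ),
    "stan-rstan": Flavor(
        "stan-r",
        ["Rscript -e \"install.packages('rstan')\""],
    ),
    "stan-brms": Flavor(
        "stan-rstan",
        ["Rscript -e \"install.packages(c('brms', 'bayesplot'))\""],
    ),
}


def render_dockerfile(name: str) -> str:
    """Render the Dockerfile text for one flavor: FROM its parent, then RUN steps."""
    flavor = FLAVORS[name]
    lines = [f"FROM {flavor.parent}"]
    lines += [f"RUN {step}" for step in flavor.steps]
    return "\n".join(lines) + "\n"


def build_order(name: str) -> List[str]:
    """Return the chain of local flavors to build, parents first."""
    chain: List[str] = []
    while name in FLAVORS:
        chain.append(name)
        name = FLAVORS[name].parent
    return chain[::-1]
```

Calling `build_order("stan-brms")` gives the order in which the images would have to be built, and each slim image only adds its own layer on top of the previous one.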
Love this idea! Tensorflow has these on dockerhub and the dockerfiles here. I think we could make a standocker repo that holds these. What are all the flavors we would want? It would be nice to have ones with the OpenCL stuff pre-installed. One for rstan, cmdstan, etc?
I recently made a Docker image with rstan v. 2.22, brms, rstanarm, and various other related packages. It’s built on an Ubuntu base and also has the tidyverse R packages.
The repo is crpeters/docker-stan:apt-0.1.
Docker has security implications that make it pretty much unusable on clusters (the docker group essentially has superuser access). Also, as I found out to my own detriment, running Docker containers on laptops quickly becomes a bloated mess. Some containers could pretty easily be derived from the build system, but maintaining them (w.r.t. base images) might be complicated. Also, most package-level Dockerfiles are based on Alpine Linux, which will need to be tested for the dependencies.
Finally, Docker containers might be lighter than virtual machines, but they still operate out of a pretty limited resource pool, so performance is going to be much worse.
Another (more elegant, in my very biased opinion) solution is to create a nixpkgs derivation for Nix. These are reproducible and have no performance overhead. They do not, however, run well on Windows.
My university’s HPC systems use Singularity, which is compatible with Docker images and resolves the superuser issues. Performance-wise, it doesn’t seem to be substantially different from running code natively, and installation is quite a bit easier.
I know about the admin rights issues with Docker. It’s just that Docker is the de facto standard, as I perceive it. Alternatives such as Singularity can consume Docker images directly, as can other alternatives.
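As a rough illustration of the Docker compatibility mentioned above: Singularity can pull an image from a Docker registry through a `docker://` URI and then run commands inside the resulting `.sif` file without the docker group. The sketch below only assembles the command lines; the helper names and the `docker-stan.sif` file name are made up for this example, and whether the image from earlier in the thread is still available is an assumption.

```python
from typing import List


def singularity_pull_cmd(docker_ref: str, sif_path: str) -> List[str]:
    """Build the argv for pulling a Docker registry image into a local .sif file."""
    return ["singularity", "pull", sif_path, f"docker://{docker_ref}"]


def singularity_exec_cmd(sif_path: str, command: List[str]) -> List[str]:
    """Build the argv for running a command inside an existing .sif image."""
    return ["singularity", "exec", sif_path, *command]


# Example with the image mentioned earlier in the thread (availability assumed):
pull = singularity_pull_cmd("crpeters/docker-stan:apt-0.1", "docker-stan.sif")
run = singularity_exec_cmd("docker-stan.sif", ["Rscript", "-e", "library(rstan)"])
```

The resulting lists can be handed to `subprocess.run(pull, check=True)` on a machine where Singularity is installed.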
A package manager like Nix sounds great - it’s just not a real solution if Windows is not on the list. It would still be useful to document these things on our wiki, as it is surely helpful for an (admittedly large) subset of Stan users.
I really think we need a stanhub on Docker Hub. It would make installation, first playing with Stan or even Stan mode development a lot more streamlined. The performance concern doesn’t weigh as heavily for me, since this is still super valuable for getting things going. For high performance on clusters there is either Singularity, or you should spend the effort to deal with the install anyway, or (even better) you have a cluster admin around who can help you.
Fantastic. I have been messing around with running Stan but haven’t tried out the Nix+Docker generation before. This would make a great starting point for generating the Dockerfiles if Stan does go with a stanhub approach on DockerHub.
I see that Generable currently holds the “stangroup” login on DockerHub. There are a number of images for Stan variants, but many of them are pretty old. If there are images that people use regularly, it would make sense to pool those best practices and put up some images on DockerHub so they aren’t hidden in someone’s personal GitHub repo.
Hey, the repository has been up for a while but isn’t used much; I would love to put it to more use. I think it’s very convenient for any developer to just pull a Docker image and work with it.
We need to come up with a list of all projects that can be dockerized so I can integrate that process into Jenkins CI/CD to have them automated and synchronized with GitHub state/releases. A bit of order and naming conventions should keep it easy.
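One possible shape for such a naming convention, purely as a sketch: map a project name and a GitHub release tag to the image tags a CI job would push. The `stangroup` organisation comes from earlier in the thread; the tag scheme (full version, major.minor, and `latest`) and the function name are assumptions for discussion, not an existing convention.

```python
import re
from typing import List

ORG = "stangroup"  # DockerHub login mentioned in the thread; its use here is an assumption


def image_tags(project: str, git_tag: str, latest: bool = True) -> List[str]:
    """Map a release tag such as 'v2.22.1' to the image tags to push (hypothetical scheme)."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)", git_tag)
    if match is None:
        raise ValueError(f"not a release tag: {git_tag!r}")
    major, minor, patch = match.groups()
    repo = f"{ORG}/{project.lower()}"
    tags = [f"{repo}:{major}.{minor}.{patch}", f"{repo}:{major}.{minor}"]
    if latest:
        tags.append(f"{repo}:latest")
    return tags
```

A CI job could call this once per release and push each returned tag, which keeps the images in step with the GitHub releases without hand-written tag lists.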
I can help dockerize some of our projects, but I have never tried Nix.
I’m up for this if I can get a bit of help with the planning of projects and if there are any restrictions or conventions that need to be followed.