Different brms behaviors between a Docker image and the same image imported in singularity

[Note on cross-posting: I asked the same question at StackOverflow and have received the suggestion that I should ask it here. The question might be more on computational reproducibility through containerization than on Stan per se. Please let me know if the topic is not suitable for the forum.]

I have recently started using Docker to ensure the computational reproducibility of my research with Stan (often via brms). Since the HPC service at my institution only supports singularity, I want to import the Docker image into singularity when I run part of my analysis on the HPC. When I did this, however, I found that the results based on the original Docker image differ from those based on the same image imported into singularity.

Here is what I did to build a simple Bayesian regression model directly from the Docker image. This was run both locally and on an AWS instance, and the two produced identical output (as expected).

docker pull akiramurakami/gramm-mor:v1.0
docker run -it akiramurakami/gramm-mor:v1.0 bash
Rscript -e 'library("brms"); library("tidyverse"); set.seed(1); d <- tibble(x = rnorm(100), y = 2 * x - 1 + rnorm(100)); m <- brm(y ~ x, data = d, seed = 1); summary(m)'

Below is part of the output.

Population-Level Effects: 
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept    -1.04      0.10    -1.23    -0.85 1.00     3812     2469
x             2.00      0.11     1.79     2.21 1.00     4625     3037

Here’s what I did on HPC, using singularity.

singularity pull docker://akiramurakami/gramm-mor:v1.0
singularity exec gramm-mor_v1.0.sif Rscript -e 'library("brms"); library("tidyverse"); set.seed(1); d <- tibble(x = rnorm(100), y = 2 * x - 1 + rnorm(100)); m <- brm(y ~ x, data = d, seed = 1); summary(m)'

And the results are different (see Bulk_ESS and Tail_ESS columns).

Population-Level Effects: 
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept    -1.04      0.10    -1.23    -0.84 1.00     3798     2826
x             2.00      0.11     1.78     2.22 1.00     4275     2913

Why is this, and is there a way to import and use a Docker image in singularity so that it yields the same results as the original Docker image?

Below is the Dockerfile used.

FROM rocker/r-ver:3.6.3
LABEL "maintainer"="xxx"

RUN apt-get update -qq && apt-get -y --no-install-recommends install \
  file \
  git \
  libapparmor1 \
  libclang-dev \
  libcurl4-openssl-dev \
  libedit2 \
  libssl-dev \
  lsb-release \
  multiarch-support \
  psmisc \
  procps \
  python-setuptools \
  sudo \
  wget \
  libxml2-dev \
  libcairo2-dev \
  libsqlite-dev \
  libmariadbd-dev \
  libmariadbclient-dev \
  libpq-dev \
  libssh2-1-dev \
  unixodbc-dev \
  libsasl2-dev
# https://github.com/stan-dev/rstan/wiki/Installing-RStan-on-Linux
RUN Rscript -e 'dotR <- file.path(Sys.getenv("HOME"), ".R"); \
  if (!file.exists(dotR)) dir.create(dotR); \
  M <- file.path(dotR, "Makevars"); \
  if (!file.exists(M)) file.create(M); \
  cat("\nCXX14FLAGS=-O3 -march=native -mtune=native -fPIC","CXX14=clang++",file = M, sep = "\n", append = TRUE)'

RUN Rscript -e 'options(repos = list(CRAN = "http://mran.revolutionanalytics.com/snapshot/2020-07-01")); \
  install.packages(c("brms", "data.table", "devtools", "SnowballC", "tidyverse", "dplyr"))'

Could you install Singularity & Docker on the same (presumably local) system and run it on both there?

I have been trying to do that but have not been successful.

When I install the desktop version of singularity for OS X, pull the Docker image, and run singularity exec gramm-mor_v1.0.sif Rscript -e 'library("brms"); …, I get the following error:

bash: -c: line 0: syntax error near unexpected token `('
bash: -c: line 0: `singularity exec -H /host//Users/mrkm_a:/Users/mrkm_a --pwd /Users/mrkm_a/Docker /dev/sda Rscript -e library(brms); library(tidyverse); set.seed(1); d <- tibble(x = rnorm(100), y = 2 * x - 1 + rnorm(100)); m <- brm(y ~ x, data = d, seed = 1); summary(m)'
[ 1.512542] reboot: Power down

The same error occurs with simpler R code, so the Docker image has perhaps not been correctly converted into a singularity image.

When I install Singularityware Vagrant Box and follow the same procedure, it appears to freeze while compiling Stan code (Compiling the C++ model).

While pulling the Docker image in the two setups above, however, I got a warning message saying that a pull from Docker Hub is not guaranteed to produce the same image on repeated pulls, and recommending Singularity Registry (shub://) for pulling exactly equivalent images. So I now have a feeling that singularity cannot be expected to yield the same output as the original Docker image in the first place.

A container image converted to singularity should still provide the exact same software down to the glibc version, but the kernel can still be different (likely older on the HPC system), though it’s hard to imagine the kernel affecting your results.

This warning is related to the fact that a Docker Hub image like foo/bar always implicitly has the tag latest (i.e., foo/bar:latest), and if a push happens between pulls, the resulting pulled images can differ.
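As a sketch of one way around that: after pulling once by tag, you can read the image's content digest with docker inspect and pin all later pulls to it, a reference form both docker and singularity accept. The digest string below is a PLACEHOLDER, not the real digest of this image:

```shell
# Sketch: pin the image by content digest so repeated pulls fetch
# byte-identical layers (tags such as :latest or :v1.0 can be moved).
# PLACEHOLDER digest below; the real one comes from, e.g.:
#   docker inspect --format='{{index .RepoDigests 0}}' akiramurakami/gramm-mor:v1.0
repo_digest="akiramurakami/gramm-mor@sha256:PLACEHOLDER"
image="${repo_digest%%@*}"   # repository part, before '@'
digest="${repo_digest#*@}"   # digest part, after '@'
echo "docker pull ${image}@${digest}"
echo "singularity pull docker://${image}@${digest}"
```
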

I think the syntax error you have could be related to string quoting. Did you try putting the script into a file and just running Bash in the singularity container to invoke the script?
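That suggestion could look like this (same model code as earlier in the thread; the filename fit_model.R is just an example):

```shell
# Sketch: sidestep shell-quoting issues by writing the R code to a file
# and pointing Rscript at the file instead of passing an inline -e string.
cat > fit_model.R <<'EOF'
library("brms"); library("tidyverse")
set.seed(1)
d <- tibble(x = rnorm(100), y = 2 * x - 1 + rnorm(100))
m <- brm(y ~ x, data = d, seed = 1)
summary(m)
EOF
# Then run it in the container (only if singularity is on the PATH):
if command -v singularity >/dev/null 2>&1; then
  singularity exec gramm-mor_v1.0.sif Rscript fit_model.R
fi
```

The quoted heredoc delimiter ('EOF') keeps the shell from touching anything inside the script, so parentheses and quotes reach R untouched.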

The singularity conversion shouldn’t affect the content of the image, it is just unpacking the Docker tar archive and repacking it.

I tried to run your image since I have a system with both Docker and Singularity, but it fails on loading Rcpp:

Loading required package: Rcpp
 *** caught illegal operation ***
address 0x7ff8bdf21f70, cause 'illegal operand'

I guess the culprit is your build flags: -march=native -mtune=native generate code tuned for the CPU of the build machine. For reproducibility you may want to make those flags a little milder, using just plain -O2. The code will be somewhat slower but less likely to behave differently (or crash, as above) on different CPUs.
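For reference, a milder ~/.R/Makevars along those lines might look like this (a sketch: it mirrors the CXX14 settings from the Dockerfile above, minus the -march=native -mtune=native flags):

```make
# ~/.R/Makevars -- portable flags for Stan's generated C++
CXX14FLAGS = -O2 -fPIC
CXX14 = clang++
```
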

Unfortunately, exact reproducibility of highly optimized numerical stuff across machines is hard, see https://mc-stan.org/docs/2_24/reference-manual/reproducibility-chapter.html for some more details. I personally try to aim not for exact reproducibility, but for “good enough” reproducibility, which is IMHO much easier. Basically:

  • avoid binary decisions (e.g. claiming effect “exists” if 95% posterior CI excludes zero) - at least unless there is a huge margin. Binary decisions are usually bad practice anyway and continuous quantities are usually more useful (e.g. posterior probability the effect is larger than some “smallest effect size of interest”).
  • run more iterations to reduce Monte Carlo error, especially if you are interested in tail probabilities (like the 95% CI)
  • use more chains to reduce variability from warmup/init

If you do this, the chance that your results change in an important way with different seed/machine/compiler/Stan version becomes much smaller than say the chance you have a coding error in your data.
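To make the iterations/chains advice concrete, the toy model from earlier in the thread could be refit along these lines (chains = 8 and iter = 4000 are illustrative values, and the RUN_FIT guard is only there so the sketch doesn't kick off a slow fit by accident):

```shell
# Sketch: same toy model, but with more chains and more post-warmup
# draws to shrink Monte Carlo error (values are illustrative).
fit_code='library(brms); library(tibble); set.seed(1);
d <- tibble(x = rnorm(100), y = 2 * x - 1 + rnorm(100));
m <- brm(y ~ x, data = d, seed = 1, chains = 8, iter = 4000, warmup = 1000);
summary(m)'
# Only run where R and brms are installed, and only on request:
if [ "${RUN_FIT:-0}" = "1" ] && command -v Rscript >/dev/null 2>&1; then
  Rscript -e "$fit_code"
fi
```
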

Does that make sense? Obviously feel free to pursue exact reproducibility if you need to - we will definitely try to help, but just wanted to give some context :-)


The problem is that in the first case (desktop version), even R isn’t recognised:

mrkm_a$ singularity exec gramm-mor_v1.0.sif R -f singularity_test.R
/.singularity.d/actions/exec: 9: exec: R: not found

The same happens when I run singularity shell gramm-mor_v1.0.sif and type R. Also, much simpler R code without string quoting (like rnorm(10)) does not run there, either. So I think the issue is with the singularity image itself.

Now that I think about the issue again, if hardware such as the CPU matters for the reproducibility of Stan results, might it be natural to expect different results on different machines even with the same Docker image? I somehow thought the same Docker image should always lead to identical Stan results, but I may be wrong.

Thank you for your suggestions. I think I now agree that I should perhaps not aim at exact computational reproducibility but be satisfied with replicability of the inferences drawn from the analysis. Am I right in understanding that, because hardware matters, the same Docker image may lead to different results on different machines? If so, I wonder why I got identical results on my local machine (OS X) and an AWS instance (RHEL). It seemed too good to be pure coincidence, which is why I thought the difference between my local machine and the HPC had to come from the difference in container types (i.e., Docker vs singularity). Could it be that the relevant hardware was simply similar between my local machine and the AWS instance?


It seems like the image translation didn't handle the paths in the Docker image correctly, which is surprising. I have had success building Docker images locally and running them with Singularity on HPC sites before. You may also consider building your software as a Singularity container directly, since Singularity has its own image format.

If you are looking for exact reproducibility, there would be fewer variables to think about with a virtual machine. Vagrant makes producing a virtual machine about as easy as producing a Docker container.

But you should also consider doing a series of runs with increasing iterations and checking for convergence of the results across platforms (local Docker vs remote HPC), i.e. the 95% CI overlap should asymptotically approach 100%. (Containerizing can still help manage software versions here.)

I honestly don’t know: I am not an expert on this and I know only a little more than what is at the link I shared. I guess that the most likely source of variability is compiler/compiler settings and the C runtime used - most machines would have very similar Intel hardware.

I also heard that in gaming, some multiplayer implementations do “lockstep”, i.e. the game physics etc. are computed in parallel on all players' machines and should behave exactly the same. This is considered challenging within a platform and almost impossible across platforms. I've heard of a team that tried to make it happen, and they reportedly had to roll their own implementation of geometric primitives and maybe even sqrt…

The C runtime and math libraries are usually the root of the problem, but Docker and Singularity images come with their own, so this would explain why AWS and local Docker agree, but not the difference with singularity.

What version of singularity are you using?

That is one option I'm thinking about as well, although no singularity image of R appears to be as well developed as rocker, and it would perhaps take a fair amount of work to build a singularity image comparable to my current Docker image.

Thank you for this suggestion. I agree that these are the things I should think about once I give up exact reproducibility.

The version of singularity appears to be

  • 3.6.1 on HPC,
  • 3.3.0 for the desktop version, and
  • 2.4 for the Vagrant version.

This might explain why I could at least run my converted Docker image on HPC but not locally.

You might consider building singularity at the same version across these systems, to help reduce variability. Version 3 is a single Go binary, IIUC, which you should be able to copy to another system and use without admin rights.

I think exact reproducibility is worth thinking about, in addition to other forms mentioned before, but it’s just a little harder.
