[Note on cross-posting: I asked the same question on Stack Overflow and was advised to ask it here. The question may be more about computational reproducibility through containerization than about Stan per se. Please let me know if the topic is not suitable for this forum.]
I have recently started using Docker to ensure the computational reproducibility of my research using Stan (often via brms). Since the HPC service at my institution only supports Singularity, I want to import a Docker image into Singularity when I run part of my analysis on the HPC cluster. When I did this, however, I found that the results from the original Docker image differ from those from the same image imported into Singularity.
Here is what I did to fit a simple Bayesian regression model directly in the Docker image. I ran this both locally and on an AWS instance, and the output was identical (as expected).
docker pull akiramurakami/gramm-mor:v1.0
docker run -it akiramurakami/gramm-mor:v1.0 bash
Rscript -e 'library("brms"); library("tidyverse"); set.seed(1); d <- tibble(x = rnorm(100), y = 2 * x - 1 + rnorm(100)); m <- brm(y ~ x, data = d, seed = 1); summary(m)'
Below is part of the output.
Population-Level Effects:
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept    -1.04      0.10    -1.23    -0.85 1.00     3812     2469
x             2.00      0.11     1.79     2.21 1.00     4625     3037
Here is what I did on the HPC cluster, using Singularity.
singularity pull docker://akiramurakami/gramm-mor:v1.0
singularity exec gramm-mor_v1.0.sif Rscript -e 'library("brms"); library("tidyverse"); set.seed(1); d <- tibble(x = rnorm(100), y = 2 * x - 1 + rnorm(100)); m <- brm(y ~ x, data = d, seed = 1); summary(m)'
And the results are different (see the Bulk_ESS and Tail_ESS columns; the interval bounds also differ slightly in the second decimal place).
Population-Level Effects:
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept    -1.04      0.10    -1.23    -0.84 1.00     3798     2826
x             2.00      0.11     1.78     2.22 1.00     4275     2913
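To make the comparison less dependent on reading the two tables by eye, the summaries could be saved to files and diffed. The following is only a minimal sketch: `compare_summaries` and the file names `docker_summary.txt` and `singularity_summary.txt` are hypothetical names chosen for illustration, and it assumes each run's standard output has been redirected to a file.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Docker, Singularity, or brms):
# reports whether two saved summary files are byte-for-byte identical.
# Prints "identical", or "different" followed by a unified diff.
compare_summaries() {
  local first="$1" second="$2"
  if cmp -s "$first" "$second"; then
    echo "identical"
  else
    echo "different"
    # diff exits non-zero when files differ; that is expected here.
    diff -u "$first" "$second" || true
    return 1
  fi
}

# Assumed usage, with '...' standing for the same R one-liner as above:
#   docker run --rm akiramurakami/gramm-mor:v1.0 Rscript -e '...' > docker_summary.txt
#   singularity exec gramm-mor_v1.0.sif Rscript -e '...' > singularity_summary.txt
#   compare_summaries docker_summary.txt singularity_summary.txt
```

A byte-level comparison is deliberately strict: any difference in the printed estimates, intervals, or ESS values shows up in the diff.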
Why is this, and is there a way to import and use a Docker image in Singularity so that it yields the same results as the original Docker image?
Below is the Dockerfile used.
FROM rocker/r-ver:3.6.3
LABEL "maintainer"="xxx"
RUN apt-get update -qq && apt-get -y --no-install-recommends install \
file \
git \
libapparmor1 \
libclang-dev \
libcurl4-openssl-dev \
libedit2 \
libssl-dev \
lsb-release \
multiarch-support \
psmisc \
procps \
python-setuptools \
sudo \
wget \
libxml2-dev \
libcairo2-dev \
libsqlite-dev \
libmariadbd-dev \
libmariadbclient-dev \
libpq-dev \
libssh2-1-dev \
unixodbc-dev \
libsasl2-dev \
clang
# https://github.com/stan-dev/rstan/wiki/Installing-RStan-on-Linux
RUN Rscript -e 'dotR <- file.path(Sys.getenv("HOME"), ".R"); \
if (!file.exists(dotR)) dir.create(dotR); \
M <- file.path(dotR, "Makevars"); \
if (!file.exists(M)) file.create(M); \
cat("\nCXX14FLAGS=-O3 -march=native -mtune=native -fPIC","CXX14=clang++",file = M, sep = "\n", append = TRUE)'
RUN Rscript -e 'options(repos = list(CRAN = "http://mran.revolutionanalytics.com/snapshot/2020-07-01")); \
install.packages(c("brms", "data.table", "devtools", "SnowballC", "tidyverse", "dplyr"))'