Separating compilation & sampling with brms on cluster

wpetry · May 21, 2019, 5:45pm

I want to fit a brms model on a Linux cluster. The system I am working on has a C++ compiler only on the head node, not on the compute nodes (this seems non-negotiable). I suspect that separating compilation (on the head node) from sampling (on a compute node) ought to be possible, based on a previous RStan post on this forum. I think I need to compile the model and write out an intermediate file on the head node, then load that file and sample on the compute node. Furthermore, I suspect this “intermediate file” is the dynamic shared object (DSO) referred to in help("brm").

How do I compile the model without starting sampling, write out the DSO, then load it back into an R session on another node?

Attempts towards an answer:
Here’s a sample model based on the brms docs that I am using as a testbed:

library(brms)

# Simulated data with heterogeneous variances (from the brms docs)
data_het <- data.frame(
  y = c(rnorm(500), rnorm(500, 1, 2), rnorm(500, 50, 50)),
  x = factor(rep(c("a", "b", "c"), each = 500))
)

# Fit a quantile regression (asymmetric Laplace) for the 0.25 quantile
fit <- brm(bf(y ~ x, quantile = 0.25), data = data_het,
           family = asym_laplace(), chains = 8)

I’ve tried to prevent sampling by setting iter = 0, warmup = 0. Doing so throws an error, but does seem to have compiled without sampling.

> fit
 Family: asym_laplace 
  Links: mu = identity; sigma = identity; quantile = identity 
Formula: y ~ x 
         quantile = 0.25
   Data: data_het (Number of observations: 1500) 

The model does not contain posterior samples.

Operating System: Springdale Linux 7.6 (Verona)
brms Version: 2.8.0

Rehab · May 21, 2019, 9:17pm

Hi wpetry

I haven’t tried the brms package before.

But, I think you do not need to include your data and the fitting statement in your compilation file. The compilation file should only include the specification of your model.

Then use a second R file that first loads your compiled file and then contains your data and the fitting statement.
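In plain RStan terms, the two-file idea might look roughly like the sketch below. The helper names compile_stan() and sample_stan() and the file name "model.rds" are made up for illustration; stan_model(), sampling(), saveRDS() and readRDS() are the existing functions. It assumes both nodes see the same file system and run the same R and rstan versions.

```r
# Sketch only: helper names and the default path are hypothetical.

# File 1 (head node): compile the Stan program, do not sample.
compile_stan <- function(model_code, path = "model.rds") {
  # save_dso = TRUE (the default) keeps the compiled shared object
  # inside the returned stanmodel so it can be reused later.
  model <- rstan::stan_model(model_code = model_code, save_dso = TRUE)
  saveRDS(model, file = path)
  invisible(path)
}

# File 2 (compute node): load the compiled model, then supply data and sample.
sample_stan <- function(path = "model.rds", data, ...) {
  model <- readRDS(path)
  rstan::sampling(model, data = data, ...)
}
```

Note that this works on a raw Stan program, not on a brms formula, so it is not a drop-in answer for brm().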

Hope this helps.

paul.buerkner · May 21, 2019, 9:24pm

First prepare the model as follows:

fit_empty <- brm(..., chains = 0)

Then, you save fit_empty and load it on the compute nodes. There, you call

fit <- update(fit_empty, recompile = FALSE, ...)

where ... contains sampling arguments such as chains etc.
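Put together with saving and loading, the two steps could be sketched as follows. The helper names compile_only() and sample_compiled() and the file name "fit_empty.rds" are hypothetical; brm(), update(), saveRDS() and readRDS() are the real calls. The sketch assumes the head and compute nodes share a file system and have matching R, brms and rstan installations.

```r
# Sketch only: helper names and the default path are hypothetical.

# Step 1 (head node, compiler available): compile without sampling.
compile_only <- function(formula, data, family, path = "fit_empty.rds") {
  # chains = 0 compiles the model but draws no posterior samples.
  fit_empty <- brms::brm(formula, data = data, family = family, chains = 0)
  saveRDS(fit_empty, file = path)
  invisible(path)
}

# Step 2 (compute node, no compiler): reload and sample without recompiling.
sample_compiled <- function(path = "fit_empty.rds", ...) {
  # Load brms so that update() dispatches to the brmsfit method.
  loadNamespace("brms")
  fit_empty <- readRDS(path)
  # ... carries sampling arguments such as chains, iter, cores.
  update(fit_empty, recompile = FALSE, ...)
}
```

With the testbed model from the first post, the head node would run compile_only(brms::bf(y ~ x, quantile = 0.25), data_het, brms::asym_laplace()) and the compute node would run fit <- sample_compiled("fit_empty.rds", chains = 8).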

wpetry · May 22, 2019, 2:30am

Thanks @paul.buerkner, this worked beautifully. In retrospect, chains = 0 makes more sense to avoid sampling than iter = 0.

I wonder if adding something to the reference manual entry for brm would be useful for others. Perhaps under the chains argument adding something like: “Setting chains = 0 will compile the model without sampling, which may be useful on systems where compilation and sampling must be separated” and adding under iter “Must be a positive integer. See chains for compiling model without sampling.”
