Run multiple Stan models in parallel

I am “lucky” to have access to a remote RStudio server with 80 cores, and I would like to use it to run 5 Stan models in parallel (each model runs 4 chains, so I would use 20 cores). I have tried the following:

library(parallel)
library(rstan)

# One task per model; each call asks for 4 cores, one per chain
tasks <- list(
  fit1 = function() sampling(model1, data = standata, cores = 4),
  fit2 = function() sampling(model2, data = standata, cores = 4),
  fit3 = function() sampling(model3, data = standata, cores = 4),
  fit4 = function() sampling(model4, data = standata, cores = 4),
  fit5 = function() sampling(model5, data = standata, cores = 4)
)

out <- mclapply(
  tasks,
  function(f) f(),
  mc.cores = 20
)

When I call mclapply, model1 is sampled properly, but I get errors for the other models:

Warning message:
In mclapply(tasks, function(f) f(), mc.cores = 20) :
  scheduled cores 3, 2, 4, 5 encountered errors in user code, all values of the jobs will be affected
Error in socketConnection(master, port = port, blocking = TRUE, open = "a+b",  : 
  impossible to open the connection
Calls: <Anonymous> ... tryCatchOne -> doTryCatch -> recvData -> makeSOCKmaster
In addition: There were 17 warnings (use warnings() to see them)
Stopped execution
Error in socketConnection(master, port = port, blocking = TRUE, open = "a+b",  : 
  impossible to open the connection
Calls: <Anonymous> ... tryCatchOne -> doTryCatch -> recvData -> makeSOCKmaster
In addition: There were 17 warnings (use warnings() to see them)
Stopped execution
Error in socketConnection(master, port = port, blocking = TRUE, open = "a+b",  : 
  impossible to open the connection
Calls: <Anonymous> ... tryCatchOne -> doTryCatch -> recvData -> makeSOCKmaster
In addition: There were 17 warnings (use warnings() to see them)
Stopped execution
Error in socketConnection(master, port = port, blocking = TRUE, open = "a+b",  : 
  impossible to open the connection
Calls: <Anonymous> ... tryCatchOne -> doTryCatch -> recvData -> makeSOCKmaster
In addition: There were 17 warnings (use warnings() to see them)
Stopped execution

Does anybody know how I can fix this?

Thank you in advance!

Why don’t you run 5 R sessions independently?

As far as I know, this isn’t possible with rstan. Let me take a look at what it would take to permit it with ezStan; I think it should just be a matter of giving each stan_temp folder a unique name…

Oh, though Andre’s suggestion of simply running separate R sessions should certainly work if you can do that.

Edit: I spoke prematurely; this is possible in rstan with the tweak suggested by Charles below.

You need to initialise the stanfit objects on each worker, with a call along these lines:

sf<-rstan::sampling(model,iter=0,chains=0,init=0,data=data,check_data=FALSE,
        control=list(max_treedepth=0),save_warmup=FALSE,test_grad=FALSE)

Thank you very much all for your help and suggestions.

@andre.pfeuffer, I cannot (unfortunately) open several R sessions on the remote server.

@mike-lawrence and @Charles_Driver, I tried to initialize the stanfit objects before sampling, but it did not help (it may be that I did not call sampling the right way…)

Finally, I tried parLapply instead of mclapply, and it worked!
For readers interested in running several Stan models in parallel on the same dataset, here is the code:

library(parallel)

# Fit the x-th model. model1..model5 and standata are found on the worker,
# where clusterExport() below has copied them.
samplingfunction <- function(x) {
  models <- list(model1, model2, model3, model4, model5)
  rstan::sampling(models[[x]], data = standata, cores = 4)
}

cl <- makeCluster(20)  # with 5 tasks, at most 5 of these workers fit a model
clusterExport(cl, c('model1', 'model2', 'model3', 'model4', 'model5', 'standata'))
out <- parLapply(cl, 1:5, samplingfunction)
stopCluster(cl)  # shut the workers down once the fits are back
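
If you want one failing model not to take the others down with it, the same parLapply pattern can be wrapped in a small helper. This is only a sketch under my own assumptions: safe_call, run_each, and fit_model are hypothetical names, not part of rstan or parallel. It starts one worker per item and passes everything as arguments, so no clusterExport is needed.

```r
library(parallel)

# Hypothetical helper: call fit_fun(item, ...) and return the error object
# instead of stopping, so the caller can see which items failed.
safe_call <- function(item, fit_fun, ...) {
  tryCatch(fit_fun(item, ...), error = function(e) e)
}

# Hypothetical helper: one PSOCK worker per item; extra arguments in `...`
# are sent to the workers and forwarded to fit_fun.
run_each <- function(items, fit_fun, ...) {
  cl <- makeCluster(length(items))
  on.exit(stopCluster(cl))  # always shut the workers down
  parLapply(cl, items, safe_call, fit_fun = fit_fun, ...)
}

# Usage with rstan, assuming model1..model5 and standata exist as above:
# fit_model <- function(m, ...) rstan::sampling(m, ...)
# fits <- run_each(list(model1, model2, model3, model4, model5),
#                  fit_model, data = standata, cores = 4)
# failed <- vapply(fits, inherits, logical(1), what = "error")
```

Each element of the result is then either a fit or an error object, so the models that failed can be rerun on their own.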
