Parallelising chains in cmdstan batch script [Linux]

I am running a model estimation with CmdStan on a Linux machine. I submitted a batch script and the first chain started to run. However, according to the output file, the second chain is not running. I am wondering whether I have to adjust the number of nodes or the number of tasks per node in the batch script. So far, I have only increased the number of CPUs per task to 2, because I want to run 2 chains in parallel.

Here is my script:

#!/bin/bash -l

#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=2
#SBATCH --mem=8GB
#SBATCH --time=336:00:00
#SBATCH --output=%x-%j.out

module load gnu/7.4.0

cd $HOME/cmdstan-2.25.0/examples/bernoulli
for i in {1..2}
do
./bernoulli sample algorithm=hmc engine=nuts max_depth=10 num_samples=2000 data file=bernoulli.data.R
output file=output_${i}.csv &
done

What do I need to adjust to make the chains run in parallel?

  • Your output argument seems to be on a separate line, which Bash will interpret as a second command in the body of your for loop. If that’s happening, the & applies only to that second line, so each ./bernoulli call runs in the foreground, one after the other, and CmdStan writes the results of the second chain to the same default file name (output.csv) as the first.
  • You will need a wait on the line after done, so that Bash does not exit while your background jobs are still running.

Not important, but you can write the time limit in days, e.g. --time=14-0, and you could use for i in $(seq $SLURM_CPUS_PER_TASK) to avoid hard-coding the number of chains to run.
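Putting those points together, the loop could look roughly like the sketch below. To keep it runnable without CmdStan, run_chain is a hypothetical stand-in that just sleeps and writes a file; in the real job script its body would be the ./bernoulli command, with output file=output_${i}.csv kept on the same logical line (or joined with a trailing backslash). The fallback of 2 chains when SLURM_CPUS_PER_TASK is unset is also an assumption, added only so the sketch works outside SLURM.

```shell
#!/bin/bash

# Hypothetical stand-in for the CmdStan call, so the sketch runs anywhere.
# In the real script this would be something like:
#   ./bernoulli sample ... data file=bernoulli.data.R \
#       output file=output_${1}.csv
run_chain() {
    sleep 1
    echo "chain $1 finished" > "output_$1.csv"
}

# Use the CPU count SLURM allocated; fall back to 2 outside SLURM (assumption).
n_chains="${SLURM_CPUS_PER_TASK:-2}"

for i in $(seq "$n_chains"); do
    # The & must end the same logical line as the command it backgrounds.
    run_chain "$i" &
done

# Block until every background chain has exited; without this the
# batch script ends right after the loop.
wait

ls output_*.csv
```

Each chain gets its own output file, and the script only returns once all of them have finished.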

Now it works, thank you so much!
