Hi Alex,
I looked at your code. If you supply a unique chain_id to your predict
function and pass it along (as the chain_ids argument) to your call to CmdStanPy's sample
method here: tablespoon/forecasters.py at 61d23c0f44e418e9b840a9356be833f2f992bfa4 · alexhallam/tablespoon · GitHub,
then the resulting CSV filenames will be unique.
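To make that concrete, here is a minimal sketch of threading a chain ID through to sample. The helper name sample_one_chain and its parameters are hypothetical and not taken from tablespoon; model is assumed to be a CmdStanModel, and I'm assuming a CmdStanPy version whose sample method accepts data, chains, chain_ids, seed and output_dir as keyword arguments.

```python
def sample_one_chain(model, data, chain_id, seed, output_dir):
    """Run a single chain whose ID is chosen by the caller.

    Hypothetical helper: `model` is assumed to be a cmdstanpy.CmdStanModel.
    """
    # As far as I recall, CmdStanPy rejects chain IDs below 1, so fail early.
    if chain_id < 1:
        raise ValueError(f"chain_id must be >= 1, got {chain_id}")
    # chains=1 plus a caller-chosen chain ID: concurrent runs that share
    # output_dir then differ in the chain ID used for the CSV name.
    return model.sample(
        data=data,
        chains=1,
        chain_ids=chain_id,
        seed=seed,
        output_dir=output_dir,
    )
```

Your predict function could then take chain_id as one more argument and forward it to a helper like this.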
If this helps, below is something I wrote a long time ago when I was playing around with a cluster running SLURM. This is a proof-of-concept demo of how to compile a Stan program once on the head node and then run 100 chains on the cluster, saving all CSV output files in a shared directory. The key is that each run has a unique chain ID, so the output files don't get clobbered.
slurm script - assumes that compiled model exe file exists
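#!/bin/bash
# distribute N chains, M chains per node on the cluster
# set up all slurm directives
# common args:
# - cmdstan_path, model_exe, seed, output_dir, data_path
# unique arg: chain_id - job array index (%a in file names, $SLURM_ARRAY_TASK_ID in the script)
#SBATCH --job-name=cmdstanpy_runs
#SBATCH --output=cmdstanpy_stdout-%j-%a.out
#SBATCH --error=cmdstanpy_stderr-%j-%a.err
#SBATCH --nodes=20
#SBATCH --cpus-per-task=1
# array indices 1-100: 100 chains, and chain ids have to be positive
#SBATCH -a 1-100
# the shell variables below are placeholders for the common args; set them before submitting
python run_chain.py "$CMDSTAN_PATH" "$MODEL_EXE" "$SEED" "$SLURM_ARRAY_TASK_ID" "$OUTPUT_DIR" "$DATA_PATH"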
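Inside each array task, Slurm also exports the array index as the SLURM_ARRAY_TASK_ID environment variable, so a Python entry point can read the chain ID directly instead of taking it on the command line. A small sketch of that; the function name chain_id_from_slurm is made up for illustration.

```python
import os


def chain_id_from_slurm(environ=os.environ):
    """Return this task's Slurm array index for use as a chain ID.

    Hypothetical helper; SLURM_ARRAY_TASK_ID is set by Slurm in array jobs.
    """
    raw = environ.get("SLURM_ARRAY_TASK_ID")
    if raw is None:
        raise RuntimeError("SLURM_ARRAY_TASK_ID is not set; is this a job array?")
    chain_id = int(raw)
    # Chain IDs are expected to be positive, so index 0 is not usable.
    if chain_id < 1:
        raise ValueError(f"array index {chain_id} cannot be used as a chain ID")
    return chain_id
```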
program run_chain.py
# Use CmdStanPy to run one chain
# Required args:
# - cmdstan_path
# - model_exe
# - seed
# - chain_id
# - output_dir
# Optional arg:
# - data_path
import sys

from cmdstanpy import CmdStanModel, cmdstan_path, set_cmdstan_path

usage = """\
run_chain.py <cmdstan_path> <model_exe> <seed> <chain_id> <output_dir> (<data_path>)\
"""


def main():
    # argv[0] is the script name, so the five required args need len(sys.argv) >= 6
    if len(sys.argv) < 6:
        print("missing arguments")
        print(usage)
        sys.exit(1)
    a_cmdstan_path = sys.argv[1]
    a_model_exe = sys.argv[2]
    a_seed = int(sys.argv[3])
    a_chain_id = int(sys.argv[4])
    a_output_dir = sys.argv[5]
    a_data_file = None
    if len(sys.argv) > 6:
        a_data_file = sys.argv[6]

    set_cmdstan_path(a_cmdstan_path)
    print(cmdstan_path())

    # exe_file points at the already compiled model, so nothing is recompiled here
    mod = CmdStanModel(exe_file=a_model_exe)
    # one chain per process; the unique chain id keeps the output CSV names distinct
    # data takes the path to the data file (or None if the model needs no data)
    mod.sample(chains=1, chain_ids=a_chain_id, seed=a_seed, output_dir=a_output_dir, data=a_data_file)


if __name__ == "__main__":
    main()
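If you adapt run_chain.py, argparse can take over the argument checking and the usage message. A sketch with the same six arguments in the same order; the function name parse_args is my own and not part of the script above.

```python
import argparse


def parse_args(argv=None):
    """Parse the run_chain.py arguments; data_path is optional."""
    parser = argparse.ArgumentParser(
        prog="run_chain.py", description="Use CmdStanPy to run one chain"
    )
    parser.add_argument("cmdstan_path")
    parser.add_argument("model_exe")
    parser.add_argument("seed", type=int)
    parser.add_argument("chain_id", type=int)
    parser.add_argument("output_dir")
    # nargs="?" makes the trailing positional optional (None when omitted).
    parser.add_argument("data_path", nargs="?", default=None)
    return parser.parse_args(argv)
```

Calling args = parse_args() then gives args.seed, args.chain_id and so on, and a missing or non-integer argument exits with a usage message.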