Cmdstanpy ValueError with multiprocess

mitzimorris · December 3, 2021, 2:13pm

hi Alex,

I looked at your code. If you pass a unique chain_id into your predict function and forward it to your call to the CmdStanPy sample method here: tablespoon/forecasters.py at 61d23c0f44e418e9b840a9356be833f2f992bfa4 · alexhallam/tablespoon · GitHub,
then the resulting CSV filenames will be unique.
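
To make that concrete, here is a minimal sketch of giving each worker process its own chain id. The helper names (make_jobs, run_one, run_all) and the job dictionary layout are made up for illustration and are not part of tablespoon or CmdStanPy; only CmdStanModel, its sample method, and the runset.csv_files property come from CmdStanPy. It assumes the model has already been compiled to an executable.

```python
from multiprocessing import Pool


def make_jobs(exe_file, n_chains, seed, output_dir, data=None):
    """Build one job spec per chain, with chain ids 1..n_chains (all unique)."""
    return [
        {
            "exe_file": exe_file,
            "chain_id": chain_id,
            "seed": seed,
            "output_dir": output_dir,
            "data": data,
        }
        for chain_id in range(1, n_chains + 1)
    ]


def run_one(job):
    """Run a single chain in a worker process and return its CSV file paths."""
    # Imported inside the worker so that building job specs does not need CmdStanPy.
    from cmdstanpy import CmdStanModel

    model = CmdStanModel(exe_file=job["exe_file"])
    fit = model.sample(
        chains=1,
        chain_ids=job["chain_id"],  # the unique id that keeps filenames distinct
        seed=job["seed"],
        output_dir=job["output_dir"],
        data=job["data"],
    )
    return fit.runset.csv_files


def run_all(jobs, processes=4):
    """Run every job in a process pool; returns one list of CSV paths per job."""
    with Pool(processes=processes) as pool:
        return pool.map(run_one, jobs)
```

To use it, call something like run_all(make_jobs("./my_model", 4, 12345, "./output")) from under an `if __name__ == "__main__":` guard; the paths here are placeholders.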

if this helps, below is something I wrote a long time ago when I was playing around with a cluster running SLURM. this is a proof-of-concept demo of how to compile a Stan program once on the head node and then run 100 chains on the cluster, saving all CSV output files in a shared directory. the key is that each run has a unique chain id, thus the output files don’t get clobbered.

slurm script - assumes that compiled model exe file exists

#!/bin/bash
# distribute N chains, M chains per node on the cluster

# set up all slurm directives
# common args:
#  - cmdstan_path, model_exe, seed, output_dir, data_path
# unique arg:  chain_id  - job array index:  %a in filenames, $SLURM_ARRAY_TASK_ID in the script

#SBATCH --job-name=cmdstanpy_runs
#SBATCH --output=cmdstanpy_stdout-%j-%a.out
#SBATCH --error=cmdstanpy_stderr-%j-%a.err
#SBATCH --nodes=20
#SBATCH --cpus-per-task=1
#SBATCH -a 1-100
python run_chain.py cmdstan_path model_exe seed "$SLURM_ARRAY_TASK_ID" output_dir data_path
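
Inside a job array, SLURM sets the SLURM_ARRAY_TASK_ID environment variable for each task to that task's array index. As a rough sketch, a helper like the one below could turn that index into the chain id on the Python side. The name chain_id_from_env and its offset parameter are hypothetical, not part of SLURM or CmdStanPy, and the check assumes CmdStanPy wants chain ids of 1 or greater.

```python
import os


def chain_id_from_env(env=None, offset=0):
    """Derive a positive chain id from SLURM_ARRAY_TASK_ID.

    env    -- mapping to read from (defaults to os.environ)
    offset -- added to the array index, e.g. 1 for arrays that start at 0
    """
    env = os.environ if env is None else env
    raw = env.get("SLURM_ARRAY_TASK_ID")
    if raw is None:
        raise RuntimeError("SLURM_ARRAY_TASK_ID is not set; is this a job array task?")
    chain_id = int(raw) + offset
    if chain_id < 1:
        raise ValueError(f"chain id must be a positive integer, got {chain_id}")
    return chain_id
```

With an array declared as 0-99, calling chain_id_from_env(offset=1) would give chain ids 1 through 100.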

program run_chain.py

# Use CmdStanPy to run one chain
# Required args:
# - cmdstanpath
# - model_exe
# - seed
# - chain_id
# - output_dir
# - data_file

import os
import sys

from cmdstanpy import CmdStanModel, cmdstan_path, set_cmdstan_path

usage = """\
run_chain.py <cmdstan_path> <model_exe> <seed> <chain_id> <output_dir> (<data_path>)\
"""

def main():
    # argv[0] is the script name, so five required args means len(sys.argv) >= 6
    if len(sys.argv) < 6:
        print("missing arguments")
        print(usage)
        sys.exit(1)
    a_cmdstan_path = sys.argv[1]
    a_model_exe = sys.argv[2]
    a_seed = int(sys.argv[3])
    a_chain_id = int(sys.argv[4])
    a_output_dir = sys.argv[5]
    a_data_file = None
    if (len(sys.argv) > 6):
        a_data_file = sys.argv[6]

    set_cmdstan_path(a_cmdstan_path)
    print(cmdstan_path())

    mod = CmdStanModel(exe_file=a_model_exe)
    mod.sample(chains=1, chain_ids=a_chain_id, seed=a_seed, output_dir=a_output_dir, data=a_data_file)

if __name__ == "__main__":
    main()
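
Once all the runs have finished, the per-chain CSV files in the shared directory can be gathered into one fit object. The sketch below assumes a CmdStanPy version that provides from_csv, and that the output directory holds only the sampler CSV files from these runs. The helper names find_chain_csvs and load_fit are made up for illustration, and this has not been tested against the cluster setup above.

```python
import glob
import os


def find_chain_csvs(output_dir):
    """Return the sorted list of CSV files found directly in output_dir."""
    return sorted(glob.glob(os.path.join(output_dir, "*.csv")))


def load_fit(output_dir):
    """Assemble every chain's CSV file in output_dir into a single fit object."""
    # Imported here so that listing files does not require CmdStanPy.
    from cmdstanpy import from_csv

    csv_files = find_chain_csvs(output_dir)
    if not csv_files:
        raise FileNotFoundError(f"no CSV files found in {output_dir}")
    # method="sample" tells from_csv to expect output from the NUTS-HMC sampler.
    return from_csv(path=csv_files, method="sample")
```

Since every run wrote a file with a different chain id, find_chain_csvs should return one path per chain, which is a quick way to check that nothing was overwritten.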
