Hi all,
I've been trying to run a Stan model using cmdstanpy (it's my first time using cmdstanpy) with 4 chains on 16 cores. I want my chains to be executed in parallel (`cpp_options={'STAN_THREADS': 'TRUE'}` is specified, and this works). Ideally, I want each chain to use 4 cores, but when I specify `threads_per_chain=4` in `sample()` and run my code, it only uses 4 cores in total (checked in Activity Monitor on Mac and with htop on Linux), when I would want it to use all 16. I have tried setting `os.environ["STAN_NUM_THREADS"] = str(16)`, but this does not seem to have any effect.
Currently, my code looks like this:
from cmdstanpy import CmdStanModel

nc = 4        # number of chains
threads = 4   # threads requested per chain
model = CmdStanModel(stan_file=stan_code, cpp_options={'STAN_THREADS': 'TRUE'}, force_compile=True)
fit = model.sample(data=d, threads_per_chain=threads, chains=nc, iter_warmup=nw,
                   iter_sampling=ni, seed=305, thin=thin)
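
To make the target explicit, the layout I am after (4 chains running at once, 4 threads each, 16 cores in total) can be sketched as below. This is only an illustration: `thread_layout` is a hypothetical helper, not a cmdstanpy function, while `chains`, `parallel_chains`, and `threads_per_chain` are keyword arguments that `CmdStanModel.sample()` accepts. It computes the numbers to pass and does not by itself explain why only 4 cores are busy.

```python
def thread_layout(total_cores: int, chains: int) -> dict:
    """Split a core budget evenly across chains.

    Hypothetical helper for illustration; not part of cmdstanpy.
    """
    if chains < 1 or total_cores < chains:
        raise ValueError("need at least one core per chain")
    return {
        "chains": chains,
        "parallel_chains": chains,                 # run every chain at the same time
        "threads_per_chain": total_cores // chains,  # within-chain threads
    }


layout = thread_layout(total_cores=16, chains=4)
total_threads = layout["parallel_chains"] * layout["threads_per_chain"]
print(layout, "total threads requested:", total_threads)

# Assumed usage (commented out: it needs CmdStan plus the model and data above):
# fit = model.sample(data=d, seed=305, **layout)
```

One thing that may matter here: the cmdstanpy documentation describes `threads_per_chain` as applying to parallelized sections within a chain, such as `reduce_sum()` or `map_rect()` calls, so whether the extra threads are actually used may depend on the Stan program itself.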
- Operating System: macOS Ventura 13.5.1 and Linux 4.18.0-477.27.1.el8_8.x86_64 x86_64
- CmdStan Version: cmdstanpy 1.2.0, cmdstan 2.31.0
Any help would be appreciated.