Running Stan models in parallel in CmdStanPy

jbaranowski · August 31, 2021, 8:32am

I’m wondering how to use parallelism when running Stan models in CmdStanPy.
I’m doing simulation-based calibration for my model, so I need to run the fit many times.
When I tried using the multiprocessing module

I got slower results than when using a simple for loop:

The function I’m calling is

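import numpy as np

def compute_ranks(i):
    # sbc_model and df are defined elsewhere; i is only the run index
    result_sbc = sbc_model.sample(data={'N_batch': 4, 'N': 200, 'batch': df.batch.values})
    # Keep every 8th of the 4000 draws (500 in total), then sum lt_sim over them
    ranks = np.sum(result_sbc.stan_variable('lt_sim')[np.arange(0, 4000 - 7, 8)], axis=0)
    return ranks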

I’m not giving the model code, as I’m not sure it is relevant in this case.

Operating System: macOS Big Sur 11.3 / Mac mini M1, 16 GB RAM
Interface Version: cmdstanpy 0.9.76
Compiler/Toolkit: Xcode

WardBrian · August 31, 2021, 1:42pm

I’ll openly admit my Python multiprocessing experience is shaky, but my guess is this may be slowed down by sharing memory inside your dataframe.

Just to spitball, what is the result if you do 'batch': df.batch.to_numpy(copy=True)?

jbaranowski · August 31, 2021, 3:00pm

No improvement.
EDIT: OK, it does not even work, as multiprocessing gets stuck immediately.

ahartikainen · August 31, 2021, 4:15pm

Try running in a thread pool. (Also, how many CPUs per chain do you have, and how much RAM?)
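
A minimal sketch of the thread pool suggestion, using only the standard library. CmdStanPy runs each chain as a separate CmdStan process, so Python threads should be enough to keep several fits going at once. The helper name run_in_threads and the stand-in fake_fit are hypothetical and not part of CmdStanPy; in the real script, compute_ranks from the first post would be passed in place of fake_fit.

```python
from concurrent.futures import ThreadPoolExecutor


def run_in_threads(fit_fn, n_runs, max_workers=2):
    """Call fit_fn(i) for i in range(n_runs) on a thread pool.

    pool.map returns results in input order, so result k belongs to run k.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fit_fn, range(n_runs)))


def fake_fit(i):
    # Hypothetical stand-in for compute_ranks(i), so the sketch runs without
    # CmdStan installed. A real fit would call sbc_model.sample(...) here.
    return i * i


if __name__ == "__main__":
    all_ranks = run_in_threads(fake_fit, n_runs=5, max_workers=2)
    print(all_ranks)
```

Each sample call can itself run several chains in parallel, so it is probably worth keeping max_workers times the number of parallel chains at or below the number of CPU cores; otherwise the fits compete for cores and the total time can get worse rather than better.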
