Hi,
I have what is probably a very naïve question: when we use pathfinder and there are as many CPUs available as num_paths, does each path run in parallel? How should num_threads be set to optimize speed?
Sorry for the simple question and thank you in advance.
If you have enough cores available on your system such that each thread can run simultaneously, and num_threads is at least num_paths (more if you also want to use something like reduce_sum, which also uses threads), all paths will run in parallel.
If either condition isn’t met, some paths will wait for others to finish. For example, if you run with num_paths=4 num_threads=2 and observe the output, you will see two paths run to completion before the next two paths start.
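To make the batching concrete, here is a small toy model in Python. It is only a sketch, not how Stan schedules work internally: the function name `schedule_paths` is made up for illustration, and it assumes every path takes the same amount of time and that the machine has at least num_threads free cores.

```python
def schedule_paths(num_paths, num_threads):
    """Toy model: group path IDs into batches that run at the same time.

    Assumes equal run time per path and enough free cores for every thread.
    This is an illustration only, not Stan's actual scheduler.
    """
    # No more paths can run at once than there are threads (or paths).
    workers = max(1, min(num_paths, num_threads))
    paths = list(range(1, num_paths + 1))
    # Slice the path IDs into consecutive batches of size `workers`.
    return [paths[i:i + workers] for i in range(0, num_paths, workers)]


print(schedule_paths(4, 2))  # [[1, 2], [3, 4]]
print(schedule_paths(4, 4))  # [[1, 2, 3, 4]]
```

With num_threads=2 the four paths fall into two batches, which matches the "two finish, then two more start" behaviour described above. With num_threads=4 there is a single batch, so all paths run together.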
Thank you very much for your answer, it’s very clear!
Just one more question: is this reasoning valid for the HMC sampler as well, with num_chains in place of num_paths?