Can you specify Multiple Threads with Optimize (reduce_sum)?

I have Stan code with reduce_sum where multithreading works great with sampling. The optimize method works with the same code, but it reports num_threads = 1. I am using cmdstanr, and I do not see any parameter to specify the number of threads. Can the optimize method use multiple threads? And if so, how do you specify that (preferably in cmdstanr)?

Hi @klattery, welcome to the Stan forums! In theory, threading should work with optimization, but unfortunately we just haven’t implemented it in the cmdstanr interface yet. This is definitely on our to-do list for cmdstanr, which is still quite new, so some things like this are still missing (there is an open issue for it).

I think it should work if you use cmdstan directly without the cmdstanr wrapper but that’s much less convenient. Hopefully we can add the necessary code to cmdstanr soon!

You can actually use it, but you have to set the environment variable yourself.

So

Sys.setenv("STAN_NUM_THREADS" = X)
mod$optimize()

And it should work provided the model was compiled with stan_threads.

This will be nicer once we close that issue as @jonah says.

Good point @rok_cesnovar!

YES!!! You made my day.

I can’t thank you guys at Stan enough for adding multithreading this year. I have a 32-core AMD Threadripper and the difference is amazing, both for sampling and now optimization (which I run for quick model testing). With 1 thread the optimization took 2 hours. I just did it in 10 minutes (same exact computer and code/data). Likewise with sampling, models that took 3 days can now be estimated in 8 hours.

Just to be clear, here is the R code for compiling the model and calling Stan’s optimize (I’m running cmdstanr on WSL).
HB_model <- cmdstan_model(file.path(dir_model, "LogDiff_SUR2.2.stan"), quiet = TRUE, cpp_options = list(stan_threads = TRUE))
Sys.setenv("STAN_NUM_THREADS" = 30)
HB_MLE <- HB_model$optimize(modifyList(data_list, data_model), init = .5, seed = 2718,
                            refresh = 5, iter = 1000)

@klattery That’s awesome! Really glad to hear you’ve been able to take advantage of the multithreading.

Just following up here to say that I just merged @rok_cesnovar’s PR to add support for threading for optimization and variational inference in cmdstanr. So you can now specify the number of threads via the threads argument to the $optimize() method, for example mod$optimize(threads = 4), instead of manually setting the STAN_NUM_THREADS environment variable.

I assume this is all analogous in the cmdstanpy universe? That is, if I compile a reduce_sum model with STAN_THREADS = True and run optimize with os.environ['STAN_NUM_THREADS'] = str(8) or something, I will get a parallel run of MLE?

I wonder if this is a good way to tune grainsize before running full MCMC?
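One way to try that idea would be to time a short optimize run at several grainsize values and compare. The sketch below is only a timing harness under stated assumptions: run_once is a hypothetical callable that would wrap an optimize call with grainsize passed in through the data, and the names time_grainsizes and fastest are made up for illustration. Whether timings from optimization carry over to MCMC is exactly the open question, so treat the result as a starting point at most.

```python
import time


def time_grainsizes(run_once, grainsizes):
    """Time run_once(grainsize) for each candidate; return {grainsize: seconds}."""
    timings = {}
    for g in grainsizes:
        start = time.perf_counter()
        run_once(g)  # hypothetical: one optimize run using this grainsize
        timings[g] = time.perf_counter() - start
    return timings


def fastest(timings):
    """Return the grainsize with the smallest wall-clock time."""
    return min(timings, key=timings.get)


# Hypothetical usage with cmdstanpy (model and data are placeholders):
# timings = time_grainsizes(
#     lambda g: model.optimize(data={**data, "grainsize": g}),
#     [1, 10, 100],
# )
# print(fastest(timings))
```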

Bumping this. @mitzimorris @WardBrian Do either of you know if this works with cmdstanpy optimize? Do I need to set this globally?

Setting the environment variable should work. We don’t expose anything for optimization that sets it.
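A minimal sketch of that approach in Python, assuming cmdstanpy is installed and the model uses reduce_sum. The helper names stan_num_threads and optimize_with_threads, the file name model.stan, and the data dict are hypothetical. The only mechanism relied on is the STAN_NUM_THREADS environment variable discussed in this thread.

```python
import os
from contextlib import contextmanager


@contextmanager
def stan_num_threads(n):
    """Temporarily set STAN_NUM_THREADS, restoring the old value on exit."""
    previous = os.environ.get("STAN_NUM_THREADS")
    os.environ["STAN_NUM_THREADS"] = str(n)  # environment values must be strings
    try:
        yield
    finally:
        if previous is None:
            os.environ.pop("STAN_NUM_THREADS", None)
        else:
            os.environ["STAN_NUM_THREADS"] = previous


def optimize_with_threads(stan_file, data, n_threads):
    """Compile with threading enabled and run optimize under n_threads.

    Assumes cmdstanpy is installed; it is imported lazily so the context
    manager above can be used without it.
    """
    from cmdstanpy import CmdStanModel

    model = CmdStanModel(stan_file=stan_file, cpp_options={"STAN_THREADS": True})
    with stan_num_threads(n_threads):
        # show_console=True echoes CmdStan's output, including num_threads
        return model.optimize(data=data, show_console=True)


# Hypothetical usage:
# mle = optimize_with_threads("model.stan", {"N": 10}, n_threads=8)
```

Scoping the variable with a context manager avoids leaving a global setting behind for later runs in the same session, which is one answer to the "do I need to set this globally?" question.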

Okay, I will try this. If I remember correctly, setting the environment variable didn’t work.

I will check again. I can also review the PR to cmdstanr to see how you called the cmdstan optimize_args API (I think the argument is threads?) and see if I can’t do something similar with a local copy of cmdstanpy.

Confirmed that this “works”, in that setting show_console=True in the optimize call for cmdstanpy prints the line

num_threads = 8 (Default)

when I set
os.environ['STAN_NUM_THREADS'] = str(8)
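To check this programmatically rather than by eye, the console output can be scanned for that line. A small sketch, assuming the output has already been captured as a string (how it is captured is left open here). parse_num_threads is a hypothetical helper name, and the pattern is based only on the line quoted above.

```python
import re


def parse_num_threads(console_text):
    """Return the num_threads value reported in CmdStan output, or None."""
    match = re.search(
        r"^\s*num_threads\s*=\s*(\d+)", console_text, flags=re.MULTILINE
    )
    return int(match.group(1)) if match else None


# The sample text mirrors the line quoted in this thread.
sample = "some earlier output\n  num_threads = 8 (Default)\n"
print(parse_num_threads(sample))
```

A result of 1 or None after setting the variable would suggest that the setting did not reach CmdStan or that the model was not compiled with threading enabled.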
