Pooled warmup

wds15 · November 7, 2019, 6:45pm

@stevebronder asked me to share my current prototype on parallel warmup … maybe this info here is useful for others as well (@bbbales2, @Bob_Carpenter, @yizhang, @betanalpha)

I just pushed my pooling warmup prototype to the cmdstan and stan repos.

Right now I am abusing the STAN_NUM_THREADS variable to specify the number of chains being run. Nothing runs in parallel yet, though. I figured that’s not necessary yet, as I would first like to learn how we should pool the adaptation info.

At the moment I am just pooling the covariances which each chain learns in each window. The stan services framework isn’t really set up to share information between chains, so I had to use a few tricks to get the window information.
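For readers who want something concrete, here is a minimal sketch of one way per-chain covariances from a single window could be pooled. The post does not say which pooling rule the prototype uses, so this is only an illustration: it assumes each chain reports its draw count, mean, and sample covariance for the window, and it combines them into the covariance of all draws taken together. The function names `sample_mean_cov` and `pool_covariances` are hypothetical and not part of Stan.

```python
def sample_mean_cov(draws):
    """Return (n, mean, cov) for a list of draws; cov uses the n - 1 denominator."""
    n = len(draws)
    dim = len(draws[0])
    mean = [sum(d[i] for d in draws) / n for i in range(dim)]
    cov = [
        [
            sum((d[i] - mean[i]) * (d[j] - mean[j]) for d in draws) / (n - 1)
            for j in range(dim)
        ]
        for i in range(dim)
    ]
    return n, mean, cov


def pool_covariances(chain_stats):
    """Combine per-chain (n, mean, cov) tuples into the covariance of all draws.

    Hypothetical pooling rule (an assumption, not the prototype's code):
    within-chain scatter plus between-chain scatter, divided by total_n - 1.
    """
    total_n = sum(n for n, _, _ in chain_stats)
    dim = len(chain_stats[0][1])
    grand_mean = [
        sum(n * mean[i] for n, mean, _ in chain_stats) / total_n
        for i in range(dim)
    ]
    scatter = [[0.0] * dim for _ in range(dim)]
    for n, mean, cov in chain_stats:
        for i in range(dim):
            for j in range(dim):
                # Scatter of this chain's draws around its own mean.
                within = (n - 1) * cov[i][j]
                # Extra scatter from the chain mean differing from the grand mean.
                between = n * (mean[i] - grand_mean[i]) * (mean[j] - grand_mean[j])
                scatter[i][j] += within + between
    return [[s / (total_n - 1) for s in row] for row in scatter]


# Two made-up chains with a handful of 2-dimensional draws in one window.
chain_a = [[0.0, 1.0], [1.0, 3.0], [2.0, 2.0]]
chain_b = [[4.0, 0.0], [5.0, 1.0], [6.0, 5.0], [7.0, 2.0]]
pooled = pool_covariances([sample_mean_cov(chain_a), sample_mean_cov(chain_b)])
```

Pooling this way gives the same result as computing the covariance over the concatenated draws, but each chain only has to share its summary statistics rather than its full set of draws. Whether the between-chain term should be included early in warmup, when chains may not yet overlap, is exactly the kind of question a benchmark would have to answer.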

I got side-tracked with an rstan emergency, which is why I was a bit silent on this front.

To me the next steps would be: figure out a benchmark, possibly along the lines I outlined on Discourse. Then we tweak the warmup pooling to the point where it is beneficial, meaning the total numerical effort must be lower when we pool the info. Otherwise we won’t see any speedups, since going parallel inevitably adds some friction.

Then we can go crazy on specs / design docs and finally we implement it.

Let me know what you think.

Sebastian

betanalpha · November 12, 2019, 4:44am

What are the branches?

As I was saying before, I think it will be more productive to work out the changes to the services first before worrying about specific adaptation strategies (or parallelization technologies, for that matter).

wds15 · November 12, 2019, 8:07am

I never worked on stan services. It took me a while to figure out the logic. Without a prototype I would not be able to make sensible designs.

Bob_Carpenter · November 27, 2019, 7:30pm

You mean for things like specifying number of chains? Right now, each service call is an independent chain. I think the main change we’ll need from the services is a way to deal with multiple chains of output. The input and config should be straightforward assuming it’s shared among the chains.

betanalpha · December 3, 2019, 2:59pm

Correct – see also the progression I suggested in the other thread,

In addition to setting up new adaptation strategies, 1 and 2 alone would allow the interfaces to simplify quite a bit and provide a more coherent user experience.

wds15 · December 3, 2019, 6:44pm

Sure…but 1 and 2 are already implemented by all of our interfaces. It’s obviously a win to do it consistently, but it is no win in terms of new functionality.

stevebronder · December 3, 2019, 7:43pm

Don’t we need to do (1) and (2) at the C++ level with TBB to also embed (3) and (4)?

wds15 · December 3, 2019, 8:43pm

We do need them, yes…these are just a lot of work without too much net benefit…but we need them, of course.
