L-BFGS: parallel implementation

Hello everyone,

I’m currently working on speeding up the training phase of Facebook Prophet, and I’m focusing on its dependency on Stan. I recently came across a paper published in November 2020, which introduces the first genuinely parallel version of the ‘L-BFGS’ algorithm.

Since reading this paper, I’ve been curious about the possibility of replacing Stan’s ‘L-BFGS’ with this parallel version to enhance the speed of fitting Prophet. If this substitution would indeed accelerate the process, could someone kindly explain the steps I should take to get started?

Thank you!
Adly

Aforementioned Paper:
title: Asynchronous Parallel Stochastic Quasi-Newton Methods
authors: Qianqian Tong, Guannan Liang, Xingyu Cai, Chunjiang Zhu, Jinbo Bi
url: https://arxiv.org/pdf/2011.00667.pdf

Hi, @adly and welcome to the Stan forums.

Stan’s L-BFGS is not stochastic: it uses the full-data gradient at every iteration. L-BFGS also tends not to work well with small batches, where the noisy gradients make it hard to estimate the Hessian. So I’m not entirely sure how relevant this paper is.
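
To make the small-batch point concrete, here is a minimal sketch on a toy least-squares problem (the data, batch size of 8, and all names are made up for illustration, not taken from Stan or the paper). L-BFGS builds its Hessian approximation from pairs `s = x_new - x` and `y = grad(x_new) - grad(x)`, and needs `s @ y > 0`. With full-batch gradients that holds here; with two independently drawn mini-batches the difference is mostly sampling noise.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 5
A = rng.normal(size=(n, d))
b = A @ rng.normal(size=d) + rng.normal(size=n)


def grad(x, idx):
    """Gradient of half the mean squared error over the rows in idx."""
    r = A[idx] @ x - b[idx]
    return A[idx].T @ r / idx.size


x = rng.normal(size=d)
s = 1e-2 * rng.normal(size=d)  # a small step, x_new = x + s
full = np.arange(n)

# Full batch: y = grad(x + s) - grad(x) = (A^T A / n) s, so s @ y > 0.
sy_full = s @ (grad(x + s, full) - grad(x, full))

# Small batches: the two gradients come from different random subsets,
# so their difference mixes curvature with sampling noise.
sy_batch = np.array([
    s @ (grad(x + s, rng.choice(n, 8, replace=False))
         - grad(x, rng.choice(n, 8, replace=False)))
    for _ in range(200)
])

print("full-batch s.y:", sy_full)
print("mini-batch s.y: mean", sy_batch.mean(), "std", sy_batch.std())
print("fraction with s.y <= 0:", np.mean(sy_batch <= 0))
```

Stochastic quasi-Newton variants typically have to work around this in some way, for example by evaluating both gradients on the same batch, which is part of why they are a different algorithm from what Stan runs.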

To answer your question, the process would be to implement the algorithm outside of Stan, profile it, and then motivate putting it in Stan. We work through pull requests, though something like this, especially if the change was to some kind of stochastic gradient, would have to start with a design doc.

You might find it easier pulling the Prophet model out of Stan and reimplementing it in JAX or PyTorch. It’s not that complicated and NumPyro might make it easy in JAX.
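
As a rough idea of what "pulling the model out" could look like, here is a minimal sketch, not Prophet's actual model: a linear trend without changepoints plus Fourier seasonality, with a Gaussian likelihood. The priors, the synthetic data, and the names `fourier_features` and `neg_log_posterior` are assumptions for illustration. It uses SciPy's `L-BFGS-B` with a hand-written gradient to keep dependencies small; in JAX the gradient would come from automatic differentiation instead.

```python
import numpy as np
from scipy.optimize import minimize


def fourier_features(t, period, order):
    """Sine/cosine seasonality features for times t."""
    k = np.arange(1, order + 1)
    angles = 2.0 * np.pi * np.outer(t, k) / period
    return np.hstack([np.sin(angles), np.cos(angles)])


def neg_log_posterior(params, t, X, y):
    """Negative log posterior (up to a constant) and its gradient.

    params = [k, m, beta..., log_sigma]. Priors are illustrative assumptions:
    Normal(0, 5) on k and m, Normal(0, 10) on beta, HalfNormal(0.5) on sigma.
    The mode is taken in terms of sigma, so no Jacobian term is added.
    """
    k, m, beta, log_sigma = params[0], params[1], params[2:-1], params[-1]
    sigma = np.exp(log_sigma)
    n = y.size
    resid = y - (k * t + m + X @ beta)
    nlp = (0.5 * resid @ resid / sigma**2 + n * log_sigma
           + 0.5 * (k**2 + m**2) / 25.0
           + 0.5 * beta @ beta / 100.0
           + 0.5 * sigma**2 / 0.25)
    grad = np.empty_like(params)
    grad[0] = -(resid @ t) / sigma**2 + k / 25.0
    grad[1] = -resid.sum() / sigma**2 + m / 25.0
    grad[2:-1] = -(X.T @ resid) / sigma**2 + beta / 100.0
    grad[-1] = -(resid @ resid) / sigma**2 + n + sigma**2 / 0.25
    return nlp, grad


# Synthetic data: slope 2.0, intercept 0.5, noise standard deviation 0.1.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
X = fourier_features(t, period=0.25, order=3)
true_beta = rng.normal(size=X.shape[1])
y = 2.0 * t + 0.5 + X @ true_beta + 0.1 * rng.normal(size=t.size)

# jac=True tells SciPy that the objective returns (value, gradient).
x0 = np.zeros(2 + X.shape[1] + 1)
result = minimize(neg_log_posterior, x0, args=(t, X, y),
                  jac=True, method="L-BFGS-B")
k_hat, m_hat = result.x[0], result.x[1]
sigma_hat = np.exp(result.x[-1])
print("slope:", k_hat, "intercept:", m_hat, "sigma:", sigma_hat)
```

Once the objective is a plain function like this, swapping in another optimizer and timing it against L-BFGS is straightforward, which is the kind of profiling you would want before proposing anything for Stan.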


Hi Bob,

Thanks for your valuable insights regarding L-BFGS! I hadn’t considered the limitations with small batches, and it makes perfect sense why Stan’s implementation might not be ideal for this specific scenario.

Following your suggestion, I’m currently focusing on implementing the default Prophet model outside of Stan. This will allow me to test and compare the performance of various optimizers on the model’s posterior mode.

I’ll also look into the possibility of using frameworks like JAX or PyTorch for re-implementing the model. NumPyro with JAX sounds like an interesting option, so I’ll definitely explore that avenue as well.

Thanks again for your guidance!

Adly

Someone previously opened a PR to the Prophet project with an implementation in NumPyro; you may want to try writing to them. The PR is "Adding numpyro backend" by freddyaboulton (Pull Request #1973, facebook/prophet on GitHub).