Reimplementing the inference algorithms

Is anyone interested in reimplementing the algorithms code in the stan-dev/stan repository on GitHub? I’m going to give it a go. I’m going to be as careful as possible, making sure I can reproduce the existing results exactly with the new code.

If you’ve wanted to look at the algorithms in depth or want to see how to sensibly reimplement this type of code, please reach out. (In the absence of anyone wanting to do this, I’ll just work at my own pace, but if there are people that want to do this with me, I’ll take a different approach.)

The repo that I’ll use is syclik/stan-algorithms on GitHub ("Reimplementation of Stan algorithms").
(It has nothing on it yet.)

I’ll manage the work through issues and I’ll keep the discussions about the code on Discourse.

Hey @syclik . I joined the forums. I’d like to work with you on this.

I would encourage everyone to think about starting with a design doc.

@apoorvreddy, awesome! @dmuck and @hyunji.moon are also on board. =). I’ll add you all to the repo as collaborators.

The written communication on the effort will stay on the forums or the GitHub repo (through issues / PRs) so it’ll be done openly. There’s also a Slack channel we can use for more rapid written communication if that’s necessary. And we can always set up Google Meet for video calls if we need to.

@mitzimorris, thanks for the reminder! There’s a lot of pre-work to be done prior to a design doc. Right now, the first thing to do is to change the implementation without changing the interface, which wouldn’t need a design doc. When we get to that point, we can definitely write up those docs. Did you want to help in the effort?

Any big change to the code base requires a design doc. But it sounds like you’re at a very preliminary stage of development. I understand that ya gotta start somewhere.

Thanks @syclik! How and where do I start contributing?

Thanks, @apoorvreddy, @dmuck, and @hyunji.moon! I’m happy you’re on board.

I’m guessing you’re OK with me laying out what I think we should do. That said, I’d like this to be a real collaboration: we can start from my outline, but I’d really like your vision, input, and feedback to shape what we end up doing.

The first practical milestone I’d like to accomplish is to have a set of C++ tests that instantiate the algorithms:

  • Standalone tests. I’d like the tests to have as few dependencies as possible. For me, this means that we’ll rely on the stan-dev/stan repo and the stan-dev/math repo, but hopefully won’t need the stan-dev/stanc3 repo.
  • Initially test at the API boundary. We’ll use what we expect CmdStan to call, so the boundary should be the stan::services namespace.
  • Enough breadth to cover simple cases and more complicated cases. As I’m thinking about what we want to reimplement, I know I’d want tests of a single iteration, multiple iterations, warmup only, post-warmup only, and different models.
  • Continuous integration. Have this run through GitHub Actions if possible.
  • Documented. At a minimum, document where to find the repo, what we’re trying to accomplish, and how to run the tests.
  • (Optional) Write up how to instantiate the algorithms at this point. I think getting this far would be valuable for the community to know. But it’s optional. At least we’d have 3 additional people who would be able to instantiate the algorithms.

I think that’s actually going to take us a bit of effort. And just to be clear, at this point, we wouldn’t have done anything other than setup.

From there, I think it’s up to us how to proceed and which algorithms to tackle first. I think the optimization may be the easiest to reimplement, but we may choose to do something else first.

Just so we’re on the same page, I am thinking of first doing a proper “refactoring.” If that’s a new term, please see the Wikipedia article on code refactoring. That means we leave the behavior the same and make a pure implementation change. We will be tempted to change the behavior, but that’s a different kind of work. Separating the refactoring step from any behavior change makes it a lot easier to tell that we’re doing the right thing. We have to match the current output exactly in order to know that we’ve made sound changes.
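To make the "match the current output exactly" criterion concrete, here is a minimal sketch of a characterization (golden-output) check. Nothing in it comes from the Stan code base: `legacy_walk` and `refactored_walk` are hypothetical stand-ins for an existing implementation and its rewrite, and the seed and iteration counts are arbitrary. The only point is that both versions consume the same seeded RNG stream and are compared with no tolerance.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical stand-in for the existing code: a seeded random walk
// written as one monolithic loop.
std::vector<double> legacy_walk(std::uint32_t seed, std::size_t n) {
  std::mt19937 rng(seed);
  std::vector<double> draws;
  double x = 0.0;
  for (std::size_t i = 0; i < n; ++i) {
    double u = static_cast<double>(rng()) / 4294967296.0;  // in [0, 1)
    x = x + (u - 0.5);
    draws.push_back(x);
  }
  return draws;
}

// Hypothetical refactor: same arithmetic in the same order, split into
// small named pieces.
double uniform01(std::mt19937& rng) {
  return static_cast<double>(rng()) / 4294967296.0;
}

double step(double x, double u) { return x + (u - 0.5); }

std::vector<double> refactored_walk(std::uint32_t seed, std::size_t n) {
  std::mt19937 rng(seed);
  std::vector<double> draws;
  draws.reserve(n);
  double x = 0.0;
  for (std::size_t i = 0; i < n; ++i) {
    x = step(x, uniform01(rng));
    draws.push_back(x);
  }
  return draws;
}

// Characterization check: exact equality, deliberately no tolerance.
void check_refactor_matches() {
  const std::size_t sizes[] = {1, 10, 1000};
  for (std::size_t n : sizes) {
    assert(legacy_walk(42, n) == refactored_walk(42, n));
  }
}
```

Calling `check_refactor_matches()` from a `main` or from a test case in whatever framework we end up using would be enough to run it. The real version would compare output captured at the services boundary rather than a toy random walk.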

I’d like the work on this to be out in the community, so let’s continue to use Discourse. We’ll use GitHub issues as an immediate todo. And if we do have meetings, we can post any important discussions back here.

I’ll dm you to see when you’re free for a video call! (I don’t even know what time zones everyone is in!)

Have you used Catch for C++ testing? I’m thinking of using it for this effort. It looks a lot more straightforward than Google Test.

Hi @syclik.

If it’s not too late, I would also like to board the train. I’m comfortable with C++ and have always wanted to hack on the algorithm portion of Stan.

@Dashadower, awesome! Welcome!

And for anyone reading this, it’s never going to be too late! If you want to help, come find us.

FYI, I’m working on running the existing tests in Stan and having some trouble. I posted a separate thread about the exact issue I’m having: “Stan-dev/stan tests: generating .hpp from .stan files”.

The first test I’m trying to run (from within the cloned Stan directory):

> ./runTests.py src/test/unit/services/optimize/bfgs_test.cpp

A couple of updates:

  1. For anyone interested in helping with the effort, we’re going to have a kickoff meeting on Wednesday, January 12, at 9 pm Eastern via Google Meet. If you’d like to join, please reach out!

  2. Basic setup and tests are ready. In the git repo stan-algorithms, there are two script files at ./setup.sh and ./test.sh. The first script updates all the submodules and downloads the appropriate version of stanc3. The second runs the tests from within the Stan submodule. It’s a good starting point for us to build from.

Amazing! We just had a video call with four of us on the line: @Dashadower, @dmuck, @tal.kedar, and myself. Some super-brief notes from the meeting (please update if you feel like it). These are not in time-order:

  1. Motivation is to rewrite the algorithms code so that it’s easier to understand. We understand that we’re not going to get many performance gains by doing this.
  2. The documentation in the Stan repo could be improved. It’s very difficult for a new developer to start. We will be improving overall documentation as part of the effort.
  3. We discussed the potential to update the different classes of algorithms: optimization, ADVI, MCMC. Our assessment was that the one that’s used the most by the community and where the biggest impact would be is MCMC. So we’ll start there.
  4. For MCMC, there are on the order of 20 algorithms that we could be implementing. We will try to tackle some of these first. Which ones is yet to be determined.
  5. We want to keep a good design, one where we can swap out parts of the MCMC process.
  6. Next step: @syclik will be looking for the entry point into the algorithms code. This is going to spawn the next bit of work.
  7. We will try to do a “refactor” where things are identical. That said, we may not be able to reproduce results bit-wise. If it is different, we should know when and why it drifts. (If we end up cutting computations or doing things in a different order, that would explain it.)
  8. Outside the scope of this effort: work on performance improvements within the generated C++ class.
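On item 7, the reason a behavior-preserving rewrite can still drift bit-wise is that floating-point addition is not associative. The sketch below is a generic illustration, not Stan code; it assumes IEEE 754 double arithmetic with no fast-math style compiler flags, and the function names are made up for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// View a double as its 64 raw bits so that comparisons are bit-wise.
std::uint64_t bits(double x) {
  static_assert(sizeof(double) == sizeof(std::uint64_t),
                "sketch assumes a 64-bit double");
  std::uint64_t b;
  std::memcpy(&b, &x, sizeof b);
  return b;
}

// The same three terms, accumulated in two different orders.
double sum_left_to_right(double a, double b, double c) { return (a + b) + c; }
double sum_right_to_left(double a, double b, double c) { return a + (b + c); }

void check_order_matters() {
  const double a = 0.1, b = 0.2, c = 0.3;
  // Same order: reproducible down to the last bit.
  assert(bits(sum_left_to_right(a, b, c)) == bits(sum_left_to_right(a, b, c)));
  // Different order: mathematically equal, but the last bit differs
  // (assumption: IEEE 754 binary64, no fast-math).
  assert(bits(sum_left_to_right(a, b, c)) != bits(sum_right_to_left(a, b, c)));
}
```

If a rewritten algorithm accumulates a log density or a gradient in a different order, this is the kind of last-bit difference that would show up first, and over many iterations it can grow into visibly different draws.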

After the meeting, I’m feeling pretty energized. I’m glad you’re on board. I feel like we’re all going to learn from each other through this process. If anyone wants to join in, please do.

I have stared at the code for some time. These are my blog posts where I study how the code is organized.

They explain how to find the entry point to sampler code and where to find MCMC algorithms.

In your refactor, are you planning to maintain the current class hierarchy and/or namespaces (services and mcmc)?

@jtimonen, awesome writeups! Thanks!

I am going to write up a little more detail about the services layer and the C++ interfaces because that’s where we’re going to start with the reimplementation. (It’s really right between the two layers you described.)

As a refactor step, the interface won’t change. That should mean that the services entry points should remain, so the services namespace will stay.

We haven’t discussed whether or how the next layer changes. I imagine it will, but once everyone is caught up with the services layer and how we can verify what’s going on, we can then dig into the next architectural decisions.

Given how complicated the code is and how critical it is to the function of Stan, we really need to be careful. It’s a lot easier to verify a series of smaller changes rather than a large one.


@jtimonen, if you’ve got time and want to help in the effort, please do!!! We’re happy to have the help.

This is a good motivation. If more people understand the code, more people can improve Stan etc.

Yes this was the reason why I wanted to do those blog posts. I had initially very little idea where to look for code that is actually run when I use Stan.

I will at least be following this project but cannot guarantee that I will contribute. Maybe we will see once there are concrete things that should be done.

A modular design would be a big contribution to the future of Stan. Dare I say a “plug-in” architecture?
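As a rough picture of what "plug-in" could mean here, the sketch below has a driver loop that only knows about a generic transition and two interchangeable transitions that can be dropped into it. All names (`Transition`, `run_chain`, `random_walk_metropolis`, `stay_put`) are hypothetical and none of this mirrors the actual Stan class hierarchy; it is a one-dimensional toy under those assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <random>
#include <vector>

// Hypothetical plug-in point: anything that maps the current state to the
// next one, given access to the shared RNG.
using Transition = std::function<double(double, std::mt19937&)>;

double uniform01(std::mt19937& rng) {
  return static_cast<double>(rng()) / 4294967296.0;  // in [0, 1)
}

// The driver loop does not know which transition it is running.
std::vector<double> run_chain(double init, std::size_t num_iterations,
                              const Transition& transition,
                              std::uint32_t seed) {
  std::mt19937 rng(seed);
  std::vector<double> draws;
  draws.reserve(num_iterations);
  double x = init;
  for (std::size_t i = 0; i < num_iterations; ++i) {
    x = transition(x, rng);
    draws.push_back(x);
  }
  return draws;
}

// Plug-in 1: random-walk Metropolis targeting a standard normal, with a
// symmetric uniform proposal of total width `scale`.
Transition random_walk_metropolis(double scale) {
  return [scale](double x, std::mt19937& rng) {
    const double proposal = x + scale * (uniform01(rng) - 0.5);
    const double log_accept = -0.5 * (proposal * proposal - x * x);
    const double log_u = std::log(uniform01(rng));
    return log_u < log_accept ? proposal : x;
  };
}

// Plug-in 2: a do-nothing transition, handy for testing the driver alone.
Transition stay_put() {
  return [](double x, std::mt19937&) { return x; };
}
```

For example, `run_chain(0.0, 1000, random_walk_metropolis(1.0), 7)` and `run_chain(0.0, 1000, stay_put(), 7)` share every line of the driver. A real design would also need to carry adaptation state, gradients, and output callbacks, which this toy leaves out.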

Breck

Here’s my attempt at describing the entry point: “Stan Algorithms: Where to Start?” by Daniel Lee, on Medium.

Some thoughts:

  • it took a lot longer than I expected
  • I went back and forth on what to focus on. I think it’s not a bad start and it gives enough context, but…
  • if after you read it, it doesn’t give you enough context, please let me know where
  • I don’t think discussion through blogs is a great way to go, but as documentation for a starting point, it’s not the worst place to start

I think the next step from here is to dig into each of the C++ interfaces and what’s coming out of the algorithms. Especially the callbacks.

I see at least three things adjacent to this discussion that relate to activity at Liverpool:

  1. Having an environment that makes it easy to test new algorithms on pdfs defined in Stan files. This boils down to having access to log_prob (and associated gradient) in development environments (eg MATLAB, Python, Julia etc). I think this is relatively easy to achieve, and just means we need to promote functionality that is either already in PyStan/CmdStanPy/MATLABStan/etc or that should be. @mjcarter is probably best placed to comment on our local-to-Liverpool perception of what exists.
  2. Making it easy to prototype desired-by-at-least-one-person functionality (eg Streaming variants of Stan) and algorithms (eg for SMC samplers). We’ve developed SMC-Stan (a version of Stan that uses an SMC sampler in place of NUTS) and Streaming-Stan (a version of Stan that handles Streaming data). We’ve also been looking to add a post-processing component (that implements control variates and re-uses the interface related to #1) and are working up something that can specify pdfs that include both discrete and continuous variables. This work all makes extensive use of Stan’s codebase. While we plan to make these all open-source, the substance of all these things is (still, sorry!) under development. However, I do sense that the people involved (eg Alessandro Vari, @alphillips, @LJDevlin, @YifanZhou, @mjcarter and @remoore) might have interesting comments on this thread given that there’s been a lot of work to understand how to interface to and capitalise upon Stan’s infrastructure.
  3. Processes for migrating prototypes into future releases of Stan. I think this needs to be distinct from #2 but I worry that the two can and perhaps have, from time to time, become conflated. Perhaps that means that we need to define processes (with a low barrier to entry) related to #2; those processes might simply encourage people to make forks of the github repo, for example, but the existence of the process would highlight that you don’t need to have a design doc to think about how to augment Stan.
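To illustrate the kind of interface item 1 has in mind, here is a minimal sketch of a model exposed only through a log density and its gradient, plus a finite-difference check that a prototype algorithm could run before trusting the gradient. The `LogDensity` struct, `standard_normal`, and `max_gradient_error` are hypothetical names for this example; they are not the API of Stan or of any of its interfaces.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical minimal view of a model: a log density and its gradient,
// both evaluated at a parameter vector.
struct LogDensity {
  std::function<double(const std::vector<double>&)> log_prob;
  std::function<std::vector<double>(const std::vector<double>&)> grad_log_prob;
};

// Toy model: independent standard normals, up to an additive constant.
LogDensity standard_normal() {
  LogDensity m;
  m.log_prob = [](const std::vector<double>& x) {
    double lp = 0.0;
    for (double xi : x) lp -= 0.5 * xi * xi;
    return lp;
  };
  m.grad_log_prob = [](const std::vector<double>& x) {
    std::vector<double> g(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) g[i] = -x[i];
    return g;
  };
  return m;
}

// Largest absolute gap between the supplied gradient and a central
// finite-difference approximation with step h.
double max_gradient_error(const LogDensity& m, std::vector<double> x,
                          double h = 1e-5) {
  const std::vector<double> g = m.grad_log_prob(x);
  double worst = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    const double xi = x[i];
    x[i] = xi + h;
    const double up = m.log_prob(x);
    x[i] = xi - h;
    const double down = m.log_prob(x);
    x[i] = xi;  // restore before moving to the next coordinate
    worst = std::max(worst, std::fabs((up - down) / (2.0 * h) - g[i]));
  }
  return worst;
}
```

Any sampler prototype written against `LogDensity` would then work unchanged for whatever model sits behind it, which is the property that makes this boundary attractive for experimentation.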

Hope that helps.
Simon
