Developer sprint

I’d like to start setting up a “developer sprint” for mid-January. I need developers who would be willing to run sessions on various Stan development topics. I’m tagging various people on the topics below. I think more than one person could do each of these, but I’m tagging folks who have specific expertise or may be able to point to other people. This is NOT volunteering you. You have no obligation to participate. I want to elicit feedback on the proposal. If you are interested and have the time in mid-to-late January (week of Jan 23rd), please let me know either here, in a private message, on Slack, or via email [sean.pinkney [at] gmail.com].

I’m also interested in hearing from anyone in the community who would like to contribute to Stan but doesn’t know where to start. Let me know if these topics are of interest to you, or if there’s something missing from the list below that you would like to learn about.

  • Stan-math overview. (@rok_cesnovar?)

    • Fwd/rev/mix/prim
    • Testing framework
    • Adding a simple function
  • Stan-math adding a univariate distribution (@andrjohns?)

    • What we have to do
    • Bonus: Adding partials
    • Testing
    • Creating a PR
  • Overview of C++ templates in stan-math (@stevebronder ?)

    • What are templates and why do we use them
    • Useful functions like forward, make_holder, etc.
    • When would one create a template, and the basics of how to do this
  • Our autodiff arena allocator (@stevebronder ?)

    • Overview
    • Writing scalar, vector, and matrix partials (and why we don’t want huge Kronecker-product-type derivatives)
  • Writing adjoints for multivariate functions/distributions (@Bob_Carpenter? @Seth_Axen? Seth, I know that you’re not a stan-dev, but I think you have a lot of experience writing out these complicated adjoints)

    • Why adjoints?
    • How to write these?
  • Stanc3 overview (@WardBrian ?)

    • Intro to OCaml in stanc3
    • How to add a stan-math function to stanc3
  • Advanced stanc3 (@WardBrian?)

    • ???
  • Cmdstan (@mitzimorris?)

    • ???
  • Overview of VI internals?

  • Overview of HMC internals?

  • Anything else?

Just to copy my response from Slack, I’d be happy to run a couple sessions on the following ideas:

  1. Intro to OCaml. I see this as a short (~20 minute) live coding demo/pointer to resources. I’d assume the audience is at least familiar with a programming language, e.g. they already can code a little in C/Python/Java/whatever.
  2. Basic Stanc3 contributions. Does not depend on the above; covers things like installing our dependencies and adding a function signature from stan-dev/math.
  3. More advanced Stanc3 contributions. Talking about how we organize our data structures, what we do that is different from “standard” OCaml, what our challenging parts are. Assumes you saw 1 and 2.

I think there is probably some optimal way to organize these so that it’s only two actual sessions/videos.

Not sure why my name was removed from the initial post.

I don’t have a ton of availability through the first half of 2023 but I should be able to set a day or two aside, especially if the dates could be settled sooner rather than later. I’d be happy to talk about the history and motivation for the design of the algorithm libraries (or lack thereof in some cases) and the service API. I can also talk about some of the more theoretical aspects of autodiff, including the role of Jacobian-adjoint products and their implementations.
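For anyone unfamiliar with the term, here is a rough illustration of a Jacobian-adjoint product for y = A x. This is a plain Python sketch, not Stan Math code, and every function name in it is hypothetical. It assumes a dense matrix stored as a list of rows, and it only shows that the adjoints of A and x can be accumulated directly from the output adjoint without building the full Jacobian of y with respect to A.

```python
# Illustrative only: plain Python, not Stan Math code. All names are hypothetical.

def matvec(A, x):
    """Forward pass: y = A x for a dense matrix A given as a list of rows."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def matvec_adjoint(A, x, y_bar):
    """Reverse pass: push the output adjoint y_bar back to A and x.

    A_bar = y_bar x^T and x_bar = A^T y_bar, computed directly,
    so the (rows * rows * cols)-sized Jacobian dy/dA is never formed.
    """
    A_bar = [[yb * x_j for x_j in x] for yb in y_bar]
    x_bar = [
        sum(A[i][j] * y_bar[i] for i in range(len(A)))
        for j in range(len(x))
    ]
    return A_bar, x_bar


def x_bar_via_full_jacobian(A, x, y_bar):
    """Same x_bar, but by first materializing dy/dx (which is A itself)
    and then multiplying. Shown only for comparison."""
    n_out, n_in = len(A), len(x)
    J = [[A[i][j] for j in range(n_in)] for i in range(n_out)]
    return [sum(J[i][j] * y_bar[i] for i in range(n_out)) for j in range(n_in)]


A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
x = [1.0, -1.0]
y_bar = [1.0, 0.0, 2.0]  # adjoint arriving from whatever consumes y

y = matvec(A, x)
A_bar, x_bar = matvec_adjoint(A, x, y_bar)
```

Both routes give the same x_bar here. The direct version only ever touches objects the size of A and x, which is the usual argument for writing adjoints by hand rather than forming large Jacobians.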

I’m around in January. But I need to be on the receiving side of a math library tutorial. No way I can lead one—I can’t even get small changes to compile any more.

I could talk about how our services layer is designed as I have been working on that.

There’s also the interface level in R and Python. @Jonah in particular could use help on the R interfaces side of things.

This would be great. In particular, I want to learn how to support third-party (higher-order) math functions in stanc3, as I haven’t been able to follow recent activity.

I would be happy to talk about CmdStan and the challenges of wrapping it for CmdStanPy. The latter includes the challenge of trying to keep Python and R interfaces as similar as possible.

@spinkney, I’m happy to discuss whatever is useful.

Maybe this would be helpful?

  • Basic Stan Math toolchain and how to get started.
    • Make, C++ compiler, doxygen, python
    • How to run a single unit test
      • Run the Python script, which then calls make
      • Walk through the make target, what it actually builds, and how it calls the C++ compiler
      • How the Google Test executable is triggered and runs
      • What a test looks like and how to write one

@spinkney has there been any further movement on this? (e.g. scheduling)

Post-processing and further interfaces to support online simulators and visualization would be helpful (e.g. inferencedata as a platform for joint-distribution-based diagnostics across different languages), and I think @ahartikainen is the best person to introduce this. If needed, I would also be interested in helping with its preparation.

I think this is outside the intended scope, since inferencedata is an ArviZ concept.

I wasn’t sure what “the intended scope” was from the original post, but since approximation algorithm design (e.g. VI diagnostics) depends heavily on diagnostics and visualization, I thought a system supporting this feedback loop (iterative model building) might be helpful for people wishing to develop Stan (at least for me :)). inferencedata was just one example.

I’m really interested in getting developers to bootstrap their knowledge to help new people contribute to the Stan project. My initial thoughts were around the main Stan project of stan-math, stanc3, Stan, and cmdstan.

Although @mitzimorris correctly inferred that most visualizations are outside this scope, diagnostics and visual diagnostics are a core part of Stan and its historical development.

@betanalpha is the historical expert on the development of these diagnostics (as he mentioned above). He’s been tirelessly promoting Bayesian workflows and diagnostic checks. I believe he has a recent case study (or it’s in development) about his current work on Bayesian diagnostics. I think a talk about how to develop, or think about developing, these diagnostics for Stan, plus their historical development, is in scope.

Yes, I’m now targeting a late Feb or early March timeframe.

I’m thinking the week of Feb. 27. I’ll reach out to the different folks who said they would do a talk/workshop to get dates and times. I don’t think it makes sense to do an entire day of these; instead, I’ll distribute the talks across the week, or two weeks if necessary. I’m thinking of making this low-key, with Zoom links. I’m concerned about having too many people for a viable workshop-like talk, but other NumFOCUS projects that have run dev sprints said theirs weren’t widely attended. The main benefit those projects noted was having the material available to onboard devs in the future. They all said that, despite the lack of new devs coming onboard at the sprint itself, it was worth it for onboarding devs over the following year. We’ll record these sessions and make the presentations available afterward. This should also spur updated documentation and help devs craft better documentation given the questions that arise during the sprint.

@mitzimorris could you help me contact different devs to find dates and times that would work for them?
