Monthly language dev meeting

The monthly language dev meeting will be held on the

  • second Thursday of the month, 10 am New York time

That’s 1 hour before the regular Stan meeting. Concretely, that means meetings on

  • Thursday, 9 September, 2021
  • Thursday, 7 October, 2021 [first Thursday of the month due to conflicts]
  • Thursday, 11 November, 2021
  • Thursday, 9 December, 2021

I propose we have a first meeting to discuss something like a roadmap of to-do items on:

  • Thursday, 9 September, 10 am New York time

Please let me know if you want me to put you on the Google Calendar invite list for the meeting.

I want to make sure this time is OK with everyone who has been contributing to the language and wants to attend. That includes at least @nhuurre, @rok_cesnovar, @stevebronder, @WardBrian, @rybern, @seantalts, @mitzimorris, @mgorinova. Please add anyone I’ve missed who you think might be interested.

We’ll make sure not to conflict with the math library meetings that @syclik runs in the same time slot or the general Stan meetings that @andrewgelman runs in the following time slot.

Sounds good. Please put me on the invite.

I am in.

I would love to attend

I’m in as well!

Interested

I am also interested

Also interested

Email for google calendar: thestoropoli@gmail.com

I would also be happy to join.

Great idea, I’d like to join

Just confirming we’ll be doing this tomorrow, 9 September at 10 am. Everyone who responded here saying they were interested should have a Google Calendar invite.

I’d like to focus on the following two questions during the meeting.

Where are we?

I’d like to start with a survey of issues that people think are important to address in the short term, either because they’re bugs or because they’re blocking something else. This can be anything from the way the code is organized, to how function argument testing works, to how it interacts with C++.

What I want out of this is a prioritized list of things that need to be fixed or improved.

Where are we going?

I’d like to collect a list of the big features we’d like to see going forward. By that I mean ones big enough that there should be a design doc.

I can go over my list below, but I’m really looking to hear from everyone about prioritization and other features.

  • closures
  • comprehensions
  • user-defined gradients
  • user-defined function overloading
  • tuples and structs
  • ragged arrays
  • sparse vectors and matrices
  • user-defined parallelization
  • plug-in external function calls
  • variadic functions like generalized sum()
  • inverse cdfs and inverse ccdfs
  • complex vector and matrix types

These I’d like to turn into some kind of roadmap.

Let me know if you’d like me to put you on the calendar invite for the Zoom meetings going forward. The plan is to meet on the second Thursday of every month at 10 am NYC time (the hour before the general Stan meeting). The math lib meeting is on the third Thursday of every month in the same time slot.

I know @Niko has also said he’d like to get invites, so I’m just recording that here to remind myself.

I’ll be producing a summary of today’s meeting and announcing that here soon. Thanks to everyone who was able to join.

I can’t make our next scheduled language meeting (14 October) due to an all-day local meeting I have to attend for work. I suspect that’ll also take @WardBrian out.

So my question is whether you’d like to

a. have the meeting without me,
b. reschedule to a different day, or
c. skip this month.

The third Thursday of each month is the math dev meeting, but we could meet on 7 October if that works for people.

@WardBrian just reminded me I never posted the notes from the last meeting. Sorry about that. Feel free to edit or send me edits if I got things wrong.

Minutes for 9 September 2021

Status reports

Tuples

Most of this project is done, but @rybern reports some residual issues:

  • nested containers with constraints
  • code gen is complicated
  • need to work out code for new matrix type
  • TO-DO: rewrite code gen (mostly done)
  • TO-DO: parser warning messages

Nested constraints

@WardBrian is working on a spec. There are lots of issues around offset/multiplier not being constraining and how much flexibility we want to allow in composing transforms. So far the only ones we can think of that are reasonable are offset/multiplier (applies first when unconstraining) and lower/upper bounds. And maybe ordered and positive_ordered types.

There’s an issue of whether to work toward user-defined constraints and Jacobian adjustments or whether to just keep our built-ins.

Closures

@rok_cesnovar reported these are almost done in both stanc3 and the C++ code gen. This seems to be blocked in code review. There are a few deviations from the design doc that need to be included in the doc or at least listed before we write the reference manual and user’s guide doc. It needs larger-scale testing on both the math library side (@nhuurre has a PR) and in stanc3 (ditto).

OpenCL optimizations

@rok_cesnovar reports these are just waiting on the varmat work to land in the math lib. It then needs a guide on how to add a simple signature.

Varmat

This is a huge efficiency gain from representing matrices as a matrix of values and a matrix of adjoints. It’s largely taking place in the math lib, but the language needs to figure out where it’s safe to use a varmat. @stevebronder has a work-in-progress PR for stanc3 on this. He says it’s taking a very conservative, safety-first approach. It is running into precompiled header (PCH) issues during testing. @rok_cesnovar says all the files are tested, but it’s just Stan without CmdStan running end to end.
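
To illustrate the layout difference only, here is a toy Python sketch. It is not how the Stan math library implements varmat; the class names `Var`, `MatrixOfVars`, and `VarMatrix` are hypothetical, and the "reverse pass" shown is just the adjoint update for a sum over all entries.

```python
from dataclasses import dataclass


@dataclass
class Var:
    """One autodiff scalar: a value paired with its own adjoint."""
    value: float
    adjoint: float = 0.0


class MatrixOfVars:
    """Layout 1: a matrix whose entries are individual Var objects."""

    def __init__(self, values):
        self.cells = [[Var(v) for v in row] for row in values]

    def sum_reverse_pass(self, upstream):
        # Reverse pass of sum(): visit every cell object one at a time.
        for row in self.cells:
            for cell in row:
                cell.adjoint += upstream

    def adjoints(self):
        return [[cell.adjoint for cell in row] for row in self.cells]


class VarMatrix:
    """Layout 2: one object holding a matrix of values and a matrix of adjoints."""

    def __init__(self, values):
        self.values = [list(row) for row in values]
        self.adjoint_matrix = [[0.0] * len(row) for row in values]

    def sum_reverse_pass(self, upstream):
        # Reverse pass of sum(): one bulk update on the adjoint matrix.
        self.adjoint_matrix = [
            [a + upstream for a in row] for row in self.adjoint_matrix
        ]

    def adjoints(self):
        return [list(row) for row in self.adjoint_matrix]
```

Both layouts produce the same adjoints. The second keeps values and adjoints in two contiguous blocks, which is the property the paragraph above credits for the efficiency gain.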

General comments

@avehtari suggested we should be more organized around reporting progress so users know which language features are coming. I really wish we had features coming out fast enough that this was more of a problem! The issue came up of where to do that: (a) stanc3 wiki, (b) readme.md for stanc3, or (c) GitHub projects.

A documentation issue arises from RStan and CmdStan being at different versions. We need a way for users to easily access the doc for the version they’re using. Going forward, each new feature should be listed with the version in which it was introduced, so everyone could read the most recent doc. This probably won’t be too much work to maintain, but it’d be a huge amount of work to do the first time.

@WardBrian brought up deprecation schedule and we had a long discussion around how it’ll make maintenance easier, how we can convert old code with the pretty printer, and how we should have a concrete schedule. Since the meeting, he’s turned this into a design doc that’s ready for comments.

@rybern pointed out that the main issue with overloading and supporting variadic functions is technical debt in the type checker, which handles user-defined and system functions differently. @WardBrian has been looking into that. The solution will probably involve more type variables and less explicit instantiation.

What is the syntax for structs going to look like? Perhaps this discussion is premature until we have tuples implemented. The implementation should just follow tuples. We probably don’t want ways to easily partially instantiate structs, though that discussion is ongoing.

How do we plug in external functions? @rok_cesnovar has a stanc3 issue (#712) for this.

We generally preferred _qf to _icdf for inverse cdfs (aka quantile functions).

Thanks for sharing the minutes!

My 5 cents is that Stan is already flexible enough to let you code any necessary constraints - and nesting constraints in the order “built-in first, custom second” is also possible (if I understand the meaning of “nested constraints” correctly). So allowing “custom first, built-in second” - while definitely helpful in some cases - won’t IMHO bring a very big benefit. IMHO the only thing that’s not possible in the current version is to have a custom transform not apply the Jacobian correction when running optimizing (as you’ve already proposed some time ago: e.g., here and here and as I discussed here). So just wanted to +1 the idea while you are discussing constraints. It’s mildly related to some work I try to do with SBC for approximate algorithms, but it is not particularly blocking me on any specific project (and when truly needed, one can pass a boolean is_optimizing as data, it’s just not very elegant).

Thanks once again for all the work going into the language!

I vote to move the meeting up a week. I’d like to discuss the future of Stan2TFP and other non-C++ backends and how we’d like them to live/coexist in the stanc3 project.

Right. It’d be possible to code all of our constraints by hand using only the real type. So the question isn’t about adding expressiveness, but adding convenience. I like to ask myself if programs would be easier to write or easier to read with a feature. User-defined functions are a good example feature for comparison. They literally don’t add any expressiveness to the language (even mutual recursion can be unrolled into an iterative loop that gets inlined in place of any function call).

Right now, user-defined transforms get spread out over the parameters block, the transformed parameters block (the actual transform), and the model block (the Jacobian adjustment). So I’m wondering whether there’d be a way to encapsulate all of that in a way that’d be more understandable. For instance, compare this:

functions {
  vector my_simplex_constrain(vector x) { ... }
  real my_simplex_constrain_jacobian(vector x, vector y) { ... }
  vector my_simplex_unconstrain(vector y) { ... }
}
...
parameters {
  my_simplex[K] x;
}

vs. something like this with the same functions:

parameters {
  vector[K - 1] x_raw;
...
transformed parameters {
  vector[K] x = my_simplex_constrain(x_raw);
...
model {
  target += my_simplex_constrain_jacobian(x_raw, x);
}

There’s also the issue that this won’t work with initialization properly (where we use unconstrain to map user-supplied values to the unconstrained scale)—that’ll all have to be done by hand on the outside.
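
For readers who want to see what the three pieces compute, here is a minimal sketch in Python rather than Stan, so it can be run on its own. It assumes a stick-breaking simplex transform. The function bodies are not given in the post above, so everything here is an illustrative assumption, including the name `my_simplex_log_jacobian` and its single-argument signature.

```python
import math


def inv_logit(u):
    """Logistic sigmoid, mapping the real line to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-u))


def logit(p):
    """Inverse of inv_logit, mapping (0, 1) to the real line."""
    return math.log(p) - math.log1p(-p)


def my_simplex_constrain(y):
    """Map K - 1 unconstrained reals to a point on the K-simplex."""
    K = len(y) + 1
    x, remaining = [], 1.0
    for k, y_k in enumerate(y):
        # The log offset makes y = 0 map to the uniform simplex.
        z = inv_logit(y_k - math.log(K - k - 1))
        x.append(remaining * z)
        remaining -= remaining * z
    x.append(remaining)  # the last coordinate takes what is left of the stick
    return x


def my_simplex_log_jacobian(y):
    """Log absolute Jacobian determinant of the constraining map.

    The Jacobian of the first K - 1 outputs with respect to y is lower
    triangular, so the log determinant is the sum of the log diagonal terms.
    """
    K = len(y) + 1
    log_jac, remaining = 0.0, 1.0
    for k, y_k in enumerate(y):
        z = inv_logit(y_k - math.log(K - k - 1))
        log_jac += math.log(z) + math.log1p(-z) + math.log(remaining)
        remaining -= remaining * z
    return log_jac


def my_simplex_unconstrain(x):
    """Inverse map, as needed to move user-supplied inits to the raw scale."""
    K = len(x)
    y, remaining = [], 1.0
    for k in range(K - 1):
        z = x[k] / remaining
        y.append(logit(z) + math.log(K - k - 1))
        remaining -= x[k]
    return y
```

The unconstrain function is the piece that initialization would need: given a user-supplied simplex, it recovers the raw values, and constraining those raw values returns the original simplex up to floating-point error.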

P.S. Rather than get dinged for more inappropriate behavior, I’ll leave it to you to decide where this kind of discussion goes on the forums. I was just trying to report the minutes of the meetings here, not open a larger discussion on language features. So feel free to move this or even delete it.

OK, let’s move the meeting up a week, as I’d like to be able to attend, too, and nobody else responded. So the next meeting will be:

  • 7 October 2021, 10 am NY Time (1 hr before the Stan meeting)

I’ll revise the top post to match.

I agree that convenience is important. The intro I gave was probably a distraction and I should have said something more like “Hey, while you are discussing constraints, could you also consider this? No pressure, but it would IMHO kind of help.” and just avoided the rest. I agree that a meeting thread is usually not a good place for long discussions about design and that those deserve their own topics. I think the meeting thread could be a good place for people watching from the sidelines to provide brief feedback/wishes/thanks/… as I don’t think those would work well elsewhere. But I think that the meeting organizers are free to set up a mechanism that works best for them for getting feedback from the community, so it’s mostly up to you.

Thanks again for running the meetings.

If anyone would like to get the Google invite for the meeting, please let me know. I’ll set up a standing invite for the second Thursday of the month starting next month.

Please also let me know if there are agenda items for this meeting.
