Stan Governance

The scope of governance I am proposing has drifted from my original intent. I’ll try to state it clearly.

  1. We need a general decision making body for Stan independent of Columbia, NumFOCUS and various companies.

  2. This group is for all Stan decision making.

  3. On many successful open source projects the group consists of committers + valuable individuals.

  4. There are some standard guidelines for inclusion/exclusion in the group but I suggest that the group decide how it maintains itself and makes decisions.

This discussion has made it clear that there is concern that the set of committers is too big.

Options:

a) Michael’s proposal is about how we manage committers/pull requests/branching. My proposal is about how we run Stan. His proposal is a viable alternative if the scope is upped to running Stan.

b) My original proposed electorate of committers, perhaps culled a bit by some unspecified process + Andrew etc…

c) !New and Improved Electorate!: A set of volunteers that nominate themselves as the seed electorate. A variation would be giving Andrew veto power over nominees which would be his first and last act as the benevolent dictator.

Breck:

I’m not sure what such a body would decide. Do you have any examples in mind?

We don’t have the authority to give such a body precedence over Columbia or NumFOCUS when it comes to hiring and/or funds in those organizations. And those are the only two channels of funds we have at this point that are in any sense related to the project rather than to individuals. Similarly, we can’t give such a body precedence over individuals who get contacted about teaching or contracting gigs.

It might help to think of Stan as a more loosely confederated set of projects at this point.

I think the biggest and most impactful decisions we have to make are how to allocate employee resources, but again, this isn’t something this body can really do as the employees all work for someone else.

For the project, the big decisions are at the low levels (what functionality to add to language and math lib and/or algorithms and what to remove) and at the user-facing interface levels. Those are the things that affect the most people when they change because other things tie into them (Stan programs tie into the language, interfaces tie into the math and algorithms lib, and everything users do ties into either the math lib or interfaces). Projects like loo, ShinyStan, BayesPlot, etc. seem a little more flexible, though I can imagine BayesPlot also getting locked in through the higher-level interfaces.

Probably the most contentious decisions haven’t been about what to add but the design of the components. This is still pretty low-level from the organization perspective but should be included here.

Hey, just a moment! Github says I have reviewed 3 pull requests in 3 repositories under stan-dev in 2017 :)

Aki

This is a good one – or we can crib from many of the other big open source projects that have recently been adopting codes of conduct. In my opinion the most important point is that we make it clear that the entire open source project falls under some code of conduct and that we are trying to ensure a robust community (so that people will hopefully feel more welcome and willing to let us know if they are not).

Good question:

  1. Approve code of conduct. We have no mechanism for approving such a thing.
  2. Manage the committers assuming that is the electorate, more important for removal of someone.
  3. Assuming electorate is committers, then manage pull requests, features, architecture decisions.
  4. Make decisions to defend Stan trademark.
  5. Decisions around Stancons.

Those are a few.

1 and 4 are ok, 2 and 3 we want to manage by project, and we’re running this StanCon through NumFOCUS, though there is a decision on how to do this going forward.

Isn’t the trademark currently administered through NumFOCUS? (Not saying that must always be the case, but I’m pretty sure that’s the case now.)

Practically, StanCon needs to be run by a body with a credit card, so either NumFOCUS or Columbia makes more sense to me.

The trademark is held by NumFOCUS under the control of the Stan leadership body. We could have the leadership body vote via the decision of some other body, but then we’re playing musical chairs.

Columbia would be of no help in organizing a conference, but NumFOCUS is very helpful not only for managing finances but also little annoying things like insurance coverage.

The operative question here is what exactly do we want to open up and make more inclusive to help sustain and grow the project? In my opinion logo, conferences, general direction of the project, and even codes of conduct would not fare well under complete democracy and are more applicable to the BDFL model, or something towards that. What I think we really want to open up is contributions to the code base, both in terms of PRs and reviews, which is one of the reasons why I suggested a proposal focusing on governance solely for those responsibilities, leaving the rest up to the Stan cabal that we can work on opening up later if we deem necessary.

I have not considered seriously enough the “no formal leadership for Stan” option but clearly there is support for that.

My opinion is that we need formal leadership to set the direction of the project and it is easier to set up now than when we will inevitably need it in the future. Adopting a code of conduct is such a decision. Another possible example: an intellectual property disagreement would need formal leadership to respond. There is also a history of decisions dragging on.

A formal leadership body is also an important step for Stan to exist independent of Columbia in my opinion. If we do nothing then Columbia and ultimately Andrew are in charge by default. I don’t think Andrew is particularly interested in being the boss. We can stay this way if that is what people want though. Also, I believe that Stan should be independent because it increases the likelihood of success as more non-Columbia people get involved. Columbia + contributors is less compelling an open source story than Stan on its own with lots of Columbia support.

So, do we want formal leadership or not? This is not about managing the details of the code base.

Some things are under the direction of the Stan NumFOCUS board (e.g., taking care of the trademark, dealing with conference finances, code of conduct) and that’s fine but the decisions and rationale should at least be recorded somewhere. This doesn’t need to be onerous but it’s not going to work if the board decides stuff and only board members and people who talk to them know. At least it’s not going to work for me: it’s not transparent and that’s irritating.

I agree but this isn’t going to get anywhere until we agree what issues the leadership body would cover. So, for example, according to Michael, trademark stuff and choosing the code of conduct fall to the board (here they are: http://mc-stan.org/about/numfocus/index.html). What about these:

  • Somebody violates the code of conduct, who deals with it?
  • We need guidelines for when Dr. X wants to add a newfangled algorithm to Stan for bragging rights. Who writes them? Who enforces them?

Could we do a wiki where we divide this stuff up? It’s not just managing code details but I can see how some things get carved out for the Stan/NumFOCUS board.

@breckbaldwin: I wasn’t suggesting no top-level governance, just that the scope shouldn’t extend down to design of features in Stan.

@sakrejda: You bring up a good point about division of labor.

The NumFOCUS leadership body for Stan is charged with voting on disbursement of funds out of NumFOCUS. Now the confusing thing is that people donating to the project don’t necessarily care about NumFOCUS, they just want to donate to the project.

The trademarks for the name and logo are held by NumFOCUS.

As far as the conference goes, Michael convened a committee (after asking for volunteers) and they’ve done all the work without much outside input. The remaining money after last year’s StanCon was contributed to NumFOCUS and NumFOCUS is acting as the financial guarantor for Stan for the conference.

I’ve been apprehensive about these codes of conduct for exactly the reason you bring up—it’s not clear what the enforcement mechanism is. I don’t think code of conduct for the project should be a NumFOCUS issue.

I don’t know how we could create general guidelines for adding things. As we discussed in the meeting today, there’s no set of criteria that if you tick off, you get your code in Stan. Everything’s complicated enough that we have to make decisions on a case-by-case basis. The proposals floating around now are that we designate committers (my preference is to do this by repo, not overall) and let them vote on issues of what gets in or doesn’t in that particular piece. As to who writes things, that depends. We had this go round with both ADVI and L-BFGS, where Daniel wound up doing a lot of backfill on testing and code refactoring before we’d merge them. In both those cases, we decided the effort on our part would be worth it. The problem we have with assigning work to people is that the 7 people at Columbia all technically work for Andrew (or so he tells me). That gives him the final word on what we do. In practice, it’s a free for all, and it’s pretty much impossible to make any of us do something.

Sure, but the case-by-case decisions are simpler if we agree on guidelines. We’ve laid them out verbally; it wouldn’t take much more to write them down and, for example, point people to the test suite of models we have. It wasn’t so long ago that densities were “write it and it’s in”, basically. I think that ground to a halt with the half-maintained Brownian bridge density, where we had a good example that there’s a serious cost. I wrote the bullet points of the guidelines we talked about today (for densities) in the Wiki: Introduction to Stan for New Developers · stan-dev/stan Wiki · GitHub

As before I agree with this. I think only stan-dev/stan and stan-dev/math are going to have a bigger group involved on a regular basis.

I would leave that in the past. L-BFGS seems like it was a good choice and ADVI wasn’t. Now with guidelines and by-repo consensus/voting I think we could make better choices. What will help is having a process and having guidelines we can refer people to. If somebody comes along with an algorithm so good we want to write it into Stan for them that’s fine. I bet that choice is going to be driven by things like who funds your time and how much you like the algorithm and that seems appropriate.

[quote] The problem we have with assigning work to people is that the 7 people at Columbia all technically work for Andrew (or so he tells me). That gives him the final word on what we do. In practice, it’s a free for all, and it’s pretty much impossible to make any of us do something.
[/quote]

Sure but lots of people contribute to open source projects as part of a job funded through something else. Stan is going to be less attractive to outside contributors if there’s one set of rules for Andrew’s algorithms and another stricter set for everybody else’s (for example). Having heard him (and others from Columbia) talk about it I don’t really think this is going to be an issue.

Nothing was ever “write it and it’s in”. We just didn’t have many developers and they were exercising reasonable judgement.

Guidelines should be to first run a spec by the voters for the repo or project before putting in a ton of work that might get rejected. After the decision’s been made that a feature would be good to have, the rest is just code quality concerns.

I don’t think any guidelines in the world would’ve changed our decisions about L-BFGS and ADVI given the information we had at the time. It’s often only in retrospect that these decisions become clearer. Andrew thinks he’s making headway on fixing ADVI.

Agreed on the last point. Andrew can tell us what to do on our jobs, but is in no way in charge of what goes into Stan. The thing I worry about more is that those of us at Columbia have much more chance to have off-list discussions in person. What I don’t want to do is collude at length, then drop something on everyone else without input. Again, I’m hoping this just won’t be a problem.

I agree with this. Being NumFOCUS affiliated means that we need a code of conduct (see here). Any decent COC will have to have a “things that could happen” list. The one linked outlines unacceptable behaviour and its consequences, and gives an avenue for complaint.

So while I don’t think it’s NumFOCUS’s job, I also don’t see any point in voting on it. We need to have one (to be NumFOCUS affiliated), so it should just be adopted from on high (following that model, with 2 or 3 diverse people monitoring the complaints email, who will assess the situation and make recommendations to the governance structure that will hopefully be set up by the time we need to deal with this).

I strongly agree with this. For some people it makes sense to be in multiple groups (and, for example, the interface groups should work together), but just because you committed to stan-math doesn’t mean you should be able to contribute to pystan. It’s a big project and different people have different skill sets.

It would be useful, however, to have a shorter joining process (i.e., no probation) for people who are already in one committer group.

To my mind, this is the challenge with the model that @breckbaldwin proposed. At best Stan is run in 3 separate parts (Columbia, NumFOCUS, Stan community) that can’t control each other. Whatever governance structure is developed needs to take that into account (and it doesn’t yet, as far as I can tell).

So it’s better to focus on what the open source governance structure can do that doesn’t rely on the other two arms. I guess the main thing is that if it controls the repos, then they can refuse commit privileges and not merge PRs, which seems about right to me.

This is true, but it also might be hindsight. Would something like the following work as a minimum standard? Not unlike the “how to make a PR” documentation, passing these ‘tests’ doesn’t mean you’ll be accepted, but failing means you probably won’t.

  • Code complies with project standards
  • Complete documentation exists
  • Extensive unit tests exist
  • Use-case recommendations exist (including limitations)
  • Comparisons with other algorithms on real and synthetic problems
  • A framework for the assessors to build new test cases
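
As a rough illustration of how a checklist like this acts as a screen rather than an acceptance rule, here is a minimal Python sketch. The `Submission` class, its field names, and the `screen` function are hypothetical names made up for this example; nothing like this exists in the Stan repositories as far as this thread says. Passing the screen only makes a contribution eligible for the usual case-by-case discussion.

```python
from dataclasses import dataclass, fields


@dataclass
class Submission:
    """Hypothetical record of which checklist items a contribution meets."""
    meets_code_standards: bool
    has_complete_docs: bool
    has_extensive_unit_tests: bool
    has_use_case_recommendations: bool
    has_algorithm_comparisons: bool
    has_test_case_framework: bool


def missing_requirements(sub):
    """Return the names of checklist items the submission does not meet."""
    return [f.name for f in fields(sub) if not getattr(sub, f.name)]


def screen(sub):
    """Failing any item means probable rejection; passing all items
    only moves the submission on to human review."""
    missing = missing_requirements(sub)
    if missing:
        return "likely rejected: missing " + ", ".join(missing)
    return "eligible for case-by-case review"


complete = Submission(True, True, True, True, True, True)
partial = Submission(True, True, False, True, True, True)
print(screen(complete))
print(screen(partial))
```

The design point is that `screen` never returns "accepted": the items are necessary conditions only, and the judgment calls stay with reviewers.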

I forgot that we needed a code of conduct for NumFOCUS. This feels like NumFOCUS overstepping their bounds, though with good intentions of course. If they put too many constraints on us, we may have to break off and form our own foundation.

I don’t actually think Columbia has a role in running Stan other than indirectly because they pay a bunch of us and hold the copyright to a bunch of the code. In that way, Columbia’s no different than Metrum, Novartis, University of Michigan, etc.

NumFOCUS does have some practical role because we use it for general funds outside of Columbia, so it’s generally like community money to spend. Most of it’s pretty tightly earmarked for either programmers or conferences or equipment (I think we have less than $30K that’s not earmarked).

I would very much like centralized control rather than splitting things between the Stan community and the NumFOCUS (I want to do an Andrew and respell their name as “Numfocus”, but it’s just not in my nature).

There’s no limit to the size of the NumFOCUS leadership body, only a constraint that a majority of members are not from the same institution. So there’s no reason we couldn’t in principle route everything through the NumFOCUS leadership body. The question’s then just who to put on and how unwieldy will it be when we need majority votes to pay invoices?
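
To make the membership constraint concrete, here is a small Python sketch of one reading of it: no single institution may hold more than half the seats. That reading, the function name `no_institution_majority`, and the member labels are assumptions for illustration, not NumFOCUS's actual wording or tooling.

```python
from collections import Counter


def no_institution_majority(members):
    """members maps a member label to an institution.

    Returns True when no single institution holds more than half the
    seats (an assumed reading of the constraint described above).
    """
    if not members:
        return True
    counts = Counter(members.values())
    return max(counts.values()) * 2 <= len(members)


# Hypothetical rosters; the labels are placeholders, not real board members.
balanced = {
    "member_1": "Columbia",
    "member_2": "Columbia",
    "member_3": "Metrum",
    "member_4": "Novartis",
    "member_5": "University of Michigan",
}
lopsided = {
    "member_1": "Columbia",
    "member_2": "Columbia",
    "member_3": "Columbia",
    "member_4": "Metrum",
    "member_5": "Novartis",
}
print(no_institution_majority(balanced))
print(no_institution_majority(lopsided))
```

Under this reading the body can grow without limit, but every addition from one institution has to be balanced by seats elsewhere.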

Well put. They’re necessary but not sufficient conditions. We constantly get into “debates” about what constitutes compliance with project standards, what constitutes complete doc and how extensive unit tests should be. I think adding use-case and anti-use-case recommendations is a great idea.

As far as diversity, we have exactly one woman developer and I don’t think we have any “underrepresented minorities” by NSF’s measure. Our NumFOCUS board is more diverse, so maybe we could tap that.

@sakrejda: while it’d be nice to have all of the bits (lpdf, lcdf, lccdf, rng) for every distribution we add, I don’t think we should make it a requirement (just my opinion of course, and it contradicts what Michael suggested in the meeting). Ideally, we’d like all those things if they’re possible.

So what exactly would we do if there’s an incident that would have been covered by a code of conduct? Assuming that we don’t coward out and ignore it entirely, at the moment we’d be making stuff up as we go. In other words, a code of conduct without explicit enforcement policy built in is not that different from what we have right now.

The main point of a code of conduct is to communicate to the Stan community our intentions to maintain an inclusive environment, demonstrate various activities that we consider unacceptable, and direct people to where they can send complaints. It establishes expectations and facilitates people sharing any issues they have so that we learn about any problems early and have more time to deal with them than we would otherwise.

Explicit enforcement would be great eventually, but it should absolutely not obstruct the adoption of a code of conduct.

Technically I count, but I do not think that was the diversity to which Dan was entirely referring. Having contact information for multiple people at different universities, countries, even outside of the main Stan development team, gives people options about whom to contact in case they may fear retribution from one or more of the named “complaint” contacts.

While we may not have anything as concrete as a proposal, we do agree on some elements.

Firstly I think it’s fair to say that we want governance over the direction of the project, in particular decisions about which designs to utilize and which contributions to accept. Secondly, I think everyone who has commented has agreed that we want such governance to be separated across the different repos (this would be especially good if we pull language and algorithms into their own repositories).

At this higher level I argue that we don’t want to be entirely democratic. For one, there is lots of trend chasing and while having explicit requirements for any contribution will filter much of that out it won’t filter all of it out, and any completely democratic system will leave us with regrets (a la ADVI). Plus, any such requirements have to be established (and potentially evolved) so we’d be in a chicken-and-egg situation.

We nominally agreed on repo-by-repo benevolent dictators for the time being, although we haven’t really stressed that yet. Is there a governance strategy where each repository has a convener that directs the repository and a procedure for adding members below them? Then in voting circumstances (which, yes, we’d all like to avoid) the convener would hold more votes (3-1? 5-1?).
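
To see what a 3-1 weighting would mean in practice, here is a minimal Python sketch of a weighted tally. The function `tally`, the voter labels, and the options are all hypothetical; this is one possible way to count such a vote, not an agreed procedure.

```python
from collections import Counter


def tally(votes, convener, convener_weight=3):
    """Return weighted totals per option.

    votes maps a voter label to the option they chose. The convener's
    ballot counts convener_weight times; every other ballot counts once.
    """
    totals = Counter()
    for voter, choice in votes.items():
        totals[choice] += convener_weight if voter == convener else 1
    return totals


# Hypothetical ballot for one repository.
votes = {
    "convener": "reject",
    "member_a": "accept",
    "member_b": "accept",
    "member_c": "reject",
}
result = tally(votes, convener="convener", convener_weight=3)
print(result.most_common())
```

With these ballots the unweighted count is tied 2-2, while the 3-1 weighting gives 4 for "reject" and 2 for "accept", so the weight mainly matters as a tie-breaker in small groups and becomes close to a veto as the weight grows.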

Of course this doesn’t resolve many of the actual problems that we’ve had in the past, such as the design of the C++ API which crosses over the individual repositories, so we’d need some higher-level governance as well.

We have much stricter guidelines than this in practice and we should formalize them. For algorithms we’ve basically said in meetings that if something can’t pass a test suite that characterizes which models it works for it’s not getting in. We’ve also basically said that an algorithm that doesn’t fail hard is unacceptable. I realize these are not hard rules but they’re much more direct than what you are suggesting.

Good judgement even. I’m sorry if it sounds like I was putting down the previous process. I think we have a great dev team and we should learn from our mistakes. I mean, as long as we have some we should use them.

It’s really not for the benefit of people on the dev team, it’s more so that it’s easy to understand what the dev team expects, more or less.