Stan Governance


#21

@bgoodri brings up an important point.

Stan consists of multiple projects from the math lib up to ShinyStan.

The bigger question is do we want to manage them all together like one big project or manage them diffusely?

Personally, I’m much more concerned about governance for the math lib and for the Stan language and for the algorithms than for the interfaces or packages that don’t even depend on the rest of the Stan infrastructure, like ShinyStan or BayesPlot.


#22

That’s a good point, I wouldn’t even begin to assume to have much say about rstan/shinystan, although I’m happy to throw my opinion about Rcpp usage out there and help implement stuff with it.


#24

It's excellent that we are getting traction with 20 or so comments, but with only 2-3 votes and some abstentions, I don't think we are ready to decide. I’ll try to summarize the commentary:

  1. Amended by @sakrejda original proposal: The electorate for Stan decision making is the set of committers in any repo on the stan-dev organization. Membership in committers is managed by the committers. The ultimate decision making is done by voting. The penultimate decision making process is consensus.

  2. Reduce/clean up electorate, @bgoodri, @ariddell, @Daniel_Simpson, @sakrejda. We have something around 41 committers at this point, which is too big in the estimation of the commenters.

  3. Expunge committers that are not recent contributors @ariddell. This would take out Andrew, notes @sakrejda.

  4. stan-dev organization != committers on stan-dev repos @ariddell.

  5. Governance matters most for math lib, algos and Stan language @Bob_Carpenter. Focus on the core of Stan.

  6. There is a “don’t mess with decision making” meme if you are not a part of the team in algo, rstan, pystan and so on. My suggestions that the electorate will be well behaved are not convincing.

  7. Add anything that I misunderstood or plain missed please.

My goal here is consensus–I ran a small software company and I rarely operated outside of consensus by my own estimation. It at least was my goal.

It looks like most of the concerns are about the members of the electorate but interestingly no suggestions that we not have an electorate.

I also sense that folks expect there to be a lot of voting. Remember that voting is a bad outcome. This worries me.

It is pretty clear that folks want to clean up the electorate. I would like some suggestions about what you have in mind.


#25

I agree with the spirit of the proposal, but I also think voting rights based on having committed in the past doesn’t work.

(I don’t have a great proposal that would work or else I’d put it up.)


I’m just going to comment on different things in the thread below here.

GitHub allows us to have repo-level permissions.

We’re already at a point where we value non-git-committers. “Maintainer” doesn’t seem right and neither does “developer” because @andrewgelman and @avehtari aren’t either in the software sense. I think it may be useful to have a dedicated discussion around this point. At least having the rough spirit of what we want to do written down would be useful.

+1. @bgoodri brings up a great point here and Fogel also mentions it as “partial” vs. “full” voters. I’m with @Bob_Carpenter in wanting to be much more inclusive, but I’m happy with that being for “partial” voters and having the “full” voters be a little more controlled.

I really hope you have 0 monkeys and if you do, I really hope they’re not in circuses.

@Daniel_Simpson, just wanted to say that this isn’t true of the whole Stan team. I agree with the stability, usability, etc., but I don’t believe that Stan should discourage experimenting with algorithms. I’ve given a handful of talks describing encouraging experimentation and describing what access you have in C++. So… if you’re getting the vibe that the team discourages experimentation, maybe we could address that in a different way. Not everything will make it to the Stan repo, but we shouldn’t be discouraging people from doing whatever they want to try.

+1.

Pragmatically, it makes sense to manage them diffusely. I think the project is diverse enough where different parts of it may want to have different ways of managing it.

I’m concerned with governance for the other projects, but I think there are fewer concerns where there’s usually a smaller group working on it. (Or, the questions of governance aren’t visible to those not directly working on it.)


Honestly, for the most part, our group of devs have been awesome to work with! Most of the time, the path is obvious and whatever we choose for governance will work. Some of the time, I’m sure a lot of the devs will not get involved even if they have an opinion when they care less about an issue. Bob’s alluded to it, I do it all the time, I’m sure others do too. I think we are able to recognize that the project moves forward by having do-ers and that a lot of little decisions can be undone if they end up poorly.

We really need to have the governance for the worst case scenarios where there’s deadlock. I don’t think I’ve seen discussions that seem like deadlock outside of the core libraries. Ideally, we’re able to do something positive with minimal stress (good technical discussion should be encouraged). That’s what we’re lacking now.


#26

In the past, problematic discussions have often meant that people who disagree with where things are headed become silent (at least on-list or on the Stan calls) rather than participating. That may be where the expectation of voting comes from. I think that’s a well-studied problem with consensus in general, that you basically can get there by peer pressure even if you’re effectively shutting people out of decision making.


#27

I’ve been following the discussion. For me and many others it’s useful to be labeled as a Stan development group member so that we can write that in grant proposals etc. For that part we don’t need to be part of the electorate. I think it’s good to have an electorate even if most of the time there’s no voting. I don’t have a strong opinion on the size of the electorate. I’m fine being a silent partner (joke about Finns being the quiet type) without a vote. I trust I can influence Stan by doing good research :)

Aki


#28

The core proposal is solid, I think. Thanks for all your work on this @breckbaldwin.

I wonder if there isn’t an existing governance setup out there, potentially one documented on github, which is aligned with the proposal. I found jupyter’s governance repo, https://github.com/jupyter/governance , with a little searching. I find the documents way too verbose but otherwise they’re not bad. Maybe there’s a better example out there.


#29

I agree with Breck that it’s good to avoid voting. We had this problem with Stancon, where Michael had a plan of holding the conference at that hotel in California, and there was lots of concern, and Michael then pushed everyone to vote. This was not the whole Stan organization, it was the Stan Numfocus board.

Setting aside the details of that decision, that experience did make me worry about a setting where voting is used as a substitute for discussion. That is, in that case, the possibility of voting actually made it more difficult to attain consensus.

I’ve generally been pleased about our consensuses. Knowing that we need to get to consensus helps our conversations move forward, I think.

Regarding the “who gets to vote” issue: I’m not so sure that matters, in that I can’t imagine we’d have a key issue decided on a 28-17 vote, or whatever. This relates to Bob’s concern about issues where two developers have different views of how to go forward. In such settings, I think it would make more sense to bring in core people one by one to build a consensus, or come up with a third option.

To put it another way, I think there’s a limited amount of power that can be held by a large group. I could see the group of 40 or so developers being relevant for conducting a “straw poll,” and for that matter it could also be useful to think about polling users on occasion, not as a way to make decisions but as a way to inform the decisions we make.


#30

I believe that this conversation has become confused because the scope of the initial governance proposal was not properly defined. At the Stan developer meeting a few weeks ago we had a long discussion about governance and ultimately decided to separate out the long term guidance of the project (nominally to be run via some board of benevolent dictators) from the developer logistics. Anyone please correct me if I am wrong.

In particular, the specific governance being discussed here was about the developer logistics alone. This means it has nothing to do with politics (being able to claim that you’re a Stan developer for a CV or grant proposal) or deciding between overarching designs or features. The issue at hand here is simply who is allowed (or rather who do we trust) to review and approve pull requests? Given how GitHub permissions are currently set up this also has the side effect of deciding who can create branches within the Stan repositories.

Let me try clarifying a few things admixed with what I propose as a governance model consistent with already stated opinions.

What do we call the cohort of people who can review and approve pull requests?

Stan developer sounds correct to me, although if we formalize this then it would nominally prevent anyone who isn’t currently reviewing pull requests (Aki and Andy, for example) from calling themselves Stan developers.

  • Option 1: We call the cohort something else, like “maintainers” or “committers”.
  • Option 2: We call the cohort “developers” and have a separate name for collaborators, such as “advisors” or “collaborators”.

The initial cohort would be chosen by fiat, likely people around Columbia. People can then be quickly added (see the proposed procedure below).

Where is this cohort defined and managed?

  • Option 1: A team of the stan-dev organization, such as the current “Maintainers” team.
  • Option 2: The entire stan-dev organization.

Note that any team must be a subset of the stan-dev organization, but going with Option 1 would give us additional flexibility, for example not giving everyone permissions on all repos (in particular the Jenkins repo which contains sensitive passwords).

How are people added to this cohort?

Anyone not in the cohort can still contribute to Stan, just not exactly in the workflow that we currently have. Outside contributors would have to fork the target repository instead of branching, and then submit a pull request from that fork. Jenkins would not automatically run CI on these requests to avoid overwhelming our system; instead a member of the cohort would have to trigger the CI by hand (likely the person doing the initial review).

But how do contributors become members of the “developer” cohort where they can create branches and get auto CI and review and approve other pull requests? This was nominally the original issue at hand. May I propose the following:

1. Nomination

Anyone who has submitted a pull request (from a fork) may be nominated for inclusion into the “developer” cohort by any member of the cohort.

2. Probationary Approval

Any nominee who receives N number of votes is given probationary approval where they are added to the cohort with all permissions to create branches, trigger CI, review and approve pull requests, but not vote. I imagine that most members of the cohort would abstain via nonresponse, so we should set N relatively low, such as 3 or 5.

3. Full Approval

After the probationary period of P months another vote is taken, and if the nominee receives N votes again they become permanent members of the cohort with full voting privileges. I think we want this period long enough for the nominee to engage with the cohort and do a few reviews, maybe P = 6?

This step is here in the rare instance where a nominee clashes significantly with the existing cohort, giving us a way to remove someone who may be acting poorly without having to “vote them off”.

4. Code of Conduct

If we are formalizing governance then I think we should also add a developer/project code of conduct just to acknowledge that we want to maintain a civil and inclusive environment.

5. Membership Expiration

A final consideration is whether membership in this cohort might expire, for example after 2 years without having submitted or reviewed a pull request. This is optional but I think should be considered as after 2 years our code will likely have changed enough that members who have not engaged in a while may no longer be sufficiently familiar with it.

To simplify the proposal we can also drop (5) entirely and worry about it later.
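
For anyone who finds it easier to read the steps as code, here is a rough sketch of the membership lifecycle described in (1) through (5). It is only an illustration, not an agreed design: the names (`Member`, `Status`, `nominate_and_vote`, and so on) are made up, N = 3, P = 6 months, and the 2-year expiry are just the placeholder values floated above, and sending a nominee who fails the second vote back to outside-contributor status is one assumed reading of step 3.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum
from typing import Optional


class Status(Enum):
    CONTRIBUTOR = "contributor"    # outside the cohort, submits PRs from a fork
    PROBATIONARY = "probationary"  # branch, CI, and review permissions, no vote
    FULL = "full"                  # all permissions plus voting
    EXPIRED = "expired"            # lapsed after long inactivity (optional step 5)


# Placeholder values only; the proposal leaves N and P open.
N_VOTES = 3
PROBATION = timedelta(days=182)    # roughly P = 6 months
EXPIRY = timedelta(days=730)       # roughly 2 years


@dataclass
class Member:
    name: str
    status: Status = Status.CONTRIBUTOR
    has_fork_pr: bool = False               # step 1 requires a PR from a fork
    probation_start: Optional[date] = None
    last_activity: Optional[date] = None    # last PR submitted or reviewed


def nominate_and_vote(m: Member, votes: int, today: date) -> Status:
    """Steps 1-2: a nominee with a fork PR and at least N votes enters probation."""
    if m.status is Status.CONTRIBUTOR and m.has_fork_pr and votes >= N_VOTES:
        m.status = Status.PROBATIONARY
        m.probation_start = today
        m.last_activity = today
    return m.status


def confirm(m: Member, votes: int, today: date) -> Status:
    """Step 3: once the probationary period has passed, a second vote decides."""
    if m.status is not Status.PROBATIONARY:
        return m.status
    if today - m.probation_start < PROBATION:
        return m.status  # too early for the second vote
    # Assumption: a failed second vote returns the nominee to contributor status.
    m.status = Status.FULL if votes >= N_VOTES else Status.CONTRIBUTOR
    return m.status


def expire_if_inactive(m: Member, today: date) -> Status:
    """Step 5 (optional): full members lapse after the expiry window."""
    if m.status is Status.FULL and today - m.last_activity > EXPIRY:
        m.status = Status.EXPIRED
    return m.status


def can_review(m: Member) -> bool:
    return m.status in (Status.PROBATIONARY, Status.FULL)


def can_vote(m: Member) -> bool:
    return m.status is Status.FULL
```

Dropping (5) would just mean never calling `expire_if_inactive`, and dropping the probationary period would mean merging `nominate_and_vote` and `confirm` into a single vote.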

Any suggestions on the exact details here, such as which GitHub grouping to use, the name of the cohort, the number of votes, and the length of the probationary period? Or just general comments?


#31

Michael:

This proposal seems over-engineered to me.


#32

@andrewgelman :
It has all the things that are needed. A definition of the group, a way in, a “sense check”, and a way out. I’m not sure it could be simpler (without a benevolent dictator who can fix things in the event that they break). Well maybe the probation isn’t necessary (and without that this mirrors the numfocus structure), but it seems like a decent idea. The obvious downside to it is that there needs to be 2 votes per new developer, which might get tedious if there are a lot of developers.

The only thing that I disagree with is the initial definition of the cohort. There’s no reason “developers” can’t be (Andrew + Aki) + (People who currently review PRs). It doesn’t make sense not to include them, but they aren’t really a generalisable category, so it’s not worth expanding the definition so that people like Aki and Andrew are included. Should a new Aki/Andrew come along, they can be nominated by standard procedure.


#33

I’m fine with this proposal, especially if it can be simplified a bit
more. I think we might be able to get away with sections 1 and 2
(merging 2 and 3).

Here’s my reasoning. The ability to fork when stuff goes wrong and the
fact that the Stan trademark lies in the hands of a different
organization allows erring on the side of being inclusive. We could, I
think, do without the probationary period and the expiration clause. (In
defense of the expiration clause, transitioning devs to alum status
could be fully automated.) I think we also should remove the code of
conduct clause because it applies to everyone, not devs in particular.
It applies to conference attendees, people submitting issues and PRs, etc.

“Developer” is a good name.


#34

Thanks for laying this all out so clearly, Michael.

Yes, I thought this was exclusively about code.

Michael’s proposal sounds largely reasonable to me and not at all overengineered. Here are my answers to his specific questions:

  • What do we call these people?

    • I like “reviewer” as it’s most direct and descriptive (we could go with “editor” if we want to make it uber-academic); I don’t like “maintainer” as it seems too backward and I don’t like “committer” as that’s different in the Git world and that’s not even what we’re talking about
    • Let’s leave the “developer” list the vague amorphous mass it is to cause as little disruption as possible
  • Defined and managed

    • team basis, and we should break the teams down largely by repos
  • How added

    • I like this as is
    • we can just make approval a majority of those voting
    • 2 years sounds reasonable to me
  • Code of conduct

    • yes, please—this is really separate from membership in the reviewer club

#35

I think Michael’s proposal seems fine, I’d go for both option 1s.

Andrew’s post about calling a vote as a way of cutting off discussion raises a good point. But on the other hand, calling a vote could serve to end a filibuster-style war of attrition… Hopefully we don’t need full senate rules. I’m not sure what to do about this except include some language suggesting that these procedures are last resort?


#36

To be clear, I am advocating for putting in a code of conduct for the entire project, as others have suggested, and just using this as an excuse to do it. If there is any debate about it we can leave it to a separate discussion and move forward here alone.


#37

The only voting rules we are discussing here are votes to include people into the developer cohort, and there is a vote really only to define a minimum threshold of developer support (i.e. to avoid somebody getting in because one person approved and everyone else was too busy or didn’t comment in time).

I don’t expect that we’d ever have to worry about contentious issues here – those would all be the responsibility of whatever guiding body we put together, the discussion of which we’ve punted for later.


#38

OK, all sounds good to me. Thanks, all, for clarifying.


#39

@betanalpha Code of conduct is interesting and important. Do we have one?


#40

@breckbaldwin I may be wrong, but as a numfocus affiliated project, whatever part of Stan that covers probably has this COC https://www.numfocus.org/about/code-of-conduct/

If not, it’s a fairly standard (in a good way, not in an indifferent way) COC.


#41

We appear not to be ready to vote/consense, so I am dropping the previously suggested deadline of today. I’ll comment more in the next post to just keep things modular.