Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

For comment: Roslyn team SLA for addressing community PRs and bug submissions #26266

Open
MattGertz opened this issue Apr 19, 2018 · 23 comments
Open
Assignees
Labels
Area-External Contributor Pain The issue impedes progress for project collaborators. Discussion
Milestone

Comments

@MattGertz
Copy link
Contributor

MattGertz commented Apr 19, 2018

It's been nearly four years since we moved Roslyn to open source, and during that time we’ve worked on refining our processes and infrastructure to better enable and support it. Much of that effort has been focused on the CI (continuous integration) train, particularly focused on automated testing.

One thing that we recognize that we haven’t done as well as we’d like is the processing of community-submitted PRs and bugs. In fact, we have over 200 unprocessed PRs, many of which are quite old, along with rather more bug reports which are still open. We’ve attempted several times to impose more rigor on getting these processed. However, for various reasons (but ultimately due to the availability of engineers to drive through the backlog vs. other high priority work in Visual Studio), we haven’t been able to sustain those efforts. We want to do better here.

In discussing this in one of our weekly Roslyn leads’ meetings, we agreed that previous attempts to make progress here were too ad hoc. We further agreed that this type of community support can only succeed if we bake in time to each sprint to address it. To that end, we want to start by creating an SLA to which we would hold ourselves accountable by setting goals and reporting progress against them in a transparent way each sprint, which then would allow us to better build in cost for support each sprint after a few sprints’ worth of data.

What follows is a draft of that plan, reviewed by the leads of the Roslyn team and cross-checked by managers in peer organizations. In the spirit of open source, we would like community & Roslyn team feedback on it before ratifying it.

PRs:

While the Roslyn team has the right and obligation to move Roslyn in directions which might conflict with community-submitted PRs, we should not leave PRs hanging without any feedback at all. We now have accumulated a lot of PRs, many of which are old and possibly not even relevant to the state of the codebase. To address this, we are considering the following:

Existing PRs:

  1. We would reject all current PRs older than six months, with apology. This does us no credit, but it is a realization that the codebase will have changed significantly with regard to many (but not all) of the PRs involved. Those PRs would be rejected “without prejudice,” meaning that the owner could re-submit them against the current state of the code, updating the PR as appropriate.
  2. The remaining backlog of existing PRs would be triaged against the process below as if they had been newly submitted.

New PRs:

The lead developer for each area of the repo will do first-pass triage community PRs within a week of submission for the PRs in their area. (In practice, this will probably mean that a lead will set aside one block of time each week for this.) The first pass will result in the following visible status update within that week:

  1. Approved by Lead: The PR clearly meets the bar (i.e. it fixes a meaningful problem, has associated unit test(s), and does not cause a deviation from planned product direction).
    • The PR will be absorbed into the next available branch which is not in a stabilization phase.
    • The PR will be assigned a Microsoft “buddy” on the team who will help mentor it through. Besides helping with code and process, the buddy will look for any additional testing needs (our existing test scenarios may not cover this fix and it should not be assumed that the new/changed scenario will be covered in our daily testing) and also look for additional regression prevention required that cannot be fulfilled through unit testing
    • If completion is blocked by unforeseen complications involving the team’s internal infrastructure or code collisions, the buddy will keep the submitter informed, loop them in to address issues, etc.
    • During that PR completion phase, the PR buddy will get back to the contributor within three business days to review any follow-ups (the buddy will delegate that to a fellow team member if away).
    • Conversely, if the contributor has been asked for updates or clarifications, and the buddy has not heard back from them in one month, the PR will be assumed to be abandoned (and thus rejected) unless other time limits have been agreed upon with the PR buddy.
  2. Pending discussion: There are questions about the PR that need to be resolved – code needs to be changed, unit tests are missing, is unclear on whether it is in the direction we want to take the codebase, or we need clarification on the purpose of the PR.
    • The lead will make it clear as to what needs to be done or answered.
    • A “buddy” will be assigned to assist the contributor if they have any questions, but the onus is on the contributor to follow up on any questions/concerns/code within two weeks from the last question unless otherwise agreed upon.
    • PRs need to keep up with the rest of the codebase during this time.
    • Discussions will always take place in the GitHub issue except in the rare case where some part of the discussion needs to happen internally due to legal issues that we cannot control (e.g., code change collides with undisclosed Microsoft IP).
  3. Rejected: The PR is not in the direction we want to take the codebase, has too much risk, or does not solve a problem of meaningful priority. (Or, in the case of the existing backlog, the PR needs to be brought up-to-date.)
    • The lead will make it clear why the PR is being rejected.
    • PRs can be resubmitted “without prejudice;” however, realistically such PRs would not be approved unless significantly changed (or, in the case of the existing backlog, updated).

PR report:

The Roslyn leads will each produce a brief report each sprint which shows how we are tracking against PRs for their area (PRs completed in bounds, PRs in discussion, PRs out of bounds, PRs rejected). These areas include Productivity (IDE/Debugger/Analyzers), Compiler & Language, Project System, and .NET Core CLI/SDK.

Bugs:

Reported bugs from customers (w/o PRs) in GitHub will be treated identically to any other bug, whether reported from customer via the Visual Studio Feedback mechanism, discovered internally by the team, etc. Specifically:

  • During the normal daily bug triage, each lead will prioritize the bug.
    • High priority bugs (crash, hang, data loss, regression, security, major malfunction) will get triaged to a Microsoft developer and put in their queue to address in the current milestone.
    • Small malfunctions will be marked as such and slotted/assigned when developers become available, and will be moved to a backlog milestone mapping to no particular release.
    • Minor issues (spelling errors, fit-n-finish, small issues with easy/obvious workaround) will be marked as “Up for Grabs” for community PR contributors, and will be moved to a backlog milestone mapping to no particular release.
    • Even if a bug is not marked “Up for Grabs,” we welcome community contributions on it (but if it’s assigned to someone else, check with them first) regardless of priority.
    • In triage, bugs may also be marked as “Not repro” (which can be cleared if a repeatable repro is found) or “Won’t Fix” if the lead determines that the bug has a valid reason to be punted (too risky, code is scheduled for obsoletion, etc.) (Although we are using the term “bugs” here in this SLA, the same process would be relevant to suggestions for improvements, etc.)
  • The amount of bugs that the team can fix is constrained by the availability of engineers as well as shipping schedules. (Community contributions help with that, but can’t totally offset that.)
  • Because of this, we gradually amass some non-trivial number of minor bugs that we are unlikely to ever get to, and which become increasingly irrelevant as the product changes. This, in turn, makes it difficult for the team to understand our actionable bug debt when scheduling.
  • Therefore, the team periodically clears out (“Won’t fix”/”Closed”) the collection of minor bugs which haven’t met the bar for fixing.
    • We don’t have a hard-and-fast schedule for doing such a pass through the bugs, but the rule of thumb would be about once a year.
    • These “won’t fixes” can be reactivated if subsequently discovered to be more important, on a per-instance basis, but in general, closing an issue will mean an end to any consideration of it.

Please discuss. The goal is to have a plan in place by May 15, 2018.

@MattGertz MattGertz added Discussion Contributor Pain The issue impedes progress for project collaborators. labels Apr 19, 2018
@MattGertz MattGertz self-assigned this Apr 19, 2018
@MattGertz
Copy link
Contributor Author

I should note that we've already begun the work on identifying the older PRs that need rebasing and looping back to contributors, so as to get that in front of contributors as early as possible (thanks, @jaredpar !).

@CyrusNajmabadi
Copy link
Member

@MattGertz Sounds great. My pieces of feedback (which may be covered above, but i missed it), are around the expected communication times provided. Specifically, that when the SLA allows for long amounts of times (for example, multiple weeks), if the buddy could still check in on somewhat of a reasonable bases (maybe weekly would make the most sense) just to confirm things are on the radar and are proceeding as planned.

Right now there's a large need for contributors to continuously 'ping' the team to try to get eyes on a PR. I've had to do it many times myself, often waiting several weeks between pings to not be too annoying, but finding it necessary because there has been zero communication in the other direction.

--

Also, while i firmly expect other work of the 'buddy' to be important and mentally consuming, i would impress on the value in being to have quick turnaround and communication on a PR. For example, several of my own PRs have been difficult to deal with as i've only gotten eyes on them sporadically over the months. Each time that happens, i myself have to 'page in' all my understanding and context, and it makes the work much harder. While i would never expect that these PRs occupy too much of a buddy's time, i think it would be appreciated if buddies helped try to 'close the loop' on PRs when possible by engaging more continuously. In other words, while 'three days' is def a nice SLA to have, i would impress on the team that if it would only take a couple of minutes for a buddy to respond to something simply, that sort of responsiveness would be very appreciated and will help out the PR process enormously.

Thanks!

@CyrusNajmabadi
Copy link
Member

CyrusNajmabadi commented Apr 19, 2018

Another small niggle:

We would reject all current PRs older than six months, with apology.

You need to specify what you mean by "older than six months". Right now it seems like you're closing PRs that were created more than six months, even if there has been active work on them. Can you instead only close PRs that have not been touched in 6 months?

Thanks! :)

@jnm2
Copy link
Contributor

jnm2 commented Apr 19, 2018

We would reject all current PRs older than six months, with apology. This does us no credit, but it is a realization that the codebase will have changed significantly with regard to many (but not all) of the PRs involved. Those PRs would be rejected “without prejudice,” meaning that the owner could re-submit them against the current state of the code, updating the PR as appropriate.

Rather than re-submitting, why not suggest that the author reopens the PR after resolving merge conflicts and pushing?

@CyrusNajmabadi
Copy link
Member

Rather than re-submitting, why not suggest that the author reopens the PR after resolving merge conflicts and pushing?

Note: i cannot seem to reopen my PRs that were closed. This is also problematic as i do not want to open new PRs which would then not have all the comments and discussion from the original. I've pinged jared to reopen the ones that are still relevant.

@jnm2
Copy link
Contributor

jnm2 commented Apr 19, 2018

I didn't know this. From GitHub support via isaacs/github#361:

We're blocking the pull request reopen if the current head isn't a descendant of the stored head sha (which is what the head was when the pull request was closed). We are not allowing the reopen in that case, because there is no good way to tell what changes have happened while a pull request was closed and the head branch has changed.

So the PR author should first reopen, then force push.

@CyrusNajmabadi
Copy link
Member

CyrusNajmabadi commented Apr 19, 2018

That's really unfortunate. I'd been working on this PR #10873 with Neal, and had just done more work on it in the last week based on the latest pattern changes he'd made. Needing to reopen, which would lose all the discussion and whatnot is not great.

@CyrusNajmabadi
Copy link
Member

I think a far more appropriate and gentle approach would have been to give some warning on this. If these steps were very destabalizing for the contributor, it's not very nice to give zero warning and then have it foisted on them. Even giving a week with a message of "we're going to close this unless you pull this forward" would have been appreciated. That would be esp. good given that the SLA is still in first-draft form and could easily change in the near term.

I would ask that you please consider that in the future when enacting any policies that can have this sort of impact on a contribitor's work. Just a little heads up goes a long way and can prevent unnecessarily painful situations, without really costing the team anything.

Thanks!

@CyrusNajmabadi
Copy link
Member

@CyrusNajmabadi All you have to to is put your new work in a temp branch, force push d9a8bf3 back onto the PR branch, reopen the PR, and force push the new work from the temp branch to the PR branch.

Ah, ok. I didn't realize it could go back to the original PR. I thought a new PR was required. That's much more palatable.

@gafter
Copy link
Member

gafter commented Apr 25, 2018

We have a problem right now that we don't always do a good job of assigning incoming issues to a team, and under the proposed plan such issues would fall through the cracks. See currently uncategorized issues HERE.

@CyrusNajmabadi
Copy link
Member

Another important issue: it's often difficult to get the requisite two reviews. Leading to the situation where a PR is reviewed by one person at one time, but then may go weeks until reviewed by another. Then based on what the second reviews lead to, changes may be necessary that restart the entire process with both reviewers.

I'm not sure what can be done about this, but generally having the team be open to reserving some amount of time to do reviews would be valuable. Note: this was a particular pain even when i was on the team. It was often difficult to find people who could give any time toward a review. That doesn't seem particularly healthy as it leads to only an extremely slow dribble of improvements to the codebase.

@MattGertz
Copy link
Contributor Author

Update: yesterday Jinu and I walked through the remaining PRs to be assigned out to buddies. We are getting very close to having that step completed.

@CyrusNajmabadi Sorry for the late response. Great point. For simple changes, I'm open to that being a call by the buddy without the second review to improve velocity -- lemme just check with my leads to be sure that no one has a concern about that. For bigger stuff (i.e., pretty much anything you personally write :-)) I'd still want the second review, though. (Lotsa code comments helps make that go faster...)

@Neme12
Copy link
Contributor

Neme12 commented Oct 1, 2018

Are there any updates on this?

@jinujoseph
Copy link
Contributor

@Neme12 what update are you looking for. We are currently following this progress, where community PR are assigned out to one of the buddies and we work together to get it in on time.
Pls bring forward if there are any concerns.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-External Contributor Pain The issue impedes progress for project collaborators. Discussion
Projects
None yet
Development

No branches or pull requests

7 participants