Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal for an explicit GitHub workflow. #29

Merged
merged 26 commits into from
Jul 1, 2020
Merged
Show file tree
Hide file tree
Changes from 24 commits
Commits
Show all changes
26 commits
Select commit Hold shift + click to select a range
250eeee
Proposal for an explicit GitHub workflow.
chandlerc May 27, 2020
b1ab93f
Addressing review comments.
chandlerc May 28, 2020
9df8f00
Merge branch 'master' into flow-linear
chandlerc May 30, 2020
f64d62c
Minor tweaks
chandlerc May 31, 2020
01491d1
Fix typo in CONTRIBUTING.md
chandlerc Jun 2, 2020
8e5e7e3
Extract workflow description and remove redundancies.
chandlerc Jun 2, 2020
e7f63a1
Merge branch 'master' into flow-linear
chandlerc Jun 3, 2020
4efe332
remove tracking issue template field
chandlerc Jun 3, 2020
71cb4a5
Update proposals/p0029.md with reviewer suggsetion.
chandlerc Jun 3, 2020
833e9b5
Continue to address review feedback.
chandlerc Jun 3, 2020
0a414f6
Apply suggestions from code review
chandlerc Jun 10, 2020
8075748
Incorporate code review feedback and begin moving toward "trunk" base…
chandlerc Jun 10, 2020
d07bc20
Merge branch 'master' into flow-linear
chandlerc Jun 10, 2020
2a9487d
Tweak the wording and make it a bit more consistent.
chandlerc Jun 10, 2020
934f708
Update docs/project/pull_request_workflow.md
chandlerc Jun 11, 2020
0bbed4a
Address more review comments.
chandlerc Jun 11, 2020
0fb8618
Correct the rationale.
chandlerc Jun 12, 2020
a5f6a2f
Merge branch 'master' into flow-linear
chandlerc Jun 17, 2020
e1ed39e
Tweak wording based on discussion in review.
chandlerc Jun 17, 2020
68c6adf
Improve the rationale around the default branch to avoid overstating or
chandlerc Jun 17, 2020
8d25246
Apply suggestions from code review
chandlerc Jun 20, 2020
a5dddbd
Clean up formatting and address a couple of comments on grammar from …
chandlerc Jun 20, 2020
10ce7a6
Merge branch 'master' into flow-linear
chandlerc Jun 20, 2020
f548bc1
Update docs/project/pull_request_workflow.md
chandlerc Jun 30, 2020
e9418e8
Merge branch 'master' into flow-linear
chandlerc Jul 1, 2020
8d3ff9a
Add to proposal list.
chandlerc Jul 1, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 7 additions & 0 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -245,6 +245,13 @@ Every file type uses a variation on the same license text ("Apache-2.0 WITH
LLVM-exception") with similar formatting. If you're not sure what text to use,
please ask on Discourse Forums.

## Workflow

Carbon repositories all follow a common
[pull-request workflow](docs/project/pull_request_workflow.md) for landing
changes. It is a trunk-based development model that emphasizes small,
incremental changes and preserves a simple linear history.

## Acknowledgements

Carbon's Contributing guidelines are based on
Expand Down
116 changes: 116 additions & 0 deletions docs/project/pull_request_workflow.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,116 @@
# Trunk-based pull-request GitHub workflow

<!--
Part of the Carbon Language project, under the Apache License v2.0 with LLVM
Exceptions. See /LICENSE for license information.
SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-->

Carbon repositories follow a few basic principles:

- Development directly on the `trunk` branch and
[revert to green](#green-tests).
- Always use pull requests, rather than pushing directly.
- Changes should be small, incremental, and review-optimized.
- Preserve linear history by
[rebasing](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-request-merges#rebase-and-merge-your-pull-request-commits)
or
[squashing](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-request-merges#squash-and-merge-your-pull-request-commits)
pull requests rather than using unsquashed merge commits.

These principles try to optimize for several different uses or activities with
version control:

- Continuous integration and bisection to identify failures and revert to green.
- Code review both at the time of commit and follow-up review after commit.
- Understanding how things evolve over time, which can manifest in different
ways:
- When were things introduced?
- How does the main branch and project evolve over time?
- How was a bug or surprising thing introduced?

Note that this isn't a complete guide to doing code reviews, and just focuses on
the mechanical workflow and branch management. TODO: Add an explicit link to
more detailed guidance on managing pull request based code reviews when it is
developed.

## Trunk based development

We work in a simple
[trunk-based development](https://trunkbaseddevelopment.com/) model. This means
all development activity takes place on a single common `trunk` branch in the
repository (our default branch). We focus on
[small, incremental changes](#small_incremental_changes) rather than feature
branches or the "scaled" variations of this workflow.

### Green tests

The `trunk` branch should always stay "green". That means that if tests fail or
if we discover bugs or errors, we revert to a "green" state by default, where
the failure or bug is no longer present. Fixing forward is fine if that will be
comparably fast and efficient. The goal isn't to dogmatically avoid fixing
forward, but to prioritize getting back to green quickly. We hope to eventually
tool this through automatic continuous-integration powered submit queues, but
even those can fail and the principle remains.

## Always use pull requests (with review) rather than pushing directly

We want to ensure that changes to Carbon are always reviewed, and the simplest
way to do this is to consistently follow a pull request workflow. Even if the
change seems trivial, still go through a pull request -- it'll likely be trivial
to review. Always wait for someone else to review your pull request rather than
just merging it, even if you have permission to do so.

Our GitHub repos are configured to require pull requests and review before they
are merged, so this rule is enforced automatically.

## Small, incremental changes

Developing in small, incremental changes improves code review time, continuous
integration, and bisection. This means we typically squash pull requests into a
single commit when landing. We use two fundamental guides for deciding how to
split up pull requests:

1. Ensure that each pull request builds and passes any tests cleanly when you
request review and when it lands. This will ensure bisection and continuous
integration can effectively process them.

2. Without violating the first point, try to get each pull request to be "just
right": not too big, not too small. You don't want to separate a pattern of
tightly related changes into separate requests when they're easier to review
as a set or batch, and you don't want to bundle unrelated changes together.
Typically you should try to keep the pull request as small as you can without
breaking apart tightly coupled changes. However, listen to your code reviewer
if they ask to split things up or combine them.

While the default is to squash pull requests into a single commit, _during_ the
review you typically want to leave the development history undisturbed until the
end so that comments on any particular increment aren't lost. We typically use
the GitHub squash-and-merge functionality to land things.

### Managing pull requests with multiple commits
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this section confusing, and I'm afraid I don't have a concrete suggestion for a change. I think the issue may just be that I'm having trouble mapping the English text in these two paragraphs into a series of concrete actions. If splitting up a single pull request into separate commits via rebasing is common, then perhaps a link to an external explanation is all it would take.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've not found a nice, simple explanation to link to. I've reworded it a bit to help, but I'm still thinking about simpler ways to explain, if it is even worthwhile....


Sometimes, it will make sense to _land_ a series of separate commits for a
single pull request through rebasing. This can happen when there is important
overarching context that should feed into the review, but the changes can be
usefully decomposed when landing them. When following this model, each commit
you intend to end up on the `trunk` branch needs to follow the same fundamental
rules as the pull request above: they should each build and pass tests when
landed in order, and they should have well written, cohesive commit messages.

Prior to landing the pull request, you are expected to rebase it to produce this
final commit sequence, either interactively or not. This kind of rebase rewrites
the history in Git, which can make it hard to track the resolution of code
review comments. Typically, only do this as a cleanup step when the review has
finished, or when it won't otherwise disrupt code review. It is healthy and
expected to add "addressing review comments" commits during the review and then
squashing them away before the pull request is merged.

## Linear history

We want the history of the `trunk` branch of each repository to be as simple and
easy to understand as possible. While Git has strong support for managing
complex history and merge patterns, we find understanding and reasoning about
the history -- especially for humans -- to be at least somewhat simplified by
sticking to a linear progression. As a consequence, we either squash pull
requests or rebase them when merging them.
157 changes: 157 additions & 0 deletions proposals/p0029.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,157 @@
# Linear, rebase, and pull-request GitHub workflow

<!--
Part of the Carbon Language project, under the Apache License v2.0 with LLVM
Exceptions. See /LICENSE for license information.
SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-->

[Pull request](https://github.com/carbon-language/carbon-lang/pull/29)

## Problem

There are a variety of workflows that can be effectively used with both Git
version control and GitHub. We should have a common and consistent workflow
across as much of Carbon as possible. While some specific areas may end up
needing specialized flows to suit their needs, these should be the exceptions
and can be handled on a case-by-case basis when they arise.

## Background

- Chapter 16 "Version Control and Branch Management" in the SWE book
(_[Software Engineering at Google](https://www.amazon.com/Software-Engineering-Google-Lessons-Programming/dp/1492082791)_)
- [Trunk Based Development](https://trunkbaseddevelopment.com/)
- GitHub documentation on
"[pull requests](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests)"
- [Configuration of their merges](https://help.github.com/en/github/administering-a-repository/configuring-pull-request-merges)
- [Protected branches](https://help.github.com/en/github/administering-a-repository/about-protected-branches)

## Proposal

Carbon repositories should all follow a common
[workflow](/docs/project/pull_request_workflow.md) for managing and landing
changes. The document outlines the specifics of the proposed workflow as well as
the core motivation.

### Rename the default git branch to `trunk`

This achieves two goals:

1. Replaces the term `master`. This term, while only used in isolation in Git,
[was used](https://mail.gnome.org/archives/desktop-devel-list/2019-May/msg00066.html)
in immediately preceding and related systems as part of extremely problematic
"master/slave" terminology. That background associates the term with
unacceptable historical and cultural meanings. The intent of those using or
adopting the term isn't relevant to this association. The less overtly
problematic term being isolated from the rest doesn't erase its history, and
doesn't completely avoid painful associations.

2. It directly anchors and reinforces contributors on the trunk-based workflow.

### Longer discussion of linear history

While Git can effectively bisect through complex history graphs, this is
significantly harder than bisecting across linear history. Especially for any
part of Carbon involving code, we expect bisection to be a pervasive tool and we
can make it simpler and more effective by forcing a linear history.

A linear history also makes it much easier to ask questions about whether a
particular change has landed yet, or when a bug was introduced. For some people,
releases are more simply understood as branching from a specific snapshot of the
linear history. While tools like `git log` can provide similar functionality, it
is less trivially understood.

Continuous integration is simplified for many of the same reasons as bisection:
the set of potential deltas is reduced to a linear sequence. Reverting to green
becomes easier to understand, and testing each incremental commit has a single
obvious interpretation.

Requiring linear history also incentivises incremental development that is
committed early to the main branch. This, in essence, ensures a single source of
truth model even with the distributed version control system. Because
works-in-progress are required to be rebased, they tend to merge early and often
rather than forming long-lived development branches. This helps reduce the total
merge resolution and testing costs as a project scales. For more details about
the advantages of using a single source of truth, see the full text of the
"Version Control and Branch Management" chapter in the SWE book.

One concern with linear history when rebasing a sequence of changes and merging
them is that the pull request associated with that sequence might not be obvious
from the main branch commit log. However, there is enough information in the
repository to establish the relationship, and GitHub's UI surfaces the pull
request on each commit in the series.

## Alternatives considered

### LLVM model

LLVM allows directly pushing/submitting to their "trunk" with post-commit
review. LLVM enforces linear history for day-to-day development. Merge commits
are allowed for rare events like contributing a new subproject.

Pros:

- Still has linear history.
- Incentivizes squashing for continuous integration and bisection.
- Very low overhead for fixing trivial mistakes.

Cons:

- Creates extremely bad incentives around code review.
- Lots of patches don't get pre-commit review, even if they would benefit from
it.
- Very experienced contributors are much better at avoiding pre-commit review,
so are rarely blocked waiting on review.
- Leads to the most experienced members of the community not doing enough code
reviews, or being timely enough in code reviews.
- Lots of patches submitted with post-commit review are never reviewed in
practice unless they break something.
- UI and basic support for code reviews entirely focused on pull requests.

### Fork and merge model

Classically, Git and GitHub support merging entire complex graphs of
development.

Pros:

- Mostly supported by pull requests, so still able to use much of that
functionality.
- Supports a model in which contributors do not communicate and can each develop
a local, decentralized fork while still achieving overall reconciliation.
- Can model much more complex history of code evolution faithfully in the
tooling.
- Most of the time these aren't so complex that they create problems for
humans.

Cons:

- History is harder for humans to understand and reason about in complex cases.
- Bisection and continuous integration are more complex.
- May create difficulty for continuous integration against mainline, because
unclear what "order" they should be applied / explored. While there are
technical approaches to address this, the don't seem to eliminate the
complexity, merely provide a clear set of mechanics for handling it.
- Reduces incentives to land small, incremental changes by allowing
non-linearity to reduce the effort required for large and/or complex merges.
- Makes review of the main branch's history harder due to non-linearity.

### Fork and merge, but branches can only have linear history

Imagine a fork and merge model, but PRs can only have linear history. That is,
branching and merging within a PR, or merging `trunk` into PR is not allowed. In
this model, the only merge commits are merges of PRs into `trunk`. PRs
themselves don’t contain merge commits.

Pros:

- Mostly supported by GitHub pull requests, so still able to use much of that
functionality.
- Restricts non-linearity of history. The only non-linearity that is left is
merge commits on the trunk branch. PRs themselves can’t contain merge commits.

Cons:

- Requires a custom presubmit on GitHub that checks linearity of a PR.
- The cons of the fork and merge strategy regarding the complexity of history
remain, but to a smaller degree, since non-linearity is restricted.