Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposed change to coding guidelines: squashing commits #660

Closed
mr-c opened this issue Nov 16, 2014 · 23 comments
Closed

Proposed change to coding guidelines: squashing commits #660

mr-c opened this issue Nov 16, 2014 · 23 comments
Assignees
Milestone

Comments

@mr-c
Copy link
Contributor

mr-c commented Nov 16, 2014

During the bulk of the development of a pull request I think the commits should fly fast and furious. However I would like to see them squashed down to a handful of commits (perhaps 1) during the refinement phase prior to merging.

Thoughts?

@ctb
Copy link
Member

ctb commented Nov 16, 2014

Ref http://blogs.atlassian.com/2013/10/git-team-workflows-merge-or-rebase/. In the context of khmer, I think the second sentence under "Decisions, decisions, decisions" is an important consideration:

 If you and your team are not familiar with, or don’t understand the intricacies of rebase, then you    
 probably shouldn’t use it. In this context, always merge is the safest option.

@mr-c
Copy link
Contributor Author

mr-c commented Nov 17, 2014

To clarify: we would still do merges to incorporate a pull request (thus documenting what happened). What I'm advocating for is "rebase as cleanup" prior to the merge.

"When a feature branch’s development is complete, rebase/squash all the work down to the minimum number of meaningful commits" and then do a merge using the button in GitHub

@ctb
Copy link
Member

ctb commented Nov 19, 2014

I'm still not sure how all the moving parts interact in a pull request. Run through, tutorial, documentation?

I'm pretty worried about raising the bar on contributors and adding additional requirements; it's pretty discouraging to receive the message "you did everything right, but here's one more thing..." for the umpteenth time.

@luizirber
Copy link
Member

I like the idea of maintaining history clear, but squashing and rebasing is not trivial.

Might be a good resource to link on the docs: http://robots.thoughtbot.com/git-interactive-rebase-squash-amend-rewriting-history

@camillescott
Copy link
Member

I also think rebasing is a bad idea in general; for example, yesterday, because of the rebase and squash on the seqan branch, @mr-c and I sat there for an hour trying to fix the multiprocessing branch which had merged in seqan before the rebase. Rebasing is not only confusing (let alone for beginners), it has the potential to seriously compromise integrity. It also can silently drop changes if the rebaser makes a mistake during the interactive process.

Instead, I think we need to just encourage better commit policy, or deal with the potential for many small commits. I can see no real reason other than pedantry why a less than perfect commit history is a problem anyway.

@camillescott
Copy link
Member

That is to say: not only do I think we shouldn't require squashing / rebasing, I think we should discourage any rebasing on all publicly facing branches.

@ctb
Copy link
Member

ctb commented Nov 22, 2014

Any more discussion? At this point I'm leaning towards requesting clear commit messages and that's it.

@ctb
Copy link
Member

ctb commented Jan 30, 2015

I've also now dealt with several merge conflicts caused by rebasing and squashing on branches in PRs. It's a waste of my time and a potential source of bugs when playing around with other people's branches, and I think it needs to be strongly discouraged.

@mr-c
Copy link
Contributor Author

mr-c commented Jan 31, 2015

Maybe we all need a rebasing, merging, and cherry picking primer? People
should be able to do what they want with their own branches.

On Fri, Jan 30, 2015, 13:11 C. Titus Brown notifications@github.com wrote:

I've also now dealt with several merge conflicts caused by rebasing and
squashing on branches in PRs. It's a waste of my time and a potential
source of bugs when playing around with other people's branches, and I
think it needs to be strongly discouraged.


Reply to this email directly or view it on GitHub
#660 (comment).

@ctb
Copy link
Member

ctb commented Jan 31, 2015

On Sat, Jan 31, 2015 at 12:23:21AM -0800, Michael R. Crusoe wrote:

Maybe we all need a rebasing, merging, and cherry picking primer? People
should be able to do what they want with their own branches.

As soon as a branch is merged into another branch (as happened with HLL
and some of the SeqAn stuff) squashing and rebasing seems to cause problems.

On Fri, Jan 30, 2015, 13:11 C. Titus Brown notifications@github.com wrote:

I've also now dealt with several merge conflicts caused by rebasing and
squashing on branches in PRs. It's a waste of my time and a potential
source of bugs when playing around with other people's branches, and I
think it needs to be strongly discouraged.

???
Reply to this email directly or view it on GitHub
#660 (comment).


Reply to this email directly or view it on GitHub:

#660 (comment)

C. Titus Brown, ctbrown@ucdavis.edu

@ctb
Copy link
Member

ctb commented Feb 15, 2015

@wking
Copy link

wking commented Mar 27, 2015

@ctb just called this issue out on @swc's discuss@ list, so I thought I'd share another interesting point. In it's GitHub chapter, the v2 Pro Git book has a section (“Pull Requests as Patches”, just under here) that claims the rebase-or-not choice tends to break along mailing-list-based vs. PR-based projects. Personally, I like rebased branches, because I want to review the whole feature branch from scratch before merging it, and I don't want to be distracted by dead-end work and other waffling that happend while the feature branch was being discussed (but then, I came up through list-based projects). But if your reviewers are following along as a branch develops, starting the review again after each round of edits is going to be more work.

The other main impact is if you're using git blame … frequently to help understand why a certain line is the way it is. If feature brances are rebased into near-perfect, semantic commits with commit messages that summarize the discussion surrounding the change, the commits that blame finds you have a good change of being informative. On the other hand, if your first pass at a feature is as ugly as most of mine are and you don't do some history polishing on your branch, blame is more likely to land you on a commit with a message like “fix the frobnitz again (for real this time)”.

On the difficulty front, the “Rebase without tears” seemed a bit alarmist ;). git rebase --abort and the reflog have been enough for me to bail out of any sticky situations, and git rebase --interactive origin/master (or wherever you want the new branch to be based) is pretty good about holding your hand through the whole process. Of course, --interactive doesn't play well with --preserve-merges, so I usually don't bother rebasing back before the most recent merge in a branch. So it's not crazy difficult, but it's not easy-street either. I usually play this on a case-by-case basis, and try to stretch my contributors to make a clean, polished feature branch (because that's what I like). If I feel they'll end up frustrated before we as polished as I like, I usually either merge whatever we have, or offer to step in and polish things up for them. If I do any polishing, I push my polished branch and my steps for the original contributor to review (so they can see how to get past whatever was sticking them), and then merge that if they're comfortable with my reroll.

So in rebase vs. not before merging, I'd say rebasing gets you a shorter history that's easier to review after the fact and has a useful blame, while not rebasing gets you a twistier history that's easy to write and easy to review if you've been paying attention the whole time.

@mr-c
Copy link
Contributor Author

mr-c commented Mar 30, 2015

@wking Thank you for sharing, I am +100 on this perspective but still unsure as to how to enshrine into policy.

@wking
Copy link

wking commented Mar 30, 2015

On Mon, Mar 30, 2015 at 04:56:37AM -0700, Michael R. Crusoe wrote:

I am +100 on this perspective but still unsure as to how to enshrine
into policy.

If the maintainers feel like they follow all the in-flight PRs as they
develop, and won't need to go back and re-review previously submitted
but unmerged commits, then:

Please don't reroll your pull requests, because we're following them
as we develop. We don't want to look through those changes again,
so please make new commits for any changes to your original
submission.

If the maintainers (like me) want to re-review a rerolled feature
branch from scratch, then you can use something like Git's suggestions
1 to get something like this for the end of the current docs 2:

After submitting a pull request, the usual worflow is:

  1. Get feedback on your submission (usually in comments or pull
    requests against your branch).

  2. Polish and refine your branch based on that feedback (notes on
    how to do this here) and force push your branch to
    update the pull request. Go back to step 1.

    At some point in that cycle,

  3. The maintainers decide that the patch series is good enough, and
    they merge it.

One other feature of a rerolly workflow is that you're more likely to
notice when you've bitten off too much and should split your pull
request into smaller pieces. For example, #788 had a little detour
into SeqAn headers that looks like it could have stood alone as it's
own pull request (the opaque pointer business anyway). Decomposing
complicated pull requests into a series of small, independent pull
requests makes review easier by limiting the changes a reviewer has to
focus on at once. Once the SeqAn-header PR landed, you'd rebase the
rest of #788 on top of the new master and get a smaller set of changes
to review there. If it was still too large a PR for efficient review,
you could split off another small PR. This works for almost all
cases, with the exception being major rearchitecting, where there are
lots of changes but no easy way to split them up into subsets.

@ctb
Copy link
Member

ctb commented Mar 31, 2015

On Mon, Mar 30, 2015 at 04:56:37AM -0700, Michael R. Crusoe wrote:

@wking Thank you for sharing, I am +100 on this perspective but still unsure as to how to enshrine into policy.

I'm -1 on requiring it of new contributors - I don't want to add to what we
force new contributors to do, as it will simply confuse them.

I'm currently -0 on allowing it on open pull requests. It has been the cause
of several merge conflicts when I've tried to update in-progress branches from
@luizirber and @mr-c that have already been merged once into my own branches.
This has also caused trouble for @camillescott and @mr-c, too.

I honestly don't see a strong reason to complicating our lives.

@wking
Copy link

wking commented Mar 31, 2015

On Tue, Mar 31, 2015 at 03:41:03AM -0700, C. Titus Brown wrote:

I'm -1 on requiring it of new contributors - I don't want to add to
what we force new contributors to do, as it will simply confuse
them.

I agree, which is why you want a “good enough” safety valve (step 3 in
1).

I'm currently -0 on allowing it on open pull requests. It has
been the cause of several merge conflicts when I've tried to update
in-progress branches from @luizirber and @mr-c that have already
been merged once into my own branches.

I think the solution here is more clarity on when portions of unmerged
PRs are being consumed by other branches. When you merge someone
else's open PR into your own branch, let them know so they don't
rebase before what you merged (or so they can tell you “don't do that,
I'm planning on rebasing the history you just merged into your
branch”). For more details, see the “don't pee in the pool” section
of Justin Hileman's “How to Git pretty” 2.

I honestly don't see a strong reason to complicating our lives.

As you gradually train your contributors to polish and reroll their
pull requests before the initial review rounds, your life as a
reviewer should get easier (and less complicated). But there is
certainly an initial training cost to that training. I think
encouraging rerolls with an easy opt-out gives you the best of both
worlds: polished PRs when the maintainer and submitter want to go
through that process, and raw PRs when the maintainer and submitter
don't feel like polishing.

 through
 https://presentate.com/bobthecow/talks/changing-history#slide-54

@ctb
Copy link
Member

ctb commented May 8, 2015

Thought about it a lot. Tentative decisions:

  • on open PRs, rebasing is OK when done by the author;
  • however, people incorporating someone else's PR into their own branch may request that no more rebases should be performed for a time, and this should be honored in the main;
  • @mr-c and others may request or suggest that a rebase be done when a branch is ready to be merged into master, but the requester is responsible for doing the rebase.

@wking
Copy link

wking commented May 8, 2015

On Fri, May 08, 2015 at 10:09:44AM -0700, C. Titus Brown wrote:

Thought about it a lot. Tentative decisions…

These sound good to me. My only suggestion would be to drop “but the
requester is responsible for doing the rebase”, since it doesn't
really matter who does the rebase. I'd leave it up to the
requester/suggester to nominate someone on a per-request/suggestion
basis. For example:

This needs to be rebased onto the current master to resolve
conflicts with #123. Let me know when you've done that, or if you
need any help doing it.

or:

This could be rebased and squashed into more compact pull request,
but if you don't feel like handling that let me know and I'll just
merge it as-is.

or:

This could be rebased and squashed into more compact pull request.
I've done that in tk/foobar-frobnitz. Let me know if that branch
looks good to you and I'll merge it.

In each case, I think it's fairly clear who is expected to do (or has
already done) the rebase. And in my own work I've had times when each
approach seemed like the best fit for that particular
branch/submitter.

@ctb
Copy link
Member

ctb commented May 8, 2015

Suggested language is good, thanks - I basically want to avoid clubbing our contributors into yet more technically demanding requirements for a merge!

@mr-c
Copy link
Contributor Author

mr-c commented May 8, 2015

+1000. If people are incorporating another's PR then heads up should be given so the that source PR can rebase/squash prior.

Thank you @wking for all the feedback!

@ctb
Copy link
Member

ctb commented May 11, 2015

On Fri, May 08, 2015 at 01:27:17PM -0700, Michael R. Crusoe wrote:

+1000. If people are incorporating another's PR then heads up should be given so the that source PR can rebase/squash prior.

This isn't necessarily how research works :).

+1 on "should" and -1 on "must".

@ctb
Copy link
Member

ctb commented May 19, 2015

See #1013 for a good method of squashing commits!

@mr-c mr-c added this to the 2.0 milestone Jul 30, 2015
@mr-c mr-c self-assigned this Jul 30, 2015
@mr-c
Copy link
Contributor Author

mr-c commented Aug 11, 2015

Fixed in #1222

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants