Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve the performance of pull-request merging #641

Closed
wants to merge 2 commits into from

Conversation

typeless
Copy link
Contributor

@typeless typeless commented Jan 11, 2017

Merging pull-request without checking out a working tree

Inspired by this.

For #601

@lunny lunny added this to the 1.1.0 milestone Jan 11, 2017
@lunny lunny added the type/enhancement An improvement of existing functionality label Jan 11, 2017
@appleboy
Copy link
Member

Lint error, please fix lint error from the following command.

make lint

@tboerger tboerger added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Jan 11, 2017
@typeless
Copy link
Contributor Author

@appleboy I have fixed it in the last commit and tested with make lint for my local repo.

@appleboy
Copy link
Member

@typeless Where is your last commit? Still fail: https://drone.gitea.io/go-gitea/gitea/1072

@typeless
Copy link
Contributor Author

@appleboy 0e58fa9

Am I missing something?

@appleboy
Copy link
Member

@typeless I don't know why the drone show error on this commit. Please recommit to test it again.

@typeless
Copy link
Contributor Author

@appleboy Alright

@Bwko
Copy link
Member

Bwko commented Jan 11, 2017

@typeless Do you have any benchmarks? I'm using the latest master(without this PR) and merging a repo of 1.7gig succeeds in about 5 secs 😯

@typeless
Copy link
Contributor Author

typeless commented Jan 12, 2017

@Bwko For the record, the large repo that troubled me is about 13G the size of .git.

$ time git clone ../git-repositories/foo/my-large-repo.git/
Cloning into 'my-large-repo'...
done.
Checking out files: 100% (530225/530225), done.

real    5m58.914s
user    1m23.368s
sys     0m31.164s

The large repo is in fact an Android board support package which is a mixture of the Google official code and a vendor-tailored distribution.

And for Linux kernel, which is about 2G (bare repo)

$ time git clone gitea-repositories/foo/linux.git
Cloning into 'linux'...
done.
Checking out files: 100% (56233/56233), done.

real    0m6.242s
user    0m4.680s
sys     0m1.216s

Very close to what you have benchmarked.

The both clones were done within the same filesystem, by the way.

Edit:

For downloading the official android platform code:

$ curl https://storage.googleapis.com/git-repo-downloads/repo > ~/bin/repo
$ chmod a+x ~/bin/repo

Then use the repo tool to download the code

$ mkdir large-repo
$ cd large-repo
$ repo init -u https://android.googlesource.com/platform/manifest -b android-4.0.1_r1
$ repo sync -j32 

The -j32 is similar to make -j, assuming your machine has 32 cores.
Be aware that this is a bandwidth killer. :trollface:

It'd better to kill the .repo and the sparsely populated .git directories before pushing to Gitea.

$ rm -rf .repo
$ find -name .git -type d -exec rm -rf {} +

References: https://source.android.com/source/downloading.html

@Bwko
Copy link
Member

Bwko commented Jan 12, 2017

@typeless That looks really promising 👍
Found a bug, on your forked repo click new file, fill that file with contents use the Create a new branch for this commit and start a pull request. option.
An then create a PR, merging gives the following error:

[...routers/repo/pull.go:420 MergePullRequest()] [E] Merge: git commit-tree a9405d5ce9265ff2b5647b1f77ade6f44b6c419e -p master -p refs/pull/5/head: 
*** Please tell me who you are.

Run

  git config --global user.email "you@example.com"
  git config --global user.name "Your Name"

to set your account's default identity.
Omit --global to set the identity only in this repository.

@typeless
Copy link
Contributor Author

Hasn't been tested yet :trollface:

@bkcsoft bkcsoft added the pr/wip This PR is not ready for review label Jan 21, 2017
@Bwko
Copy link
Member

Bwko commented Jan 25, 2017

@typeless If you repeat the steps from my previous comment you'll get: fatal: invalid date format: RFC9999

@typeless
Copy link
Contributor Author

@Bwko Thanks for your help.
I don't know if there is a better way but I would like to add an automated testing facility like how #741 does. I don't like to re-create those fake repos and PRs from the ground up again and again 😆

@lunny
Copy link
Member

lunny commented Feb 11, 2017

conflicted

@lunny
Copy link
Member

lunny commented Feb 14, 2017

@typeless any update?

@typeless typeless force-pushed the speedup-pr-merge branch 3 times, most recently from 4703f84 to cd29b89 Compare February 21, 2017 09:04
@typeless
Copy link
Contributor Author

typeless commented Feb 21, 2017

@Bwko @metalmatze
Rebased and manually resolved conflicts. PTAL.

P.S. The relavent commits: bf647ce and d100615

@lunny
Copy link
Member

lunny commented Feb 21, 2017

It's ready for review now?

@typeless
Copy link
Contributor Author

@lunny I'd like to wait for #741 getting merged so that I can add an end-to-end test for this. For now, I just want to ping the related authors about what the PR does to their commits.

models/pull.go Outdated
fmt.Sprintf("PullRequest.Merge (git write-tree): %s", baseRepoPath),
[]string{"GIT_DIR=" + baseRepoPath, "GIT_INDEX_FILE=" + indexTmpPath},
"git", "write-tree"); err != nil {
return fmt.Errorf("git write-tree: %s", stderr)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

err also should be include the error string.

@lunny
Copy link
Member

lunny commented Mar 6, 2017

So let's move this to v1.2 since we have no more time to review this one after #741 merged.

@lunny lunny modified the milestones: 1.2.0, 1.1.0 Mar 6, 2017
@appleboy
Copy link
Member

@typeless Great work. build success. https://drone.gitea.io/go-gitea/gitea/3579

@lunny
Copy link
Member

lunny commented Jun 16, 2017

Any performance tests?

@typeless
Copy link
Contributor Author

@lunny I was considering about it. But for showing any significance we need to prepare a large repo in the integration framework. I have no solid idea yet about how it should be added.

@lunny
Copy link
Member

lunny commented Jun 16, 2017

maybe clone linux repo on the tests?

@typeless
Copy link
Contributor Author

@lunny maybe we can use the migration feature to clone a Linux repository for testing. But I am afraid of getting banned if we suck the bandwidth of the server too heavily/frequently. 😄

@lafriks
Copy link
Member

lafriks commented Jun 16, 2017

I don't think performance tests should be done in drone

@sapk
Copy link
Member

sapk commented Jun 16, 2017

We could still have a benchmark suite at least to validate any performance PR. Those should not be run on drone even more if they are network/cpu/memory intensive.

@typeless typeless changed the title Improve the performance of pull-request merging [WIP] Improve the performance of pull-request merging Jun 19, 2017
@typeless typeless changed the title [WIP] Improve the performance of pull-request merging Improve the performance of pull-request merging Jun 19, 2017
@typeless typeless force-pushed the speedup-pr-merge branch 2 times, most recently from 7051e60 to 7f92154 Compare June 23, 2017 08:51
@beddari
Copy link

beddari commented Sep 15, 2017

Any status updates?

@codecov-io
Copy link

codecov-io commented Sep 18, 2017

Codecov Report

Merging #641 into master will decrease coverage by 0.06%.
The diff coverage is 0%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #641      +/-   ##
==========================================
- Coverage   27.32%   27.25%   -0.07%     
==========================================
  Files          86       86              
  Lines       17135    17177      +42     
==========================================
  Hits         4682     4682              
- Misses      11775    11817      +42     
  Partials      678      678
Impacted Files Coverage Δ
models/pull.go 19.53% <0%> (-1.05%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ca68a75...c5babc2. Read the comment docs.

@lunny
Copy link
Member

lunny commented Sep 18, 2017

@typeless Is it possible to add a performance test?

@typeless
Copy link
Contributor Author

I was thinking about it but not sure about how it should be done in an idiomatic way. Also the merge operation is stateful and not easily repeatable in a typical for i:=0; i<b.N; i++ {} loop. Any ideas? Maybe I am missing something.

@lunny
Copy link
Member

lunny commented Sep 19, 2017

models/pull.go Outdated
"GIT_AUTHOR_EMAIL=" + headCommit.Author.Email,
"GIT_AUTHOR_DATE=" + headCommit.Author.When.Format("Mon, 02 Jan 2006 15:04:05 -0700"),
"GIT_COMMITTER_NAME=" + doer.NewGitSig().Name,
"GIT_COMMITTER_EMAIL=" + doer.NewGitSig().Email,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this hide the Email if the commiter has that set in their config?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@typeless
Copy link
Contributor Author

A benchmark has been added.

The old way of PR merging has to checkout the working tree of the repository.
This change make it possible to do the merge by manipulating the index
only.
"git", "clone", baseGitRepo.Path, tmpBasePath); err != nil {
return fmt.Errorf("git clone: %s", stderr)
// A temporary Git working tree for git-merge-index and git-merge-one-file.
workTreeTmpPath, err := ioutil.TempDir("", "gitea-merge-")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is not there missing +pr.BaseRepo.Name+"-"+strconv.Itoa(time.Now().Nanosecond())?

Copy link
Contributor Author

@typeless typeless Sep 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ioutil.TempDir will generate a suffix automatically. See https://golang.org/src/io/ioutil/tempfile.go?s=2138:2195#L66
Sorry, I see what you mean now. I'll fix it later.

Copy link
Contributor Author

@typeless typeless Sep 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is not there missing +pr.BaseRepo.Name+"-"+strconv.Itoa(time.Now().Nanosecond())?

This is actually correct despite a bit of inconsistent. I will clean this up a little bit.

The path indexTmpPath and workTreeTmpPath conrespond to GIT_INDEX_FILE and GIT_WORK_TREE respectively. I looked at this in a hurry last friday and got confused as well. 😆

@typeless
Copy link
Contributor Author

typeless commented Oct 5, 2017

This is actually not a comprehensive approach that covers all cases where git-merge can handle. So I think the change doesn't warrant the risks. The change should be made into the git tool or delegated to `libgit2' or src-d/go-git.

@typeless typeless closed this Oct 5, 2017
@typeless typeless deleted the speedup-pr-merge branch April 3, 2019 05:01
@go-gitea go-gitea locked and limited conversation to collaborators Nov 24, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. type/enhancement An improvement of existing functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.