Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faster multiple file staging #629

Merged
merged 6 commits into from
Jan 12, 2020

Conversation

lapfelix
Copy link
Contributor

Staging/unstaging/discarding multiple files takes a long time. Seems to be O(N) where N is multiplied by 1-2 seconds, depending on the repo.

I fixed it by only calling -[GCRepository writeRepositoryIndex:error:] once when doing staging/discarding operations on multiple files by adding methods for these operations that take multiple files instead of one.

I tested the changes in a big repository (with thousands of commits, ~50 branches and ~20-30 total submodules) and the time it takes to stage/unstage 19 files went from ~25.5 seconds to ~2.5 seconds. The more files are modified, the bigger the performance increase is.

I AGREE TO THE GITUP CONTRIBUTOR LICENSE AGREEMENT

@lucasderraugh
Copy link
Collaborator

Absolutely wonderful! I’m out for Christmas but I will gladly review this when I’m back on the 27th. 🍻

Copy link
Collaborator

@lucasderraugh lucasderraugh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good to me. A few points to address but overall I'm excited about this change 😃.

GitUpKit/Extensions/GCRepository+Index.m Outdated Show resolved Hide resolved
GitUpKit/Extensions/GCRepository+Index.m Outdated Show resolved Hide resolved
GitUpKit/Extensions/GCRepository+Index.m Outdated Show resolved Hide resolved
GitUpKit/Utilities/GIViewController+Utilities.h Outdated Show resolved Hide resolved
GitUpKit/Views/GIAdvancedCommitViewController.m Outdated Show resolved Hide resolved
GitUpKit/Views/GIAdvancedCommitViewController.m Outdated Show resolved Hide resolved
Copy link
Collaborator

@lucasderraugh lucasderraugh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, meant to request changes.

@lapfelix lapfelix requested a review from lucasderraugh January 2, 2020 17:17
Copy link
Collaborator

@lucasderraugh lucasderraugh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! I'll give it a spin for performance soon and then merge it in.

Copy link
Collaborator

@lucasderraugh lucasderraugh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, just one last thing to fix. I think we need to be freeing that malloced region before this function returns and it doesn't have to be + 1 count worth of space.

GitUpKit/Core/GCIndex.m Outdated Show resolved Hide resolved
The selectedFiles array would always be empty because the `else` should've been outside the `if (submodule)`
@lapfelix lapfelix requested a review from lucasderraugh January 7, 2020 05:18
@lapfelix
Copy link
Contributor Author

lapfelix commented Jan 7, 2020

I've pushed the requested change and fixed a bug in the code that calls checkoutFilesToWorkingDirectory (where it never sent the multiple selected file delta paths) and added comments to make it easier to understand for the next person reading it.

@lucasderraugh lucasderraugh merged commit 3f5abaf into git-up:master Jan 12, 2020
@lucasderraugh
Copy link
Collaborator

The speed boost is fantastic. Thanks for this!

@lucasderraugh
Copy link
Collaborator

@lapfelix I ended up reverting this change unfortunately. There are some major issues with discarding selected files that need resolution.

@lapfelix
Copy link
Contributor Author

Oh no! Do you have any steps to reproduce the issue?

@lapfelix
Copy link
Contributor Author

Okay I see the linked issue

@lucasderraugh
Copy link
Collaborator

Ya, my repro steps were just selecting a few files and hitting the delete key to discard. I had cases where it would not notify about the change, but I also noticed that sometimes it would discard things I didn't select, so it had to be reverted for that reason.


NSError* error;
if (![self discardAllChangesForFiles:selectedFiles resetIndex:NO error:&error]) {
[self.repository notifyWorkingDirectoryChanged];
Copy link
Collaborator

@lucasderraugh lucasderraugh Jan 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this was the issue around why it didn't notify when discarding files. This code would only notify on failures basically. The other issue where it would discard things I didn't select was the most concerning however and I'm not sure what the cause of that was.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh yeah the notifyWorkingDirectoryChanged and makeFirstResponder lines definitely shouldn't be in that if.

I'll review the rest of the code and see if I can reproduce that otgher issue where more things were discarded than necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really can't reproduce the other issue (with these lines placed at the right place). I wonder if it might be related to your other issue where the wrong files were discarded though. I'll revert my fix and see if I can make it happen like this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okaaaaayyyy I've reproduced it. It's bad.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the issue is in the call to the libgit2 function git_checkout_index. Right before it's called, the repo is fine (the right files have been deleted) but it seems like it resets too many files.

@lucasderraugh
Copy link
Collaborator

Ya, perhaps it’s a bug on their end? We’ll have to update libgit at some point.

@lapfelix
Copy link
Contributor Author

@lucasderraugh Hmmmm I have a fix which is to remove the GIT_CHECKOUT_DISABLE_PATHSPEC_MATCH option in checkoutFilesToWorkingDirectory: but I'll have to figure out why it was there in the first place

@lapfelix
Copy link
Contributor Author

@lucasderraugh I've opened a PR with the changes I'm testing: #639

@lucasderraugh
Copy link
Collaborator

Definitely want to know what that flag does. Thanks for looking into it though!

@lapfelix
Copy link
Contributor Author

One thing we could do to work around this apparent (?) bug is remove the flag and escape the paths we pass in so file names aren't recognized as match patterns. I've tried it and it works. I'll try it with filenames that might get recognized as match patterns before committing this change.

@lucasderraugh
Copy link
Collaborator

I'll have to look up what this match pattern stuff means 😊. One other concern I have that I'd like to test out is what happens on failure? Previously just 1 file was skipped over, but in this case does every file fail to get merged? I'm thinking of the discard case where perhaps you have submodules and you try to discard them and a few other files but one thing fails so the entire discard fails.

@lapfelix
Copy link
Contributor Author

@lucasderraugh Submodules are still handled individually so if there's an error discarding a submodule, it will happen before we batch-process the files so we keep the same behaviour in this specific file. It will abort once it hits an error discarding a submodule.

When discarding changes in multiple files (deleting them then doing a checkout), if there's an error when deleting the file (not sure how often this happens, but it could), it will stop deleting files and will abort, present an error and stay where it was (which means the files that were already successfully deleted will stay deleted). So in that case, we won't do the checkout for these already deleted files.

If it successfully deletes the files and then fails when doing the batch-checkout, I'm not sure what will happen. I guess it will fail for all of them, yes. I haven't encountered that yet.

lapfelix added a commit to lapfelix/GitUp that referenced this pull request Jan 27, 2020
simpzan pushed a commit to simpzan/GitUp that referenced this pull request Oct 22, 2020
* Improve performance when staging/unstaging multiple files at once

* Optimize multiple files discarding

* Run ClangFormat

* Changes following PR feedback

* Fix bug caused by bad merge and add comments

The selectedFiles array would always be empty because the `else` should've been outside the `if (submodule)`

* Correctly allocate and free multiple path strings
simpzan pushed a commit to simpzan/GitUp that referenced this pull request Oct 22, 2020
lucasderraugh pushed a commit that referenced this pull request Aug 15, 2022
* Revert "Revert "Faster multiple file staging (#629)""

683f6de

* Fix issue where repository wasn't notified of changes

* Remove `GIT_CHECKOUT_DISABLE_PATHSPEC_MATCH` as checkout strategy option

* Escape paths to prevent them from being recognized as match patterns
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants