Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

checkout.c: use fscache stat only in usable place #1443

Closed
wants to merge 1 commit into from

Conversation

atetubou
Copy link

This is fix for #1438, #1442
This patch introduces explicitly enabled fscache_lstat and
fix the issue stat unintendedly returned from cache.

I replace lstat with fscache_lstat in check_path called for every file in a repository.

When we run git checkout, following two conditions are hold.

  • check_path is only called to get stat of file in working tree before checked out.
  • check_path is not called for the file after it is checked out.

So cached stat when directory traversing does not affect returned value in check_path here.

Signed-off-by: Takuto Ikuta tikuta@chromium.org

@@ -359,7 +359,7 @@ static int checkout_paths(const struct checkout_opts *opts,
state.istate = &the_index;

enable_delayed_checkout(&state);
enable_fscache(1);
enable_explicit_fscache(1);

This comment was marked as off-topic.

This comment was marked as off-topic.

@atetubou atetubou force-pushed the fix_checkout branch 2 times, most recently from c8c9e76 to 8e46ee3 Compare January 21, 2018 14:02
@@ -22,6 +22,7 @@
#include "resolve-undo.h"
#include "submodule-config.h"
#include "submodule.h"
#include "fscache.h"

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

@@ -395,7 +395,7 @@ static int check_path(const char *path, int len, struct stat *st, int skiplen)
errno = ENOENT;
return -1;
}
return lstat(path, st);
return fscache_lstat(path, st);

This comment was marked as off-topic.

This comment was marked as off-topic.

@atetubou atetubou force-pushed the fix_checkout branch 3 times, most recently from 2a76b6a to c7d9fcf Compare January 21, 2018 14:53
This is fix for git-for-windows#1438, git-for-windows#1442
This patch introduces explicitly enabled fscache_lstat and
fix the issue stat unintendedly returned from cache.

I replace lstat with fscache_lstat in check_path called for every file in a repository.

When we run `git checkout`, following two conditions are hold.
* check_path is only called to get stat of file in working tree before checked out.
* check_path is not called for the file after it is checked out.

So cached stat when directory traversing does not affect returned value in check_path here.

Signed-off-by: Takuto Ikuta <tikuta@chromium.org>
@orgads
Copy link

orgads commented Jan 21, 2018

👍. Thanks!

@dscho
Copy link
Member

dscho commented Jan 22, 2018

Can't we introduce a fscache_flush() to repair this? I am wary of littering the code with fscache references...

@dscho
Copy link
Member

dscho commented Jan 22, 2018

I opted for the safe (if slow) way of reverting the original patch instead, with some information in the commit message and marked up as a fixup! commit. The test suite is still running, but once it passes, I will release Git for Windows v2.16.1.

dscho added a commit that referenced this pull request Jan 22, 2018
The recently introduced speedup of `git checkout` sadly introduced a
regression: after writing a file, the cached stat data is obviously
incorrect, yet we still used it!

This patch simply reverts the original speedup (as opposed to try to
hotfix it as #1443 tries to
do), and marks it as a fixup commit so that the next merging-rebase will
drop the patch.

We will most likely want to speed up this code path, though, but may
find a better way, e.g. by invalidating the cached data cleverly. (This
will be a *little* tricky, though, as readdir() uses the data, and we
must ensure that we do not pull it under readdir()'s rag when
invalidating it, and we also must avoid re-caching for every written
file, as that would be *even* slower than fscache=false.)

This fixes #1438 and
#1442

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@dscho dscho added this to the v2.16.0(3) milestone Jan 22, 2018
@dscho
Copy link
Member

dscho commented Jan 22, 2018

As announced a couple of minutes ago, I decided to go for a revert instead.

I am still very much in favor of pursuing a speedup via fscache in this code path, though!

@dscho dscho closed this Jan 22, 2018
@atetubou
Copy link
Author

Thank you for revert.

After breaking this largely used software, I'm afraid implicit usage of cached result.
It is easy to find slow/too many called lstat and confirm that lstat is called to know the file stat before git command run.
But I could easily fail to notice code paths that can be called after writing disk in duration fscache enabled. If such code paths can be detected completely, inserting fscache_flush() after that will work.

So I think explicit usage is safer and solid now.

@orgads
Copy link

orgads commented Jan 22, 2018

I just noticed that with this fix git rebase with autostash fails. @dscho: It's good that you preferred to revert. @atetubou: Thanks for the initial patch and this fix. Take your time to improve it. I'll be really glad to use it when it works.

@atetubou
Copy link
Author

@orgads git rebase with autostash calls git reset --hard internally.
Such failure is fixed by this revert e76e5c4

If you see the issue even with e76e5c4 and this PR, I missed something yet.

@orgads
Copy link

orgads commented Jan 23, 2018

I didn't have this patch applied. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants