fscache: add not-found directory cache to fscache #994
e6e29d2 to d32841b
Ooh, this should be a pretty decent performance improvement when using sparse. 😁
@@ -6,8 +6,83 @@
static int initialized;
static volatile long enabled;
static struct hashmap map;
static struct hashmap map_nfd; /* not found directories */
static struct nfd_entry *nfd_alloc(const char *name, size_t namelen, unsigned int hash)
{
	struct nfd_entry *nfd = xcalloc(1, sizeof(struct nfd_entry) + namelen + 1);
@dscho I pulled your fixup and added one more.
@jeffhostetler okay, good. Please note that the Other than that, I think the only thing we may want to consider is to try our hand at a test that verifies somehow that non-existing directories are not accessed more than once. We could introduce a new What do you think?
Brilliant. The more optional tracing we enable, the easier debugging will be in the future.
@dscho Yeah. I'll re-title the commit and look at adding the tracing. I'll try to fix this up next week. Thanks!
Well, we should not overdo it, in particular in performance-critical code such as FSCache... 😄
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
1f78c19 to 8544ec3
In this version, I added a simple GIT_TRACE_FSCACHE key to log each time FindFirst fails.
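For readers unfamiliar with environment-gated tracing, here is a minimal standalone sketch of the idea. It is not the code from this PR and does not use Git's trace API. The helper names (`trace_value_enables`, `trace_fscache_enabled`, `trace_fscache`) and the simplified on/off rule are assumptions made for illustration. Only the `GIT_TRACE_FSCACHE` variable name comes from the comment above.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumed rule for this sketch: unset, empty, "0" and "false" mean "off". */
static int trace_value_enables(const char *value)
{
	return value && *value &&
	       strcmp(value, "0") != 0 && strcmp(value, "false") != 0;
}

/* Consult the environment once, then reuse the answer on every call. */
static int trace_fscache_enabled(void)
{
	static int cached = -1; /* -1: environment not consulted yet */

	if (cached < 0)
		cached = trace_value_enables(getenv("GIT_TRACE_FSCACHE"));
	return cached;
}

/* Write one trace line to stderr. Returns 1 if written, 0 if tracing is off. */
static int trace_fscache(const char *fmt, ...)
{
	va_list ap;

	assert(fmt != NULL);
	if (!trace_fscache_enabled())
		return 0;
	va_start(ap, fmt);
	fputs("fscache: ", stderr);
	vfprintf(stderr, fmt, ap);
	fputc('\n', stderr);
	va_end(ap);
	return 1;
}
```

A caller would emit something like `trace_fscache("FindFirst failed for '%s'", dir)` on the failure path only. Caching the environment lookup keeps the disabled case down to a single integer test, which matters in performance-critical code.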
Excellent! I added a test that I merged, too (10b99b6).
Performance [was enhanced when using fscache in a massively sparse checkout](git-for-windows/git#994). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
fscache: add not-found directory cache to fscache
Teach FSCACHE to remember "not found" directories.
This is a performance optimization.

FSCACHE is a performance optimization available on Windows. It
intercepts POSIX-style lstat() calls and serves them from an
in-memory directory cache populated with FindFirst/FindNext. On the
first lstat() call in a directory, it uses FindFirst/FindNext to read
the list of files (and their attribute data) for the entire directory
into the cache; subsequent lstat() calls in the same directory are
answered from the cache. This gives a major performance boost on
Windows.

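The caching scheme described above can be sketched in portable C. This is a minimal model, not the actual FSCACHE code: `scan_directory` stands in for a FindFirst/FindNext loop, `fake_fs` stands in for the disk, and every name and the fixed-size table are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the on-disk state: (directory, file) pairs. */
static const char *const fake_fs[][2] = {
	{ "src", "a.c" }, { "src", "b.c" }, { "doc", "readme.txt" },
};
#define FAKE_FS_LEN (sizeof(fake_fs) / sizeof(fake_fs[0]))
#define MAX_ENTRIES 16

/* Cached entries point into fake_fs, so no copies are needed. */
static const char *const *cache[MAX_ENTRIES];
static size_t cache_len;
static int scan_count; /* how many times a directory was enumerated */

/*
 * Stand-in for a FindFirst/FindNext loop: load every entry of `dir`
 * into the cache in one pass. Returns the number of entries found;
 * 0 means the directory does not exist in this model.
 */
static size_t scan_directory(const char *dir)
{
	size_t i, found = 0;

	scan_count++;
	for (i = 0; i < FAKE_FS_LEN; i++) {
		if (strcmp(fake_fs[i][0], dir) == 0) {
			assert(cache_len < MAX_ENTRIES);
			cache[cache_len++] = fake_fs[i];
			found++;
		}
	}
	return found;
}

/* Is any entry of `dir` already present in the cache? */
static int dir_is_cached(const char *dir)
{
	size_t i;

	for (i = 0; i < cache_len; i++)
		if (strcmp(cache[i][0], dir) == 0)
			return 1;
	return 0;
}

/* Returns 0 if dir/file exists, -1 otherwise (think ENOENT). */
static int cached_lstat(const char *dir, const char *file)
{
	size_t i;

	/* Only the first lookup in a directory pays for the enumeration. */
	if (!dir_is_cached(dir) && scan_directory(dir) == 0)
		return -1;
	for (i = 0; i < cache_len; i++)
		if (strcmp(cache[i][0], dir) == 0 &&
		    strcmp(cache[i][1], file) == 0)
			return 0;
	return -1;
}
```

In this model, any number of `cached_lstat("src", ...)` calls costs a single enumeration, including lookups for files that do not exist in `src`. A failed enumeration leaves nothing behind in the cache.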
However, it does not remember "not found" directories. When STATUS
runs and there are missing directories, the lstat() interception
fails to find the parent directory and simply returns ENOENT for the
file; it does not remember that the FindFirst on the directory
failed. Thus each subsequent lstat() call in the same directory
re-attempts the FindFirst. This completely defeats any performance
gains.

This can be seen by doing a sparse-checkout on a large repo, then
doing a read-tree to reset the skip-worktree bits, and then running
status.

This change reduced status times for my very large repo by 60%.
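
The "not found" cache this commit message describes can be illustrated with a small standalone sketch. It is not the code from this PR, and it makes several assumptions: a linked list replaces the hashmap, plain `calloc` replaces `xcalloc`, `simulated_find_first` stands in for FindFirst, and the struct layout and all function names are hypothetical. The single allocation for struct plus name follows the `nfd_alloc` line in the diff above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical entry type; the name is stored inline after the struct. */
struct nfd_entry {
	struct nfd_entry *next;
	size_t namelen;
	char name[]; /* flexible array member, NUL-terminated */
};

static struct nfd_entry *nfd_list; /* linked list standing in for the hashmap */
static int scan_count;             /* simulated FindFirst attempts */

/* Has this directory already been recorded as missing? */
static int nfd_contains(const char *name)
{
	const struct nfd_entry *e;

	for (e = nfd_list; e; e = e->next)
		if (strcmp(e->name, name) == 0)
			return 1;
	return 0;
}

/* Record a missing directory. */
static void nfd_add(const char *name)
{
	size_t namelen = strlen(name);
	/* One allocation holds both the struct and the name. */
	struct nfd_entry *e = calloc(1, sizeof(*e) + namelen + 1);

	assert(e); /* sketch only: real code needs an out-of-memory policy */
	e->namelen = namelen;
	memcpy(e->name, name, namelen); /* calloc already supplied the NUL */
	e->next = nfd_list;
	nfd_list = e;
}

/* Stand-in for FindFirst: in this model only "src" exists. */
static int simulated_find_first(const char *dir)
{
	scan_count++;
	return strcmp(dir, "src") == 0 ? 0 : -1;
}

/* Returns 0 if the directory can be enumerated, -1 if it is missing. */
static int open_dir_cached(const char *dir)
{
	if (nfd_contains(dir))
		return -1; /* known missing: skip the file system entirely */
	if (simulated_find_first(dir) < 0) {
		nfd_add(dir); /* remember the failure for later lookups */
		return -1;
	}
	return 0;
}
```

With this in place, repeated lookups under a missing directory such as `gone` trigger one simulated FindFirst instead of one per call. A counter like `scan_count` is also one possible way to check in a test that a non-existing directory is not accessed more than once.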