Improve performance of fscache #1926

benpeart · 2018-11-12T20:18:05Z

This patch series makes a set of changes to the fscache that underlies git's lstat() and opendir(). Besides adding tracing, it reduces thread contention in the fscache and takes advantage of the new mem_pool heap manager. The net result is a ~25% reduction in the time spent in preload_index() compared with the old fscache.

compat/win32/fscache.c

+ * splitting up the cache entries across multiple threads so there isn't
+ * any overlap between threads anyway.
+ */
+struct fscache {


dscho · 2018-11-14T15:57:20Z

@benpeart have you seen these failures?

 expecting success: 

	git init fscache-test &&
	cd fscache-test &&
	git config core.fscache 1 &&
	echo A > test.txt &&
	git add test.txt &&
	git commit -m A &&
	echo B >> test.txt &&
	git checkout . &&
	test -z "$(git status -s)" &&
	echo A > expect.txt &&
	test_cmp expect.txt test.txt &&
	cd .. &&
	rm -rf fscache-test

++ git init fscache-test
Initialized empty Git repository in D:/a/1/s/t/trash directory.t7201-co/fscache-test/.git/
++ cd fscache-test
++ git config core.fscache 1
++ echo A
++ git add test.txt
++ git commit -m A
[master (root-commit) 13452ee] A
 Author: A U Thor <author@example.com>
 1 file changed, 1 insertion(+)
 create mode 100644 test.txt
++ echo B
++ git checkout .
./test-lib.sh: line 707:  5544 Segmentation fault      git checkout .
error: last command exited with $?=139

I also see a build failure e.g. with linux-gcc:

2018-11-12T21:56:01.5215485Z preload-index.c: In function ‘preload_index’:
2018-11-12T21:56:01.5219128Z preload-index.c:88:30: error: expected expression before ‘;’ token
2018-11-12T21:56:01.5219628Z   fscache = getcache_fscache();

dscho

If you don't beat me to it, I will investigate further why the mem_pool is already discarded when we try to use it to allocate memory.

git-compat-util.h

compat/win32/fscache.c

dscho

Two more things, to let Linux32 build the code. The failure in the Windows job is a bogus one: t3305's scratch directory could not be removed, for whatever reason. The tests passed, though.

mem-pool.c

Add tracing around initializing and discarding mempools. On discard, report the amount of unused memory in the current block to help tune the initial_size setting.

Signed-off-by: Ben Peart <benpeart@microsoft.com>

Add cache hit/miss statistics to the fscache for lstat() and opendir(). The statistics are printed when the cache is disabled and cleared, and only if GIT_TRACE_FSCACHE is set.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
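To illustrate the kind of bookkeeping this commit message describes, here is a minimal sketch of hit/miss counters. The struct, the function names, and the output format are hypothetical and not taken from the patch. The sketch also only checks whether GIT_TRACE_FSCACHE is set, which is a simplification of how Git's trace keys interpret their values.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical counters; the names are illustrative, not from the patch. */
struct cache_stats {
	unsigned long lstat_requests;
	unsigned long lstat_misses;
	unsigned long opendir_requests;
	unsigned long opendir_misses;
};

/* Record one lstat() lookup; `hit` is nonzero when the cache served it. */
static void record_lstat(struct cache_stats *s, int hit)
{
	s->lstat_requests++;
	if (!hit)
		s->lstat_misses++;
}

/* Record one opendir() lookup; `hit` is nonzero when the cache served it. */
static void record_opendir(struct cache_stats *s, int hit)
{
	s->opendir_requests++;
	if (!hit)
		s->opendir_misses++;
}

/* Format a one-line summary into buf; returns what snprintf() returns. */
static int format_stats(const struct cache_stats *s, char *buf, size_t len)
{
	return snprintf(buf, len,
			"lstat %lu/%lu, opendir %lu/%lu (misses/requests)",
			s->lstat_misses, s->lstat_requests,
			s->opendir_misses, s->opendir_requests);
}

/*
 * Print the summary to stderr, but only when the environment variable is
 * set (assumption: presence alone enables tracing in this sketch).
 */
static void report_stats(const struct cache_stats *s)
{
	char buf[128];

	if (!getenv("GIT_TRACE_FSCACHE"))
		return;
	format_stats(s, buf, sizeof(buf));
	fprintf(stderr, "fscache: %s\n", buf);
}
```

Calling `report_stats()` at the point where the cache is disabled and cleared would match the timing described above.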

Update enable_fscache() to take an optional initial size parameter, which is used to pre-size the hashmap so that it does not have to rehash as additional entries are added. Add a separate disable_fscache() macro to make the code clearer and easier to read.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
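A rough sketch of why an initial size hint helps: if the table is allocated large enough for the expected number of entries up front, inserts never cross the growth threshold and no rehash is triggered. The 80% load factor, the minimum of 64 buckets, and all names below are assumptions made for illustration; they are not read from Git's hashmap implementation.

```c
#include <stddef.h>
#include <stdlib.h>

#define LOAD_FACTOR_PERCENT 80 /* assumed growth threshold, for illustration */
#define MIN_BUCKETS 64         /* arbitrary minimum table size */

/*
 * Smallest power-of-two bucket count that holds `expected` entries without
 * exceeding the load factor. Overflow for huge values is not handled here.
 */
static size_t buckets_for(size_t expected)
{
	size_t n = MIN_BUCKETS;

	while (n * LOAD_FACTOR_PERCENT / 100 < expected)
		n <<= 1;
	return n;
}

/* Hypothetical table; only the sizing logic matters for this sketch. */
struct table {
	void **buckets;
	size_t nr_buckets;
	size_t nr_entries;
};

/* Allocate the bucket array once, sized for the expected entry count. */
static int table_init(struct table *t, size_t expected)
{
	t->nr_buckets = buckets_for(expected);
	t->nr_entries = 0;
	t->buckets = calloc(t->nr_buckets, sizeof(*t->buckets));
	return t->buckets ? 0 : -1;
}

static void table_release(struct table *t)
{
	free(t->buckets);
	t->buckets = NULL;
	t->nr_buckets = t->nr_entries = 0;
}
```

With these assumed constants, a hint of roughly 200K entries (the repo size mentioned elsewhere in this PR) yields one allocation of 262144 buckets instead of a series of doublings and rehashes.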

The threading model for fscache has been to have a single, global cache. This puts requirements on it to be thread safe so that callers like preload-index can call it from multiple threads. This was implemented with a single mutex and completion events, which introduce contention between the calling threads.

Simplify the threading model by making fscache thread specific. This allows us to remove the global mutex and synchronization events entirely and instead associate a fscache with every thread that requests one. This works well with the current multi-threading, which divides the cache entries into blocks with a separate thread processing each block.

At the end of each worker thread, if there is a fscache on the primary thread, merge the cached results from the worker into the primary thread cache. This enables us to reuse the cache later, especially when scanning for untracked files.

In testing, this reduced the time spent in preload_index() by about 25% and also reduced the CPU utilization significantly. On a repo with ~200K files, it reduced overall status times by ~12%.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
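The per-thread model described in this commit message can be sketched as follows. This is a simplified illustration using POSIX threads rather than the Windows primitives in compat/win32/fscache.c. The cache here is just a growable array of integers, and the mutex that guards the merge step is an assumption of the sketch, as are all names.

```c
#include <pthread.h>
#include <stdlib.h>

#define NR_WORKERS 4
#define ENTRIES_PER_WORKER 1000

/* Stand-in for a cache: a growable array (hypothetical, for illustration). */
struct cache {
	int *entries;
	size_t nr, alloc;
};

static void cache_add(struct cache *c, int value)
{
	if (c->nr == c->alloc) {
		size_t alloc = c->alloc ? 2 * c->alloc : 64;
		int *p = realloc(c->entries, alloc * sizeof(*p));

		if (!p)
			abort();
		c->entries = p;
		c->alloc = alloc;
	}
	c->entries[c->nr++] = value;
}

/* Cache owned by the primary thread; workers only touch it when merging. */
static struct cache primary;
static pthread_mutex_t merge_lock = PTHREAD_MUTEX_INITIALIZER;

struct work {
	int first;
	int count;
};

static void *worker(void *arg)
{
	struct work *w = arg;
	struct cache local = { NULL, 0, 0 };
	size_t j;
	int i;

	/* Fill the thread-private cache; no locking is needed here. */
	for (i = 0; i < w->count; i++)
		cache_add(&local, w->first + i);

	/* One short critical section per worker, at the very end. */
	pthread_mutex_lock(&merge_lock);
	for (j = 0; j < local.nr; j++)
		cache_add(&primary, local.entries[j]);
	pthread_mutex_unlock(&merge_lock);

	free(local.entries);
	return NULL;
}

/* Split the work into blocks, one per thread; return the merged size. */
static size_t run_workers(void)
{
	pthread_t threads[NR_WORKERS];
	struct work work[NR_WORKERS];
	int i;

	for (i = 0; i < NR_WORKERS; i++) {
		work[i].first = i * ENTRIES_PER_WORKER;
		work[i].count = ENTRIES_PER_WORKER;
		if (pthread_create(&threads[i], NULL, worker, &work[i]))
			abort();
	}
	for (i = 0; i < NR_WORKERS; i++)
		pthread_join(threads[i], NULL);
	return primary.nr;
}
```

The point of the design is that lookups and inserts during the parallel phase never contend; synchronization is confined to a single merge per worker.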

Now that the fscache is single threaded, take advantage of the mem_pool as the allocator to significantly reduce the cost of allocations and frees. With the reduced cost of free, in future patches, we can start freeing the fscache at the end of commands instead of just leaking it.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
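For readers unfamiliar with pool allocators, the following sketch shows the general technique: allocations bump a cursor inside a large block, and everything is released in one pass on discard. It is not Git's mem_pool implementation; the struct layout and function names are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* One large block; allocations are carved out of `space` sequentially. */
struct pool_block {
	struct pool_block *next;
	size_t used;       /* bytes already handed out */
	size_t size;       /* capacity of `space` in bytes */
	uintmax_t space[]; /* payload, aligned for uintmax_t */
};

struct pool {
	struct pool_block *blocks; /* most recent block first */
	size_t block_size;         /* default capacity of a new block */
	size_t nr_blocks;
};

static void pool_init(struct pool *p, size_t block_size)
{
	p->blocks = NULL;
	p->block_size = block_size;
	p->nr_blocks = 0;
}

/* Bump-allocate `len` bytes; a new block is obtained only when needed. */
static void *pool_alloc(struct pool *p, size_t len)
{
	struct pool_block *b = p->blocks;
	void *ret;

	/* Round up so that every allocation stays aligned. */
	len = (len + sizeof(uintmax_t) - 1) & ~(sizeof(uintmax_t) - 1);

	if (!b || b->size - b->used < len) {
		size_t size = len > p->block_size ? len : p->block_size;

		b = malloc(sizeof(*b) + size);
		if (!b)
			return NULL;
		b->next = p->blocks;
		b->used = 0;
		b->size = size;
		p->blocks = b;
		p->nr_blocks++;
	}
	ret = (char *)b->space + b->used;
	b->used += len;
	return ret;
}

/* Free every block at once; individual allocations are never freed. */
static void pool_discard(struct pool *p)
{
	while (p->blocks) {
		struct pool_block *next = p->blocks->next;

		free(p->blocks);
		p->blocks = next;
	}
	p->nr_blocks = 0;
}
```

Because there is no per-allocation bookkeeping and no locking, this only suits a single-threaded owner, which is why the change follows the per-thread rework. Any pointer obtained from the pool is invalid once the pool has been discarded.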

The FSCache feature [was optimized to become faster](git-for-windows/git#1926). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>

siprbaum reviewed Nov 13, 2018


dscho requested changes Nov 14, 2018

git-compat-util.h (outdated)

compat/win32/fscache.c

dscho requested changes Nov 15, 2018

mem-pool.c (outdated)

mem-pool.c (outdated)

benpeart added 5 commits November 15, 2018 15:49

mem_pool: add GIT_TRACE_MEMPOOL support

a72ea66


fscache: add fscache hit statistics

9f6c1f2


dscho approved these changes Nov 16, 2018


dscho merged commit 52fc4db into git-for-windows:master Nov 16, 2018

dscho added this to the v2.19.1(2) milestone Nov 16, 2018

weekly-digest bot mentioned this pull request Nov 18, 2018

Weekly Digest (11 November, 2018 - 18 November, 2018) #1935

Closed

dscho added a commit to git-for-windows/build-extra that referenced this pull request Nov 20, 2018

Mention New Feature in release notes

6d0fb7f


weekly-digest bot mentioned this pull request Nov 25, 2018

Weekly Digest (18 November, 2018 - 25 November, 2018) #1950

Closed
