forked from git-for-windows/git
-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TEST PR: speed up index load through parallelization #20
Closed
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Test clean-up and corrections. * es/test-fixes: (26 commits) t5608: fix broken &&-chain t9119: fix broken &&-chains t9000-t9999: fix broken &&-chains t7000-t7999: fix broken &&-chains t6000-t6999: fix broken &&-chains t5000-t5999: fix broken &&-chains t4000-t4999: fix broken &&-chains t3030: fix broken &&-chains t3000-t3999: fix broken &&-chains t2000-t2999: fix broken &&-chains t1000-t1999: fix broken &&-chains t0000-t0999: fix broken &&-chains t9814: simplify convoluted check that command correctly errors out t9001: fix broken "invoke hook" test t7810: use test_expect_code() instead of hand-rolled comparison t7400: fix broken "submodule add/reconfigure --force" test t7201: drop pointless "exit 0" at end of subshell t6036: fix broken "merge fails but has appropriate contents" tests t5505: modernize and simplify hard-to-digest test t5406: use write_script() instead of birthing shell script manually ...
"git diff --color-moved" feature has further been tweaked. * sb/diff-color-move-more: diff.c: offer config option to control ws handling in move detection diff.c: add white space mode to move detection that allows indent changes diff.c: factor advance_or_nullify out of mark_color_as_moved diff.c: decouple white space treatment from move detection algorithm diff.c: add a blocks mode for moved code detection diff.c: adjust hash function signature to match hashmap expectation diff.c: do not pass diff options as keydata to hashmap t4015: avoid git as a pipe input xdiff/xdiffi.c: remove unneeded function declarations xdiff/xdiff.h: remove unused flags
"git checkout" and "git worktree add" learned to honor checkout.defaultRemote when auto-vivifying a local branch out of a remote tracking branch in a repository with multiple remotes that have tracking branches that share the same names. * ab/checkout-default-remote: checkout & worktree: introduce checkout.defaultRemote checkout: add advice for ambiguous "checkout <branch>" builtin/checkout.c: use "ret" variable for return checkout: pass the "num_matches" up to callers checkout.c: change "unique" member to "num_matches" checkout.c: introduce an *_INIT macro checkout.h: wrap the arguments to unique_tracking_name() checkout tests: index should be clean after dwim checkout
Code restructuring and a small fix to transport protocol v2 during fetching. * jt/fetch-pack-negotiator: fetch-pack: introduce negotiator API fetch-pack: move common check and marking together fetch-pack: make negotiation-related vars local fetch-pack: use ref adv. to prune "have" sent fetch-pack: directly end negotiation if ACK ready fetch-pack: clear marks before re-marking fetch-pack: split up everything_local()
Parsing of -L[<N>][,[<M>]] parameters "git blame" and "git log" take has been tweaked. * is/parsing-line-range: log: prevent error if line range ends past end of file blame: prevent error if range ends past end of file
lookup_commit_reference() and friends have been updated to find in-core object for a specific in-core repository instance. * sb/object-store-lookup: (32 commits) commit.c: allow lookup_commit_reference to handle arbitrary repositories commit.c: allow lookup_commit_reference_gently to handle arbitrary repositories tag.c: allow deref_tag to handle arbitrary repositories object.c: allow parse_object to handle arbitrary repositories object.c: allow parse_object_buffer to handle arbitrary repositories commit.c: allow get_cached_commit_buffer to handle arbitrary repositories commit.c: allow set_commit_buffer to handle arbitrary repositories commit.c: migrate the commit buffer to the parsed object store commit-slabs: remove realloc counter outside of slab struct commit.c: allow parse_commit_buffer to handle arbitrary repositories tag: allow parse_tag_buffer to handle arbitrary repositories tag: allow lookup_tag to handle arbitrary repositories commit: allow lookup_commit to handle arbitrary repositories tree: allow lookup_tree to handle arbitrary repositories blob: allow lookup_blob to handle arbitrary repositories object: allow lookup_object to handle arbitrary repositories object: allow object_as_type to handle arbitrary repositories tag: add repository argument to deref_tag tag: add repository argument to parse_tag_buffer tag: add repository argument to lookup_tag ...
Various glitches in the heuristics of merge-recursive strategy have been documented in new tests. * en/t6042-insane-merge-rename-testcases: t6042: add testcase covering long chains of rename conflicts t6042: add testcase covering rename/rename(2to1)/delete/delete conflict t6042: add testcase covering rename/add/delete conflict type
"git fetch" learned a new option "--negotiation-tip" to limit the set of commits it tells the other end as "have", to reduce wasted bandwidth and cycles, which would be helpful when the receiving repository has a lot of refs that have little to do with the history at the remote it is fetching from. * jt/fetch-nego-tip: fetch-pack: support negotiation tip whitelist
For a large tree, the index needs to hold many cache entries allocated on heap. These cache entries are now allocated out of a dedicated memory pool to amortize malloc(3) overhead. * jm/cache-entry-from-mem-pool: block alloc: add validations around cache_entry lifecyle block alloc: allocate cache entries from mem_pool mem-pool: fill out functionality mem-pool: add life cycle management functions mem-pool: only search head block for available space block alloc: add lifecycle APIs for cache_entry structs read-cache: teach make_cache_entry to take object_id read-cache: teach refresh_cache_entry to take istate
"git gc --auto" opens file descriptors for the packfiles before spawning "git repack/prune", which would upset Windows that does not want a process to work on a file that is open by another process. The issue has been worked around. * kg/gc-auto-windows-workaround: gc --auto: release pack files before auto packing
"git grep" learned the "--only-matching" option. * tb/grep-only-matching: grep.c: teach 'git grep --only-matching' grep.c: extract show_line_header()
"git rebase --rebase-merges" mode now handles octopus merges as well. * js/rebase-merge-octopus: rebase --rebase-merges: adjust man page for octopus support rebase --rebase-merges: add support for octopus merges merge: allow reading the merge commit message from a file
The recursive merge strategy did not properly ensure there was no change between HEAD and the index before performing its operation, which has been corrected. * en/dirty-merge-fixes: merge: fix misleading pre-merge check documentation merge-recursive: enforce rule that index matches head before merging t6044: add more testcases with staged changes before a merge is invoked merge-recursive: fix assumption that head tree being merged is HEAD merge-recursive: make sure when we say we abort that we actually abort t6044: add a testcase for index matching head, when head doesn't match HEAD t6044: verify that merges expected to abort actually abort index_has_changes(): avoid assuming operating on the_index read-cache.c: move index_has_changes() from merge.c
Tests to cover various conflicting cases have been added for merge-recursive. * en/t6036-merge-recursive-tests: t6036: add a failed conflict detection case: regular files, different modes t6036: add a failed conflict detection case with conflicting types t6036: add a failed conflict detection case with submodule add/add t6036: add a failed conflict detection case with submodule modify/modify t6036: add a failed conflict detection case with symlink add/add t6036: add a failed conflict detection case with symlink modify/modify
Tests to cover conflict cases that involve submodules have been added for merge-recursive. * en/t7405-recursive-submodule-conflicts: t7405: verify 'merge --abort' works after submodule/path conflicts t7405: add a directory/submodule conflict t7405: add a file/submodule conflict
"git rebase" started exporting GIT_DIR environment variable and exposing it to hook scripts when part of it got rewritten in C. Instead of matching the old scripted Porcelains' behaviour, compensate by also exporting GIT_WORK_TREE environment as well to lessen the damage. This can harm existing hooks that want to operate on different repository, but the current behaviour is already broken for them anyway. * bc/sequencer-export-work-tree-as-well: sequencer: pass absolute GIT_WORK_TREE to exec commands
"git send-email" when using in a batched mode that limits the number of messages sent in a single SMTP session lost the contents of the variable used to choose between tls/ssl, unable to send the second and later batches, which has been fixed. * jm/send-email-tls-auth-on-batch: send-email: fix tls AUTH when sending batch
Add a server-side knob to skip commits in exponential/fibbonacci stride in an attempt to cover wider swath of history with a smaller number of iterations, potentially accepting a larger packfile transfer, instead of going back one commit a time during common ancestor discovery during the "git fetch" transaction. * jt/fetch-negotiator-skipping: negotiator/skipping: skip commits during fetch
The lazy clone support had a few places where missing but promised objects were not correctly tolerated, which have been fixed. * jt/tags-to-promised-blobs-fix: tag: don't warn if target is missing but promised revision: tolerate promised targets of tags
Look for broken "&&" chains that are hidden in subshell, many of which have been found and corrected. * es/chain-lint-in-subshell: t/chainlint.sed: drop extra spaces from regex character class t/chainlint: add chainlint "specialized" test cases t/chainlint: add chainlint "complex" test cases t/chainlint: add chainlint "cuddled" test cases t/chainlint: add chainlint "loop" and "conditional" test cases t/chainlint: add chainlint "nested subshell" test cases t/chainlint: add chainlint "one-liner" test cases t/chainlint: add chainlint "whitespace" test cases t/chainlint: add chainlint "basic" test cases t/Makefile: add machinery to check correctness of chainlint.sed t/test-lib: teach --chain-lint to detect broken &&-chains in subshells
The singleton commit-graph in-core instance is made per in-core repository instance. * jt/commit-graph-per-object-store: commit-graph: add repo arg to graph readers commit-graph: store graph in struct object_store commit-graph: add free_commit_graph commit-graph: add missing forward declaration object-store: add missing include commit-graph: refactor preparing commit graph
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The option help text for the force-with-lease option to "git push" reads like this: $ git push -h 2>&1 | grep -e force-with-lease --force-with-lease[=<refname>:<expect>] which comes from having N_("refname>:<expect") as the argument help text in the source code, with an aparent lack of "<" and ">" at both ends. It turns out that parse-options machinery takes the whole string and encloses it inside a pair of "<>", to make it easier for majority cases that uses a single token placeholder. The help string was written in a funnily unbalanced way knowing that the end result would balance out, by somebody who forgot the presence of PARSE_OPT_LITERAL_ARGHELP, which is the escape hatch mechanism designed to help such a case. We just should use the official escape hatch instead. Because ":<expect>" part can be omitted to ask Git to guess, it may be more correct to spell it as "<refname>[:<expect>]", but that is not the focus of this topic. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Helped-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Don't translate the argument specification for --chmod; "+x" and "-x" are the literal strings that the commands accept. Separate alternatives using a pipe character instead of a slash, for consistency. Use the flag PARSE_OPT_LITERAL_ARGHELP to prevent parseopt from adding a pair of angular brackets around the argument help string, as that would wrongly indicate that users need to replace the literal strings with some kind of value. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Parseopt wraps arguments in a pair of angular brackets by default, signifying that the user needs to replace it with a value of the documented type. Remove the pairs from the option definitions to duplication and confusion. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Wrap both placeholders in the argument help string in angular brackets to signal that users needs replace them with some actual value. Use the flag PARSE_OPT_LITERAL_ARGHELP to prevent parseopt from adding another pair. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Wrap each part of the argument help string in angular brackets to show that users need to replace them with actual values. Do that explicitly to balance the pairs nicely in the code and avoid confusing casual readers. Add the flag PARSE_OPT_LITERAL_ARGHELP to keep parseopt from adding another pair. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Wrap the placeholders in the option help string for -w in pairs of angular brackets to document that users need to replace them with actual numbers. Use the flag PARSE_OPT_LITERAL_ARGHELP to prevent parseopt from adding another pair. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Parseopt wraps argument help strings in a pair of angular brackets by default, to tell users that they need to replace it with an actual value. This is useful in most cases, because most option arguments are indeed single values of a certain type. The option PARSE_OPT_LITERAL_ARGHELP needs to be used in option definitions with arguments that have multiple parts or are literal strings. Stop adding these angular brackets if special characters are present, as they indicate that we don't deal with a simple placeholder. This simplifies the code a bit and makes defining special options slightly easier. Remove the flag PARSE_OPT_LITERAL_ARGHELP in the cases where the new and more cautious handling suffices. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
want_color_fd() is designed to work only with standard output and error file descriptors and stores information about each descriptor in an array. However, it doesn't verify that the passed-in descriptor lives within that set, which, with a buggy caller, could lead to access or assignment outside the array bounds. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test fix. * sg/t5310-empty-input-fix: t5310-pack-bitmaps: fix bogus 'pack-objects to file can use bitmap' test
Improve built-in facility to catch broken &&-chain in the tests. * es/chain-lint-more: chainlint: add test of pathological case which triggered false positive chainlint: recognize multi-line quoted strings more robustly chainlint: let here-doc and multi-line string commence on same line chainlint: recognize multi-line $(...) when command cuddled with "$(" chainlint: match 'quoted' here-doc tags chainlint: match arbitrary here-docs tags rather than hard-coded names
The more library-ish parts of the codebase learned to work on the in-core index-state instance that is passed in by their callers, instead of always working on the singleton "the_index" instance. * nd/no-the-index: (24 commits) blame.c: remove implicit dependency on the_index apply.c: remove implicit dependency on the_index apply.c: make init_apply_state() take a struct repository apply.c: pass struct apply_state to more functions resolve-undo.c: use the right index instead of the_index archive-*.c: use the right repository archive.c: avoid access to the_index grep: use the right index instead of the_index attr: remove index from git_attr_set_direction() entry.c: use the right index instead of the_index submodule.c: use the right index instead of the_index pathspec.c: use the right index instead of the_index unpack-trees: avoid the_index in verify_absent() unpack-trees: convert clear_ce_flags* to avoid the_index unpack-trees: don't shadow global var the_index unpack-trees: add a note about path invalidation unpack-trees: remove 'extern' on function declaration ls-files: correct index argument to get_convert_attr_ascii() preload-index.c: use the right index instead of the_index dir.c: remove an implicit dependency on the_index in pathspec code ...
"git tbdiff" that lets us compare individual patches in two iterations of a topic has been rewritten and made into a built-in command. * js/range-diff: (21 commits) range-diff: use dim/bold cues to improve dual color mode range-diff: make --dual-color the default mode range-diff: left-pad patch numbers completion: support `git range-diff` range-diff: populate the man page range-diff --dual-color: skip white-space warnings range-diff: offer to dual-color the diffs diff: add an internal option to dual-color diffs of diffs color: add the meta color GIT_COLOR_REVERSE range-diff: use color for the commit pairs range-diff: add tests range-diff: do not show "function names" in hunk headers range-diff: adjust the output of the commit pairs range-diff: suppress the diff headers range-diff: indent the diffs just like tbdiff range-diff: right-trim commit messages range-diff: also show the diff between patches range-diff: improve the order of the shown commits range-diff: first rudimentary implementation Introduce `range-diff` to compare iterations of a topic branch ...
"git pull --rebase -v" in a repository with a submodule barfed as an intermediate process did not understand what "-v(erbose)" flag meant, which has been fixed. * sb/pull-rebase-submodule: git-submodule.sh: accept verbose flag in cmd_update to be non-quiet
Test fix. * js/chain-lint-attrfix: chainlint: fix for core.autocrlf=true
Doc updates. * jh/partial-clone-doc: partial-clone: render design doc using asciidoc
A test prerequisite defined by various test scripts with slightly different semantics has been consolidated into a single copy and made into a lazily defined one. * wc/make-funnynames-shared-lazy-prereq: t: factor out FUNNYNAMES as shared lazy prereq
After a partial clone, repeated fetches from promisor remote would have accumulated many packfiles marked with .promisor bit without getting them coalesced into fewer packfiles, hurting performance. "git repack" now learned to repack them. * jt/repack-promisor-packs: repack: repack promisor objects if -a or -A is set repack: refactor setup of pack-objects cmd
Code hygiene improvement for the header files. * en/incl-forward-decl: Remove forward declaration of an enum compat/precompose_utf8.h: use more common include guard style urlmatch.h: fix include guard Move definition of enum branch_track from cache.h to branch.h alloc: make allocate_alloc_state and clear_alloc_state more consistent Add missing includes and forward declarations
Test updates. * ab/submodule-relative-url-tests: submodule: add more exhaustive up-path testing
Recent update to "git config" broke updating variable in a subsection, which has been corrected. * sb/config-write-fix: git-config: document accidental multi-line setting in deprecated syntax config: fix case sensitive subsection names on writing t1300: document current behavior of setting options
When "git rebase -i" is told to squash two or more commits into one, it labeled the log message for each commit with its number. It correctly called the first one "1st commit", but the next one was "commit #1", which was off-by-one. This has been corrected. * pw/rebase-i-squash-number-fix: rebase -i: fix numbering in squash message
"git rebase -i", when a 'merge <branch>' insn in its todo list fails, segfaulted, which has been (minimally) corrected. * pw/rebase-i-merge-segv-fix: rebase -i: fix SIGSEGV when 'merge <branch>' fails t3430: add conflicting commit
A few preliminary minor clean-ups in the area around submodules. * sb/submodule-cleanup: builtin/submodule--helper: remove stray new line t7410: update to new style
"git cherry-pick --quit" failed to remove CHERRY_PICK_HEAD even though we won't be in a cherry-pick session after it returns, which has been corrected. * nd/cherry-pick-quit-fix: cherry-pick: fix --quit not deleting CHERRY_PICK_HEAD
The sideband code learned to optionally paint selected keywords at the beginning of incoming lines on the receiving end. * hn/highlight-sideband-keywords: sideband: do not read beyond the end of input sideband: highlight keywords in remote sideband output
* ab/checkout-default-remote: t2024: mark test using "checkout -p" with PERL prerequisite
Signed-off-by: Junio C Hamano <gitster@pobox.com>
jamill
reviewed
Aug 22, 2018
read-cache.c
Outdated
|
||
if (istate->version == 4) { | ||
previous_name = &previous_name_buf; | ||
mem_pool_init(&istate->ce_mem_pool, 0, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This mem_pool_init
code is common to both the if
and the else
branch.
In a recent update in 2.18 era, "git pack-objects" started producing a larger than necessary packfiles by missing opportunities to use large deltas. * nd/pack-deltify-regression-fix: pack-objects: fix performance issues on packing large deltas
This patch helps address the CPU cost of loading the index by creating multiple threads to divide the work of loading and converting the cache entries across all available CPU cores. It accomplishes this by having the primary thread loop across the index file tracking the offset and (for V4 indexes) expanding the name. It creates a thread to process each block of entries as it comes to them. Once the threads are complete and the cache entries are loaded, the rest of the extensions can be loaded and processed normally on the primary thread. I used p0002-read-cache.sh to generate some performance data: 100,000 entries Test HEAD~3 HEAD~2 --------------------------------------------------------------------------- read_cache/discard_cache 1000 times 14.02(0.01+0.12) 9.81(0.01+0.07) -30.0% 1,000,000 entries Test HEAD~3 HEAD~2 ------------------------------------------------------------------------------ read_cache/discard_cache 1000 times 202.06(0.06+0.09) 155.72(0.03+0.06) -22.9% Signed-off-by: Ben Peart <Ben.Peart@microsoft.com>
This patch helps address the CPU cost of loading the index by loading the cache extensions on a worker thread in parallel with loading the cache entries. This is possible because the current extensions don't access the cache entries in the index_state structure so are OK that they don't all exist yet. The CACHE_EXT_TREE, CACHE_EXT_RESOLVE_UNDO, and CACHE_EXT_UNTRACKED extensions don't even get a pointer to the index so don't have access to the cache entries. CACHE_EXT_LINK only uses the index_state to initialize the split index. CACHE_EXT_FSMONITOR only uses the index_state to save the fsmonitor last update and dirty flags. I used p0002-read-cache.sh to generate some performance data on the cumulative impact: 100,000 entries Test HEAD~3 HEAD~2 --------------------------------------------------------------------------- read_cache/discard_cache 1000 times 14.08(0.01+0.10) 9.72(0.03+0.06) -31.0% 1,000,000 entries Test HEAD~3 HEAD~2 ------------------------------------------------------------------------------ read_cache/discard_cache 1000 times 202.95(0.01+0.07) 154.14(0.03+0.06) -24.1% Signed-off-by: Ben Peart <Ben.Peart@microsoft.com>
…arsing. - strbuf_remove() in expand_name_field() is not exactly a good fit for stripping a part at the end, _setlen() would do the same job and is much cheaper. - the open-coded loop to find the end of the string in expand_name_field() can't beat an optimized strlen() I used p0002-read-cache.sh to generate some performance data on the cumulative impact: 100,000 files Test HEAD~3 HEAD --------------------------------------------------------------------------- read_cache/discard_cache 1000 times 14.08(0.03+0.09) 8.71(0.01+0.09) -38.1% 1,000,000 files Test HEAD~3 HEAD ------------------------------------------------------------------------------ read_cache/discard_cache 1000 times 201.77(0.03+0.07) 149.68(0.04+0.07) -25.8% Suggested by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Ben Peart <Ben.Peart@microsoft.com>
derrickstolee
pushed a commit
that referenced
this pull request
Oct 7, 2021
In a sparse index it is possible for the tree that is being verified to be freed while it is being verified. This happens when the index is sparse but the cache tree is not and index_name_pos() looks up a path from the cache tree that is a descendant of a sparse index entry. That triggers a call to ensure_full_index() which frees the cache tree that is being verified. Carrying on trying to verify the tree after this results in a use-after-free bug. Instead restart the verification if a sparse index is converted to a full index. This bug is triggered by a call to reset_head() in "git rebase --apply". Thanks to René Scharfe and Derrick Stolee for their help analyzing the problem. ==74345==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000001b20 at pc 0x557cbe82d3a2 bp 0x7ffdfee08090 sp 0x7ffdfee08080 READ of size 4 at 0x606000001b20 thread T0 #0 0x557cbe82d3a1 in verify_one /home/phil/src/git/cache-tree.c:863 #1 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840 #2 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840 #3 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840 #4 0x557cbe830a2b in cache_tree_verify /home/phil/src/git/cache-tree.c:910 #5 0x557cbea53741 in write_locked_index /home/phil/src/git/read-cache.c:3250 #6 0x557cbeab7fdd in reset_head /home/phil/src/git/reset.c:87 #7 0x557cbe72147f in cmd_rebase builtin/rebase.c:2074 #8 0x557cbe5bd151 in run_builtin /home/phil/src/git/git.c:461 #9 0x557cbe5bd151 in handle_builtin /home/phil/src/git/git.c:714 #10 0x557cbe5c0503 in run_argv /home/phil/src/git/git.c:781 #11 0x557cbe5c0503 in cmd_main /home/phil/src/git/git.c:912 #12 0x557cbe5bad28 in main /home/phil/src/git/common-main.c:52 #13 0x7fdd4b82eb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24) #14 0x557cbe5bcb8d in _start (/home/phil/src/git/git+0x1b9b8d) 0x606000001b20 is located 0 bytes inside of 56-byte region [0x606000001b20,0x606000001b58) freed by thread T0 here: #0 0x7fdd4bacff19 in __interceptor_free /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:127 #1 0x557cbe82af60 in cache_tree_free /home/phil/src/git/cache-tree.c:35 #2 0x557cbe82aee5 in cache_tree_free /home/phil/src/git/cache-tree.c:31 #3 0x557cbe82aee5 in cache_tree_free /home/phil/src/git/cache-tree.c:31 #4 0x557cbe82aee5 in cache_tree_free /home/phil/src/git/cache-tree.c:31 #5 0x557cbeb2557a in ensure_full_index /home/phil/src/git/sparse-index.c:310 #6 0x557cbea45c4a in index_name_stage_pos /home/phil/src/git/read-cache.c:588 #7 0x557cbe82ce37 in verify_one /home/phil/src/git/cache-tree.c:850 #8 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840 #9 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840 #10 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840 #11 0x557cbe830a2b in cache_tree_verify /home/phil/src/git/cache-tree.c:910 #12 0x557cbea53741 in write_locked_index /home/phil/src/git/read-cache.c:3250 #13 0x557cbeab7fdd in reset_head /home/phil/src/git/reset.c:87 #14 0x557cbe72147f in cmd_rebase builtin/rebase.c:2074 #15 0x557cbe5bd151 in run_builtin /home/phil/src/git/git.c:461 #16 0x557cbe5bd151 in handle_builtin /home/phil/src/git/git.c:714 #17 0x557cbe5c0503 in run_argv /home/phil/src/git/git.c:781 #18 0x557cbe5c0503 in cmd_main /home/phil/src/git/git.c:912 #19 0x557cbe5bad28 in main /home/phil/src/git/common-main.c:52 #20 0x7fdd4b82eb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24) previously allocated by thread T0 here: #0 0x7fdd4bad0459 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154 #1 0x557cbebc1807 in xcalloc /home/phil/src/git/wrapper.c:140 #2 0x557cbe82b7d8 in cache_tree /home/phil/src/git/cache-tree.c:17 #3 0x557cbe82b7d8 in prime_cache_tree_rec /home/phil/src/git/cache-tree.c:763 #4 0x557cbe82b837 in prime_cache_tree_rec /home/phil/src/git/cache-tree.c:764 #5 0x557cbe82b837 in prime_cache_tree_rec /home/phil/src/git/cache-tree.c:764 #6 0x557cbe8304e1 in prime_cache_tree /home/phil/src/git/cache-tree.c:779 #7 0x557cbeab7fa7 in reset_head /home/phil/src/git/reset.c:85 #8 0x557cbe72147f in cmd_rebase builtin/rebase.c:2074 #9 0x557cbe5bd151 in run_builtin /home/phil/src/git/git.c:461 #10 0x557cbe5bd151 in handle_builtin /home/phil/src/git/git.c:714 #11 0x557cbe5c0503 in run_argv /home/phil/src/git/git.c:781 #12 0x557cbe5c0503 in cmd_main /home/phil/src/git/git.c:912 #13 0x557cbe5bad28 in main /home/phil/src/git/common-main.c:52 #14 0x7fdd4b82eb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24) Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
neerajsi-msft
pushed a commit
to neerajsi-msft/git
that referenced
this pull request
Oct 26, 2021
In a sparse index it is possible for the tree that is being verified to be freed while it is being verified. This happens when the index is sparse but the cache tree is not and index_name_pos() looks up a path from the cache tree that is a descendant of a sparse index entry. That triggers a call to ensure_full_index() which frees the cache tree that is being verified. Carrying on trying to verify the tree after this results in a use-after-free bug. Instead restart the verification if a sparse index is converted to a full index. This bug is triggered by a call to reset_head() in "git rebase --apply". Thanks to René Scharfe and Derrick Stolee for their help analyzing the problem. ==74345==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000001b20 at pc 0x557cbe82d3a2 bp 0x7ffdfee08090 sp 0x7ffdfee08080 READ of size 4 at 0x606000001b20 thread T0 #0 0x557cbe82d3a1 in verify_one /home/phil/src/git/cache-tree.c:863 #1 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840 microsoft#2 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840 microsoft#3 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840 microsoft#4 0x557cbe830a2b in cache_tree_verify /home/phil/src/git/cache-tree.c:910 microsoft#5 0x557cbea53741 in write_locked_index /home/phil/src/git/read-cache.c:3250 microsoft#6 0x557cbeab7fdd in reset_head /home/phil/src/git/reset.c:87 microsoft#7 0x557cbe72147f in cmd_rebase builtin/rebase.c:2074 microsoft#8 0x557cbe5bd151 in run_builtin /home/phil/src/git/git.c:461 microsoft#9 0x557cbe5bd151 in handle_builtin /home/phil/src/git/git.c:714 microsoft#10 0x557cbe5c0503 in run_argv /home/phil/src/git/git.c:781 microsoft#11 0x557cbe5c0503 in cmd_main /home/phil/src/git/git.c:912 microsoft#12 0x557cbe5bad28 in main /home/phil/src/git/common-main.c:52 microsoft#13 0x7fdd4b82eb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24) microsoft#14 0x557cbe5bcb8d in _start (/home/phil/src/git/git+0x1b9b8d) 0x606000001b20 is located 0 bytes inside of 56-byte region [0x606000001b20,0x606000001b58) freed by thread T0 here: #0 0x7fdd4bacff19 in __interceptor_free /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:127 #1 0x557cbe82af60 in cache_tree_free /home/phil/src/git/cache-tree.c:35 microsoft#2 0x557cbe82aee5 in cache_tree_free /home/phil/src/git/cache-tree.c:31 microsoft#3 0x557cbe82aee5 in cache_tree_free /home/phil/src/git/cache-tree.c:31 microsoft#4 0x557cbe82aee5 in cache_tree_free /home/phil/src/git/cache-tree.c:31 microsoft#5 0x557cbeb2557a in ensure_full_index /home/phil/src/git/sparse-index.c:310 microsoft#6 0x557cbea45c4a in index_name_stage_pos /home/phil/src/git/read-cache.c:588 microsoft#7 0x557cbe82ce37 in verify_one /home/phil/src/git/cache-tree.c:850 microsoft#8 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840 microsoft#9 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840 microsoft#10 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840 microsoft#11 0x557cbe830a2b in cache_tree_verify /home/phil/src/git/cache-tree.c:910 microsoft#12 0x557cbea53741 in write_locked_index /home/phil/src/git/read-cache.c:3250 microsoft#13 0x557cbeab7fdd in reset_head /home/phil/src/git/reset.c:87 microsoft#14 0x557cbe72147f in cmd_rebase builtin/rebase.c:2074 microsoft#15 0x557cbe5bd151 in run_builtin /home/phil/src/git/git.c:461 microsoft#16 0x557cbe5bd151 in handle_builtin /home/phil/src/git/git.c:714 microsoft#17 0x557cbe5c0503 in run_argv /home/phil/src/git/git.c:781 microsoft#18 0x557cbe5c0503 in cmd_main /home/phil/src/git/git.c:912 microsoft#19 0x557cbe5bad28 in main /home/phil/src/git/common-main.c:52 microsoft#20 0x7fdd4b82eb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24) previously allocated by thread T0 here: #0 0x7fdd4bad0459 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154 #1 0x557cbebc1807 in xcalloc /home/phil/src/git/wrapper.c:140 microsoft#2 0x557cbe82b7d8 in cache_tree /home/phil/src/git/cache-tree.c:17 microsoft#3 0x557cbe82b7d8 in prime_cache_tree_rec /home/phil/src/git/cache-tree.c:763 microsoft#4 0x557cbe82b837 in prime_cache_tree_rec /home/phil/src/git/cache-tree.c:764 microsoft#5 0x557cbe82b837 in prime_cache_tree_rec /home/phil/src/git/cache-tree.c:764 microsoft#6 0x557cbe8304e1 in prime_cache_tree /home/phil/src/git/cache-tree.c:779 microsoft#7 0x557cbeab7fa7 in reset_head /home/phil/src/git/reset.c:85 microsoft#8 0x557cbe72147f in cmd_rebase builtin/rebase.c:2074 microsoft#9 0x557cbe5bd151 in run_builtin /home/phil/src/git/git.c:461 microsoft#10 0x557cbe5bd151 in handle_builtin /home/phil/src/git/git.c:714 microsoft#11 0x557cbe5c0503 in run_argv /home/phil/src/git/git.c:781 microsoft#12 0x557cbe5c0503 in cmd_main /home/phil/src/git/git.c:912 microsoft#13 0x557cbe5bad28 in main /home/phil/src/git/common-main.c:52 microsoft#14 0x7fdd4b82eb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24) Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
dscho
pushed a commit
that referenced
this pull request
Oct 30, 2021
In a sparse index it is possible for the tree that is being verified to be freed while it is being verified. This happens when the index is sparse but the cache tree is not and index_name_pos() looks up a path from the cache tree that is a descendant of a sparse index entry. That triggers a call to ensure_full_index() which frees the cache tree that is being verified. Carrying on trying to verify the tree after this results in a use-after-free bug. Instead restart the verification if a sparse index is converted to a full index. This bug is triggered by a call to reset_head() in "git rebase --apply". Thanks to René Scharfe and Derrick Stolee for their help analyzing the problem. ==74345==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000001b20 at pc 0x557cbe82d3a2 bp 0x7ffdfee08090 sp 0x7ffdfee08080 READ of size 4 at 0x606000001b20 thread T0 #0 0x557cbe82d3a1 in verify_one /home/phil/src/git/cache-tree.c:863 #1 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840 #2 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840 #3 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840 #4 0x557cbe830a2b in cache_tree_verify /home/phil/src/git/cache-tree.c:910 #5 0x557cbea53741 in write_locked_index /home/phil/src/git/read-cache.c:3250 #6 0x557cbeab7fdd in reset_head /home/phil/src/git/reset.c:87 #7 0x557cbe72147f in cmd_rebase builtin/rebase.c:2074 #8 0x557cbe5bd151 in run_builtin /home/phil/src/git/git.c:461 #9 0x557cbe5bd151 in handle_builtin /home/phil/src/git/git.c:714 #10 0x557cbe5c0503 in run_argv /home/phil/src/git/git.c:781 #11 0x557cbe5c0503 in cmd_main /home/phil/src/git/git.c:912 #12 0x557cbe5bad28 in main /home/phil/src/git/common-main.c:52 #13 0x7fdd4b82eb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24) #14 0x557cbe5bcb8d in _start (/home/phil/src/git/git+0x1b9b8d) 0x606000001b20 is located 0 bytes inside of 56-byte region [0x606000001b20,0x606000001b58) freed by thread T0 here: #0 0x7fdd4bacff19 in __interceptor_free /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:127 #1 0x557cbe82af60 in cache_tree_free /home/phil/src/git/cache-tree.c:35 #2 0x557cbe82aee5 in cache_tree_free /home/phil/src/git/cache-tree.c:31 #3 0x557cbe82aee5 in cache_tree_free /home/phil/src/git/cache-tree.c:31 #4 0x557cbe82aee5 in cache_tree_free /home/phil/src/git/cache-tree.c:31 #5 0x557cbeb2557a in ensure_full_index /home/phil/src/git/sparse-index.c:310 #6 0x557cbea45c4a in index_name_stage_pos /home/phil/src/git/read-cache.c:588 #7 0x557cbe82ce37 in verify_one /home/phil/src/git/cache-tree.c:850 #8 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840 #9 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840 #10 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840 #11 0x557cbe830a2b in cache_tree_verify /home/phil/src/git/cache-tree.c:910 #12 0x557cbea53741 in write_locked_index /home/phil/src/git/read-cache.c:3250 #13 0x557cbeab7fdd in reset_head /home/phil/src/git/reset.c:87 #14 0x557cbe72147f in cmd_rebase builtin/rebase.c:2074 #15 0x557cbe5bd151 in run_builtin /home/phil/src/git/git.c:461 #16 0x557cbe5bd151 in handle_builtin /home/phil/src/git/git.c:714 #17 0x557cbe5c0503 in run_argv /home/phil/src/git/git.c:781 #18 0x557cbe5c0503 in cmd_main /home/phil/src/git/git.c:912 #19 0x557cbe5bad28 in main /home/phil/src/git/common-main.c:52 #20 0x7fdd4b82eb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24) previously allocated by thread T0 here: #0 0x7fdd4bad0459 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154 #1 0x557cbebc1807 in xcalloc /home/phil/src/git/wrapper.c:140 #2 0x557cbe82b7d8 in cache_tree /home/phil/src/git/cache-tree.c:17 #3 0x557cbe82b7d8 in prime_cache_tree_rec /home/phil/src/git/cache-tree.c:763 #4 0x557cbe82b837 in prime_cache_tree_rec /home/phil/src/git/cache-tree.c:764 #5 0x557cbe82b837 in prime_cache_tree_rec /home/phil/src/git/cache-tree.c:764 #6 0x557cbe8304e1 in prime_cache_tree /home/phil/src/git/cache-tree.c:779 #7 0x557cbeab7fa7 in reset_head /home/phil/src/git/reset.c:85 #8 0x557cbe72147f in cmd_rebase builtin/rebase.c:2074 #9 0x557cbe5bd151 in run_builtin /home/phil/src/git/git.c:461 #10 0x557cbe5bd151 in handle_builtin /home/phil/src/git/git.c:714 #11 0x557cbe5c0503 in run_argv /home/phil/src/git/git.c:781 #12 0x557cbe5c0503 in cmd_main /home/phil/src/git/git.c:912 #13 0x557cbe5bad28 in main /home/phil/src/git/common-main.c:52 #14 0x7fdd4b82eb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24) Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
dscho
added a commit
that referenced
this pull request
Oct 9, 2024
An internal customer reported a segfault when running `git sparse-checkout set` with the `index.sparse` config enabled. I was unable to reproduce it locally, but with their help we debugged into the failing process and discovered the following stacktrace: ``` #0 0x00007ff6318fb7b0 in rehash (map=0x3dfb00d0440, newsize=1048576) at hashmap.c:125 #1 0x00007ff6318fbc66 in hashmap_add (map=0x3dfb00d0440, entry=0x3dfb5c58bc8) at hashmap.c:247 #2 0x00007ff631937a70 in hash_index_entry (istate=0x3dfb00d0400, ce=0x3dfb5c58bc8) at name-hash.c:122 #3 0x00007ff631938a2f in add_name_hash (istate=0x3dfb00d0400, ce=0x3dfb5c58bc8) at name-hash.c:638 #4 0x00007ff631a064de in set_index_entry (istate=0x3dfb00d0400, nr=8291, ce=0x3dfb5c58bc8) at sparse-index.c:255 #5 0x00007ff631a06692 in add_path_to_index (oid=0x5ff130, base=0x5ff580, path=0x3dfb4b725da "<redacted>", mode=33188, context=0x5ff570) at sparse-index.c:307 #6 0x00007ff631a3b48c in read_tree_at (r=0x7ff631c026a0 <the_repo>, tree=0x3dfb5b41f60, base=0x5ff580, depth=2, pathspec=0x5ff5a0, fn=0x7ff631a064e5 <add_path_to_index>, context=0x5ff570) at tree.c:46 #7 0x00007ff631a3b60b in read_tree_at (r=0x7ff631c026a0 <the_repo>, tree=0x3dfb5b41e80, base=0x5ff580, depth=1, pathspec=0x5ff5a0, fn=0x7ff631a064e5 <add_path_to_index>, context=0x5ff570) at tree.c:80 #8 0x00007ff631a3b60b in read_tree_at (r=0x7ff631c026a0 <the_repo>, tree=0x3dfb5b41ac8, base=0x5ff580, depth=0, pathspec=0x5ff5a0, fn=0x7ff631a064e5 <add_path_to_index>, context=0x5ff570) at tree.c:80 #9 0x00007ff631a06a95 in expand_index (istate=0x3dfb00d0100, pl=0x0) at sparse-index.c:422 #10 0x00007ff631a06cbd in ensure_full_index (istate=0x3dfb00d0100) at sparse-index.c:456 #11 0x00007ff631990d08 in index_name_stage_pos (istate=0x3dfb00d0100, name=0x3dfb0020080 "algorithm/levenshtein", namelen=21, stage=0, search_mode=EXPAND_SPARSE) at read-cache.c:556 #12 0x00007ff631990d6c in index_name_pos (istate=0x3dfb00d0100, name=0x3dfb0020080 "algorithm/levenshtein", namelen=21) at read-cache.c:566 #13 0x00007ff63180dbb5 in sanitize_paths (argc=185, argv=0x3dfb0030018, prefix=0x0, skip_checks=0) at builtin/sparse-checkout.c:756 #14 0x00007ff63180de50 in sparse_checkout_set (argc=185, argv=0x3dfb0030018, prefix=0x0) at builtin/sparse-checkout.c:860 #15 0x00007ff63180e6c5 in cmd_sparse_checkout (argc=186, argv=0x3dfb0030018, prefix=0x0) at builtin/sparse-checkout.c:1063 #16 0x00007ff6317234cb in run_builtin (p=0x7ff631ad9b38 <commands+2808>, argc=187, argv=0x3dfb0030018) at git.c:548 #17 0x00007ff6317239c0 in handle_builtin (argc=187, argv=0x3dfb0030018) at git.c:808 #18 0x00007ff631723c7d in run_argv (argcp=0x5ffdd0, argv=0x5ffd78) at git.c:877 #19 0x00007ff6317241d1 in cmd_main (argc=187, argv=0x3dfb0030018) at git.c:1017 #20 0x00007ff631838b60 in main (argc=190, argv=0x3dfb0030000) at common-main.c:64 ``` The very bottom of the stack being the `rehash()` method from `hashmap.c` as called within the `name-hash` API made me look at where these hashmaps were being used in the sparse index logic. These were being copied across indexes, which seems dangerous. Indeed, clearing these hashmaps and setting them as not initialized fixes the segfault. The second commit is a response to a test failure that happens in `t1092-sparse-checkout-compatibility.sh` where `git stash pop` starts to fail because the underlying `git checkout-index` process fails due to colliding files. Passing the `-f` flag appears to work, but it's unclear why this name-hash change causes that change in behavior.
dscho
added a commit
that referenced
this pull request
Oct 9, 2024
An internal customer reported a segfault when running `git sparse-checkout set` with the `index.sparse` config enabled. I was unable to reproduce it locally, but with their help we debugged into the failing process and discovered the following stacktrace: ``` #0 0x00007ff6318fb7b0 in rehash (map=0x3dfb00d0440, newsize=1048576) at hashmap.c:125 #1 0x00007ff6318fbc66 in hashmap_add (map=0x3dfb00d0440, entry=0x3dfb5c58bc8) at hashmap.c:247 #2 0x00007ff631937a70 in hash_index_entry (istate=0x3dfb00d0400, ce=0x3dfb5c58bc8) at name-hash.c:122 #3 0x00007ff631938a2f in add_name_hash (istate=0x3dfb00d0400, ce=0x3dfb5c58bc8) at name-hash.c:638 #4 0x00007ff631a064de in set_index_entry (istate=0x3dfb00d0400, nr=8291, ce=0x3dfb5c58bc8) at sparse-index.c:255 #5 0x00007ff631a06692 in add_path_to_index (oid=0x5ff130, base=0x5ff580, path=0x3dfb4b725da "<redacted>", mode=33188, context=0x5ff570) at sparse-index.c:307 #6 0x00007ff631a3b48c in read_tree_at (r=0x7ff631c026a0 <the_repo>, tree=0x3dfb5b41f60, base=0x5ff580, depth=2, pathspec=0x5ff5a0, fn=0x7ff631a064e5 <add_path_to_index>, context=0x5ff570) at tree.c:46 #7 0x00007ff631a3b60b in read_tree_at (r=0x7ff631c026a0 <the_repo>, tree=0x3dfb5b41e80, base=0x5ff580, depth=1, pathspec=0x5ff5a0, fn=0x7ff631a064e5 <add_path_to_index>, context=0x5ff570) at tree.c:80 #8 0x00007ff631a3b60b in read_tree_at (r=0x7ff631c026a0 <the_repo>, tree=0x3dfb5b41ac8, base=0x5ff580, depth=0, pathspec=0x5ff5a0, fn=0x7ff631a064e5 <add_path_to_index>, context=0x5ff570) at tree.c:80 #9 0x00007ff631a06a95 in expand_index (istate=0x3dfb00d0100, pl=0x0) at sparse-index.c:422 #10 0x00007ff631a06cbd in ensure_full_index (istate=0x3dfb00d0100) at sparse-index.c:456 #11 0x00007ff631990d08 in index_name_stage_pos (istate=0x3dfb00d0100, name=0x3dfb0020080 "algorithm/levenshtein", namelen=21, stage=0, search_mode=EXPAND_SPARSE) at read-cache.c:556 #12 0x00007ff631990d6c in index_name_pos (istate=0x3dfb00d0100, name=0x3dfb0020080 "algorithm/levenshtein", namelen=21) at read-cache.c:566 #13 0x00007ff63180dbb5 in sanitize_paths (argc=185, argv=0x3dfb0030018, prefix=0x0, skip_checks=0) at builtin/sparse-checkout.c:756 #14 0x00007ff63180de50 in sparse_checkout_set (argc=185, argv=0x3dfb0030018, prefix=0x0) at builtin/sparse-checkout.c:860 #15 0x00007ff63180e6c5 in cmd_sparse_checkout (argc=186, argv=0x3dfb0030018, prefix=0x0) at builtin/sparse-checkout.c:1063 #16 0x00007ff6317234cb in run_builtin (p=0x7ff631ad9b38 <commands+2808>, argc=187, argv=0x3dfb0030018) at git.c:548 #17 0x00007ff6317239c0 in handle_builtin (argc=187, argv=0x3dfb0030018) at git.c:808 #18 0x00007ff631723c7d in run_argv (argcp=0x5ffdd0, argv=0x5ffd78) at git.c:877 #19 0x00007ff6317241d1 in cmd_main (argc=187, argv=0x3dfb0030018) at git.c:1017 #20 0x00007ff631838b60 in main (argc=190, argv=0x3dfb0030000) at common-main.c:64 ``` The very bottom of the stack being the `rehash()` method from `hashmap.c` as called within the `name-hash` API made me look at where these hashmaps were being used in the sparse index logic. These were being copied across indexes, which seems dangerous. Indeed, clearing these hashmaps and setting them as not initialized fixes the segfault. The second commit is a response to a test failure that happens in `t1092-sparse-checkout-compatibility.sh` where `git stash pop` starts to fail because the underlying `git checkout-index` process fails due to colliding files. Passing the `-f` flag appears to work, but it's unclear why this name-hash change causes that change in behavior.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This patch helps address the CPU cost of loading the index by creating threads to divide the work of loading and converting the cache across all available CPU cores.
It accomplishes this by having the primary thread loop across the index file the offset and (for V4 indexes) expanding the name. It creates a to process each block of entries as it comes to them. Once the are complete and the cache entries are loaded, the rest of the can be loaded and processed normally on the primary thread.
Signed-off-by: Ben Peart Ben.Peart@microsoft.com