Add find_bisect_testcases and fix merge_bases_count #2 (Open)

gaalcaras wants to merge 2 commits into chriscool:bisect-sprint-work from gaalcaras:bisect-sprint-work
Conversation
chriscool pushed a commit that referenced this pull request on Mar 26, 2018
The function ce_write_entry() uses a 'self-initialised' variable construct, for the symbol 'saved_namelen', to suppress a gcc '-Wmaybe-uninitialized' warning, given that the warning is a false positive. For the purposes of this discussion, the ce_write_entry() function has three code blocks of interest, that look like so:

    /* block #1 */
    if (ce->ce_flags & CE_STRIP_NAME) {
        saved_namelen = ce_namelen(ce);
        ce->ce_namelen = 0;
    }

    /* block #2 */
    /*
     * several code blocks that contain, among others, calls
     * to copy_cache_entry_to_ondisk(ondisk, ce);
     */

    /* block #3 */
    if (ce->ce_flags & CE_STRIP_NAME) {
        ce->ce_namelen = saved_namelen;
        ce->ce_flags &= ~CE_STRIP_NAME;
    }

The warning implies that gcc thinks it is possible that the first block is not entered, the calls to copy_cache_entry_to_ondisk() could toggle the CE_STRIP_NAME flag on, thereby entering block #3 with saved_namelen unset. However, the copy_cache_entry_to_ondisk() function does not write to ce->ce_flags (it only reads). gcc could easily determine this, since that function is local to this file, but it obviously doesn't.

In order to suppress this warning, we make it clear to the reader (human and compiler), that block #3 will only be entered when the first block has been entered, by introducing a new 'stripped_name' boolean variable. We also take the opportunity to change the type of 'saved_namelen' to 'unsigned int' to match ce->ce_namelen.

Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
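To make the shape of that fix concrete, here is a minimal, self-contained sketch of the same pattern (this is not Git's ce_write_entry(); the struct and flag names are invented): an explicit boolean records whether block #1 ran, so block #3 no longer depends on the compiler proving that the flag cannot be toggled in between.

    /* Illustrative stand-in for the pattern described above. */
    #include <stdio.h>

    struct entry {
        unsigned int flags;
        unsigned int namelen;
    };
    #define FLAG_STRIP_NAME 1u

    static void write_entry(struct entry *e)
    {
        unsigned int saved_namelen = 0;
        int stripped_name = 0;

        /* block #1 */
        if (e->flags & FLAG_STRIP_NAME) {
            saved_namelen = e->namelen;
            e->namelen = 0;
            stripped_name = 1;
        }

        /* block #2: write the entry out (omitted here) */

        /* block #3: guarded by the boolean, not by re-checking the flag */
        if (stripped_name) {
            e->namelen = saved_namelen;
            e->flags &= ~FLAG_STRIP_NAME;
        }
    }

    int main(void)
    {
        struct entry e = { FLAG_STRIP_NAME, 7 };
        write_entry(&e);
        printf("namelen=%u flags=%u\n", e.namelen, e.flags);
        return 0;
    }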
chriscool pushed a commit that referenced this pull request on Jun 22, 2018
sequencer: introduce the `merge` command

This patch is part of the effort to reimplement `--preserve-merges` with a substantially improved design, a design that has been developed in the Git for Windows project to maintain the dozens of Windows-specific patch series on top of upstream Git.

The previous patch implemented the `label` and `reset` commands to label commits and to reset to labeled commits. This patch adds the `merge` command, with the following syntax:

    merge [-C <commit>] <rev> # <oneline>

The <commit> parameter in this instance is the *original* merge commit, whose author and message will be used for the merge commit that is about to be created. The <rev> parameter refers to the (possibly rewritten) revision to merge. Let's see an example of a todo list:

    label onto

    # Branch abc
    reset onto
    pick deadbee Hello, world!
    label abc

    reset onto
    pick cafecafe And now for something completely different
    merge -C baaabaaa abc # Merge the branch 'abc' into master

To edit the merge commit's message (a "reword" for merges, if you will), use `-c` (lower-case) instead of `-C`; this convention was borrowed from `git commit` that also supports `-c` and `-C` with similar meanings.

To create *new* merges, i.e. without copying the commit message from an existing commit, simply omit the `-C <commit>` parameter (which will open an editor for the merge message):

    merge abc

This comes in handy when splitting a branch into two or more branches.

Note: this patch only adds support for recursive merges, to keep things simple. Support for octopus merges will be added later in a separate patch series, support for merges using strategies other than the recursive merge is left for the future.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>

# The commit message #2 will be skipped:
# fixup! sequencer: introduce the `merge` command

Signed-off-by: Junio C Hamano <gitster@pobox.com>
chriscool pushed a commit that referenced this pull request on Nov 28, 2019
Signed-off-by: Alessandro Menti <alessandro.menti@alessandromenti.it>
chriscool pushed a commit that referenced this pull request on Nov 28, 2019
* 'update-italian-translation' of github.com:AlessandroMenti/git-po: l10n: it.po: update the Italian translation for Git 2.24.0 round #2
chriscool pushed a commit that referenced this pull request on Nov 28, 2019
l10n-2.24.0-rnd2

* tag 'l10n-2.24.0-rnd2' of https://github.com/git-l10n/git-po:
  l10n: zh_CN: for git v2.24.0 l10n round 1~2
  l10n: de.po: Update German translation
  l10n: sv.po: Update Swedish translation (4695t0f0u)
  l10n: bg.po: Updated Bulgarian translation (4694)
  l10n: vi(4694t): Updated translation for v2.24.0
  l10n: es: 2.24.0 round 2
  l10n: it.po: update the Italian translation for Git 2.24.0 round #2
  l10n: fr v2.24.0 rnd2
  l10n: git.pot: v2.24.0 round 2 (1 new)
  l10n: it.po: update the Italian translation for Git 2.24.0
  l10n: fr 2.24.0 rnd 1
  l10n: git.pot: v2.24.0 round 1 (35 new, 16 removed)
  l10n: bg.po: Updated Bulgarian translation (4693)
  l10n: sv.po: Update Swedish translation (4674t0f0u)
  l10n: Update Catalan translation
chriscool pushed a commit that referenced this pull request on Apr 22, 2020
…ev()

In 'builtin/name-rev.c' in the name_rev() function there is a loop iterating over all parents of the given commit, and the loop body looks like this:

    if (parent_number > 1) {
            if (generation > 0)
                    // branch #1
                    new_name = ...
            else
                    // branch #2
                    new_name = ...
            name_rev(parent, new_name, ...);
    } else {
            // branch #3
            name_rev(...);
    }

These conditions are not covered properly in the test suite. As far as purely test coverage goes, they are all executed several times over in 't6120-describe.sh'. However, they don't directly influence the command's output, because the repository used in that test script contains several branches and tags pointing somewhere into the middle of the commit DAG, and thus result in a better name for the to-be-named commit. This can hide bugs: e.g. by replacing the 'new_name' parameter of the first recursive name_rev() call with 'tip_name' (effectively making both branch #1 and #2 a noop) 'git name-rev --all' shows thousands of bogus names in the Git repository, but the whole test suite still passes successfully. In an early version of a later patch in this series I managed to mess up all three branches (at once!), but the test suite still passed.

So add a new test case that operates on the following history:

    A--------------master
     \            /
      \----------M2
       \        /
        \---M1-C
         \ /
          B

and names the commit 'B' to make sure that all three branches are crucial to determine 'B's name:

  - There is only a single ref, so all names are based on 'master', without any undesired interference from other refs.

  - Each time name_rev() follows the second parent of a merge commit, it appends "^2" to the name. Following 'master's second parent right at the start ensures that all commits on the ancestry path from 'master' to 'B' have a different base name from the original 'tip_name' of the very first name_rev() invocation. Currently, while name_rev() is recursive, it doesn't matter, but it will be necessary to properly cover all three branches after the recursion is eliminated later in this series.

  - Following 'M2's second parent makes sure that branch #2 (i.e. when 'generation = 0') affects 'B's name.

  - Following the only parent of the non-merge commit 'C' ensures that branch #3 affects 'B's name, and that it increments 'generation'.

  - Coming from 'C' 'generation' is 1, thus following 'M1's second parent makes sure that branch #1 affects 'B's name.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
chriscool pushed a commit that referenced this pull request on Apr 22, 2020
Recent versions of the gcc and clang Address Sanitizer produce test failures related to regexec(). This triggers with gcc-10 and clang-8 (but not gcc-9 nor clang-7). Running: make CC=gcc-10 SANITIZE=address test results in failures in t4018, t3206, and t4062. The cause seems to be that when built with ASan, we use a different version of regexec() than normal. And this version doesn't understand the REG_STARTEND flag. Here's my evidence supporting that. The failure in t4062 is an ASan warning: expecting success of 4062.2 '-G matches': git diff --name-only -G "^(0{64}){64}$" HEAD^ >out && test 4096-zeroes.txt = "$(cat out)" ================================================================= ==672994==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7fa76f672000 at pc 0x7fa7726f75b6 bp 0x7ffe41bdda70 sp 0x7ffe41bdd220 READ of size 4097 at 0x7fa76f672000 thread T0 #0 0x7fa7726f75b5 (/lib/x86_64-linux-gnu/libasan.so.6+0x4f5b5) #1 0x562ae0c9c40e in regexec_buf /home/peff/compile/git/git-compat-util.h:1117 #2 0x562ae0c9c40e in diff_grep /home/peff/compile/git/diffcore-pickaxe.c:52 #3 0x562ae0c9cc28 in pickaxe_match /home/peff/compile/git/diffcore-pickaxe.c:166 [...] In this case we're looking in a buffer which was mmap'd via reuse_worktree_file(), and whose size is 4096 bytes. But libasan's regex tries to look at byte 4097 anyway! If we tweak Git like this: diff --git a/diff.c b/diff.c index 8e2914c031..cfae60c120 100644 --- a/diff.c +++ b/diff.c @@ -3880,7 +3880,7 @@ static int reuse_worktree_file(struct index_state *istate, */ if (ce_uptodate(ce) || (!lstat(name, &st) && !ie_match_stat(istate, ce, &st, 0))) - return 1; + return 0; return 0; } to use a regular buffer (with a trailing NUL) instead of an mmap, then the complaint goes away. The other failures are actually diff output with an incorrect funcname header. If I instrument xdiff to show the funcname matching like so: diff --git a/xdiff-interface.c b/xdiff-interface.c index 8509f9ea22..f6c3dc1986 100644 --- a/xdiff-interface.c +++ b/xdiff-interface.c @@ -197,6 +197,7 @@ struct ff_regs { struct ff_reg { regex_t re; int negate; + char *printable; } *array; }; @@ -218,7 +219,12 @@ static long ff_regexp(const char *line, long len, for (i = 0; i < regs->nr; i++) { struct ff_reg *reg = regs->array + i; - if (!regexec_buf(®->re, line, len, 2, pmatch, 0)) { + int ret = regexec_buf(®->re, line, len, 2, pmatch, 0); + warning("regexec %s:\n regex: %s\n buf: %.*s", + ret == 0 ? 
"matched" : "did not match", + reg->printable, + (int)len, line); + if (!ret) { if (reg->negate) return -1; break; @@ -264,6 +270,7 @@ void xdiff_set_find_func(xdemitconf_t *xecfg, const char *value, int cflags) expression = value; if (regcomp(®->re, expression, cflags)) die("Invalid regexp to look for hunk header: %s", expression); + reg->printable = xstrdup(expression); free(buffer); value = ep + 1; } then when compiling with ASan and gcc-10, running the diff from t4018.66 produces this: $ git diff -U1 cpp-skip-access-specifiers warning: regexec did not match: regex: ^[ ]*[A-Za-z_][A-Za-z_0-9]*:[[:space:]]*($|/[/*]) buf: private: warning: regexec matched: regex: ^((::[[:space:]]*)?[A-Za-z_].*)$ buf: private: diff --git a/cpp-skip-access-specifiers b/cpp-skip-access-specifiers index 4d4a9db..ebd6f42 100644 --- a/cpp-skip-access-specifiers +++ b/cpp-skip-access-specifiers @@ -6,3 +6,3 @@ private: void DoSomething(); int ChangeMe; }; void DoSomething(); - int ChangeMe; + int IWasChanged; }; That first regex should match (and is negated, so it should be telling us _not_ to match "private:"). But it wouldn't if regexec() is looking at the whole buffer, and not just the length-limited line we've fed to regexec_buf(). So this is consistent again with REG_STARTEND being ignored. The correct output (compiling without ASan, or gcc-9 with Asan) looks like this: warning: regexec matched: regex: ^[ ]*[A-Za-z_][A-Za-z_0-9]*:[[:space:]]*($|/[/*]) buf: private: [...more lines that we end up not using...] warning: regexec matched: regex: ^((::[[:space:]]*)?[A-Za-z_].*)$ buf: class RIGHT : public Baseclass diff --git a/cpp-skip-access-specifiers b/cpp-skip-access-specifiers index 4d4a9db..ebd6f42 100644 --- a/cpp-skip-access-specifiers +++ b/cpp-skip-access-specifiers @@ -6,3 +6,3 @@ class RIGHT : public Baseclass void DoSomething(); - int ChangeMe; + int IWasChanged; }; So it really does seem like libasan's regex engine is ignoring REG_STARTEND. We should be able to work around it by compiling with NO_REGEX, which would use our local regexec(). But to make matters even more interesting, this isn't enough by itself. Because ASan has support from the compiler, it doesn't seem to intercept our call to regexec() at the dynamic library level. It actually recognizes when we are compiling a call to regexec() and replaces it with ASan-specific code at that point. And unlike most of our other compat code, where we might have git_mmap() or similar, the actual symbol name in the compiled compat/regex code is regexec(). So just compiling with NO_REGEX isn't enough; we still end up in libasan! We can work around that by having the preprocessor replace regexec with git_regexec (both in the callers and in the actual implementation), and we truly end up with a call to our custom regex code, even when compiling with ASan. That's probably a good thing to do anyway, as it means anybody looking at the symbols later (e.g., in a debugger) would have a better indication of which function is which. So we'll do the same for the other common regex functions (even though just regexec() is enough to fix this ASan problem). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
chriscool pushed a commit that referenced this pull request on Jun 12, 2020
Signed-off-by: Alessandro Menti <alessandro.menti@alessandromenti.it>
chriscool pushed a commit that referenced this pull request on Jun 12, 2020
* 'update-italian-translation' of github.com:AlessandroMenti/git-po: l10n: it.po: update for Git 2.27.0 round #2
chriscool pushed a commit that referenced this pull request on Jun 12, 2020
l10n-2.27.0-rnd2

* tag 'l10n-2.27.0-rnd2' of git://github.com/git-l10n/git-po: (23 commits)
  l10n: zh_TW.po: v2.27.0 round 2 (0 untranslated)
  l10n: zh_TW.po: v2.27.0 round 1 (0 untranslated)
  l10n: de.po: Fix typo in the German translation of octopus
  l10n: de.po: Update German translation for Git 2.27.0
  l10n: it.po: update for Git 2.27.0 round #2
  l10n: tr: v2.27.0 round 2
  l10n: fr.po v2.27.0 rnd 2
  l10n: bg.po: Updated Bulgarian translation (4875t)
  l10n: Update Catalan translation
  l10n: sv.po: Update Swedish translation (4875t0f0u)
  l10n: vi(4875t): Updated Vietnamses translation for 2.27.0rd2
  l10n: zh_CN: for git v2.27.0 l10n round 1~2
  l10n: git.pot: v2.27.0 round 2 (+1)
  l10n: Update Catalan translation
  l10n: vi(4874t): Updated Vietnamses translation for 2.27.0
  l10n: es: 2.27.0 round 1
  l10n: bg.po: Updated Bulgarian translation (4868t)
  l10n: fr v2.27.0 rnd 1
  l10n: sv.po: Update Swedish translation (4839t0f0u)
  l10n: tr: v2.27.0 round 1
  ...
chriscool pushed a commit that referenced this pull request on Jul 20, 2020
The Chunk Lookup table stores the chunks' starting offset in the commit-graph file, not their sizes. Consequently, the size of a chunk can only be calculated by subtracting its offset from the offset of the subsequent chunk (or that of the terminating label). This is currenly implemented in a bit complicated way: as we iterate over the entries of the Chunk Lookup table, we check the id of each chunk and store its starting offset, then we check the id of the last seen chunk and calculate its size using its previously saved offset. At the moment there is only one chunk for which we calculate its size, but this patch series will add more, and the repeated chunk id checks are not that pretty. Instead let's read ahead the offset of the next chunk on each iteration, so we can calculate the size of each chunk right away, right where we store its starting offset. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
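The read-ahead idea reduces to very little code. A stand-alone sketch (the chunk ids and offsets below are invented; in the real format each Chunk Lookup entry is a 4-byte id followed by an 8-byte offset, with a terminating entry closing the table):

    #include <stdint.h>
    #include <stdio.h>

    struct table_entry {
        uint32_t id;      /* chunk id; the final entry terminates the table */
        uint64_t offset;  /* chunk start, relative to the file */
    };

    static void scan_chunks(const struct table_entry *table, size_t nr_chunks)
    {
        for (size_t i = 0; i < nr_chunks; i++) {
            /* Read ahead: the next entry's offset is where this chunk ends. */
            uint64_t size = table[i + 1].offset - table[i].offset;
            printf("chunk %u: offset %llu, size %llu\n",
                   (unsigned)table[i].id,
                   (unsigned long long)table[i].offset,
                   (unsigned long long)size);
        }
    }

    int main(void)
    {
        /* Two chunks plus the terminating entry. */
        struct table_entry table[] = {
            { 1, 100 },
            { 2, 300 },
            { 0, 450 },
        };
        scan_chunks(table, 2);
        return 0;
    }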
chriscool pushed a commit that referenced this pull request on Jul 20, 2020
Unify the 'chunk_ids' and 'chunk_sizes' arrays into an array of 'struct chunk_info'. This will allow more cleanups in the following patches. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
chriscool pushed a commit that referenced this pull request on Jul 20, 2020
The early part of t9100 creates an unusual "doubled" history in the "git-svn" ref. When we get to t9100.17, it looks like this: $ git log --oneline --graph git-svn [...] * efd0303 detect node change from file to directory #2 |\ * | 3e727c0 detect node change from file to directory #2 |/ * 3b00468 try a deep --rmdir with a commit |\ * | b4832d8 try a deep --rmdir with a commit |/ * f0d7bd5 import for git svn Each commit we make with "git commit" is paired with one from "git svn set-tree", with the latter as a merge of the first and its grandparent. Later, t9100.17 wants to check that "git svn fetch" gets the same trees. And it does, but just one copy of each. So it uses rev-list to get the tree of each commit and pipes it to "uniq" to drop the duplicates. Our input isn't sorted, but it will find adjacent duplicates. This works reliably because the order of commits from rev-list always shows the duplicates next to each other. For any one of those merges, we could choose to show its duplicate or the grandparent first. But barring clocks running backwards, the duplicate will always have a time equal to or greater than the grandparent. Even if equal, we break ties by showing the first-parent first, so the duplicates remain adjacent. But this would break if the timestamps stopped moving in chronological order. Normally we would rely on test_tick for this, but we have _two_ sources of time here: - "git commit" creates one commit based on GIT_COMMITTER_DATE (which respects test_tick) - the "svn set-tree" one is based on subversion, which does not have an easy way to specify a timestamp So using test_tick actually breaks the test, because now the duplicates are far in the past, and we'll show the grandparent before the duplicate. And likewise, a proposed change to set GIT_COMMITTER_DATE in all scripts will break it. We _could_ fix this by sorting before removing duplicates, but presumably it's a useful part of the test to make sure the trees appear in the same order in both spots. Likewise, we could use something like: perl -ne 'print unless $seen{$_}++' to remove duplicates without impacting the order. But that doesn't work either, because there are actually multiple (non-duplicate) commits with the same trees (we change a file mode and then change it back). So we'd actually have to de-duplicate the combination of subject and tree. Which then further throws off t9100.18, which compares the tree hashes exactly; we'd have to strip the result back down. Since this test _isn't_ buggy, the simplest thing is to just work around the proposed change by documenting our expectation that git-created commits are correctly interleaved using the current time. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
chriscool pushed a commit that referenced this pull request on Jul 20, 2020
The changed-path Bloom filter is improved using ideas from an independent implementation.

* sg/commit-graph-cleanups:
  commit-graph: simplify write_commit_graph_file() #2
  commit-graph: simplify write_commit_graph_file() #1
  commit-graph: simplify parse_commit_graph() #2
  commit-graph: simplify parse_commit_graph() #1
  commit-graph: clean up #includes
  diff.h: drop diff_tree_oid() & friends' return value
  commit-slab: add a function to deep free entries on the slab
  commit-graph-format.txt: all multi-byte numbers are in network byte order
  commit-graph: fix parsing the Chunk Lookup table
  tree-walk.c: don't match submodule entries for 'submod/anything'
chriscool pushed a commit that referenced this pull request on Jul 22, 2020
The changed-path Bloom filter is improved using ideas from an independent implementation.

* sg/commit-graph-cleanups:
  commit-graph: simplify write_commit_graph_file() #2
  commit-graph: simplify write_commit_graph_file() #1
  commit-graph: simplify parse_commit_graph() #2
  commit-graph: simplify parse_commit_graph() #1
  commit-graph: clean up #includes
  diff.h: drop diff_tree_oid() & friends' return value
  commit-slab: add a function to deep free entries on the slab
  commit-graph-format.txt: all multi-byte numbers are in network byte order
  commit-graph: fix parsing the Chunk Lookup table
  tree-walk.c: don't match submodule entries for 'submod/anything'
chriscool pushed a commit that referenced this pull request on Aug 4, 2020
The changed-path Bloom filter is improved using ideas from an independent implementation.

* sg/commit-graph-cleanups:
  commit-graph: simplify write_commit_graph_file() #2
  commit-graph: simplify write_commit_graph_file() #1
  commit-graph: simplify parse_commit_graph() #2
  commit-graph: simplify parse_commit_graph() #1
  commit-graph: clean up #includes
  diff.h: drop diff_tree_oid() & friends' return value
  commit-slab: add a function to deep free entries on the slab
  commit-graph-format.txt: all multi-byte numbers are in network byte order
  commit-graph: fix parsing the Chunk Lookup table
  tree-walk.c: don't match submodule entries for 'submod/anything'
chriscool pushed a commit that referenced this pull request on Dec 16, 2020
In demultiplex_sideband(), there are two oddities when we check an incoming packet: - if it has zero length, then we assume it's a flush packet. This means we fail to notice the difference between a real flush and a true zero-length packet that's missing its sideband designator. It's not a huge problem in practice because we'd never send a zero-length data packet (even our keepalives are otherwise-empty sideband-1 packets). But it would be nice to detect and report the error, since it's likely to cause other confusion (we think the other side flushed, but they do not). - we try to detect packets missing their designator by checking for "if (len < 1)". But this will never trigger for "len == 0"; we've already detected that and left the function before then. It _could_ detect a negative "len" parameter. But in that case, the error message is wrong. The issue is not "no sideband" but rather "eof while reading the packet". However, this can't actually be triggered in practice, because neither of the two callers uses pkt_read's GENTLE_ON_EOF flag. Which means they'd die with "the remote end hung up unexpectedly" before we even get here. So this truly is dead code. We can improve these cases by passing in a pkt-line status to the demultiplexer, and by having recv_sideband() use GENTLE_ON_EOF. This gives us two improvements: - we can now reliably detect flush packets, and will report a normal packet missing its sideband designator as an error - we'll report an eof with a more detailed "protocol error: eof while reading sideband packet", rather than the generic "the remote end hung up unexpectedly" - when we see an eof, we'll flush the sideband scratch buffer, which may provide some hints from the remote about why they hung up (though note we already flush on newlines, so it's likely that most such messages already made it through) In some sense this patch goes against fbd76cd (sideband: reverse its dependency on pkt-line, 2019-01-16), which caused the sideband code not to depend on the pkt-line code. But that commit was really just trying to deal with the circular header dependency. The two modules are conceptually interlinked, and it was just trying to keep things compiling. And indeed, there's a sticking point in this patch: because pkt-line.h includes sideband.h, we can't add the reverse include we need for the sideband code to have an "enum packet_read_status" parameter. Nor can we forward declare it, because you can't forward declare an enum in C. However, C does guarantee that enums fit in an int, so we can just use that type. One alternative would be for the callers to check themselves that they got something sane from the pkt-line code. But besides duplicating logic, this gets quite tricky. Any error condition requires flushing the sideband #2 scratch buffer, which only demultiplex_sideband() knows how to do. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
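The "enums fit in an int" remark is the whole trick that makes the workaround possible, so here is a compilable toy version of it (this is not the actual sideband/pkt-line interface; only the general shape is shown): the consumer declares the parameter as an int because C offers no way to forward-declare an enum.

    #include <stdio.h>

    /* Imagine this enum lives in a header the consumer must not include. */
    enum packet_read_status {
        PACKET_READ_EOF,
        PACKET_READ_NORMAL,
        PACKET_READ_FLUSH
    };

    /* The parameter is an int, but callers pass enum values; nothing is lost. */
    static void demultiplex(int status)
    {
        if (status == PACKET_READ_FLUSH)
            puts("flush packet");
        else if (status == PACKET_READ_EOF)
            puts("protocol error: eof while reading sideband packet");
        else
            puts("payload packet");
    }

    int main(void)
    {
        demultiplex(PACKET_READ_FLUSH);
        demultiplex(PACKET_READ_EOF);
        return 0;
    }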
chriscool pushed a commit that referenced this pull request on Aug 5, 2023
When "update-index --unresolve $path" cannot find the resolve-undo record for the path the user requested to unresolve, it stuffs the blobs from HEAD and MERGE_HEAD to stage #2 and stage #3 as a fallback. For this reason, the operation does not even start unless both "HEAD" and "MERGE_HEAD" exist. This is suboptimal in a few ways: * It does not recreate stage #1. Even though it is a correct design decision not to do so (because it is impossible to recreate in general cases, without knowing how we got there, including what merge strategy was used), it is much less useful not to have that information in the index. * It limits the "unresolve" operation only during a conflicted "git merge" and nothing else. Other operations like "rebase", "cherry-pick", and "switch -m" may result in conflicts, and the user may want to unresolve the conflict that they incorrectly resolved in order to redo the resolution, but the fallback would not kick in. * Most importantly, the entire "unresolve" operation is disabled after a conflicted merge is committed and MERGE_HEAD is removed, even though the index has perfectly usable resolve-undo records. By lazily reading the HEAD and MERGE_HEAD only when we need to go to the fallback codepath, we will allow cases where resolve-undo records are available (which is 100% of the time, unless the user is reading from an index file created by Git more than 10 years ago) to proceed even after a conflicted merge was committed, during other mergy operations that do not use MERGE_HEAD, or after the result of such mergy operations has been committed. Signed-off-by: Junio C Hamano <gitster@pobox.com>
chriscool pushed a commit that referenced this pull request on Aug 5, 2023
The helper function is responsible for inspecting the index and deciding if the path is merged, is conflicted in a way that we would want to handle, or is conflicted in a way that we cannot handle. Currently, only conflicts with both stage #2 and stage #3 are handled, but eventually we would want to be able to deal with delete-modify conflicts (i.e. only one of stages #2 and #3 exist). Streamline the implementation of the function to make it easier to extend. Signed-off-by: Junio C Hamano <gitster@pobox.com>
chriscool pushed a commit that referenced this pull request on Sep 8, 2023
When t5583-push-branches.sh was originally introduced via 425b4d7 (push: introduce '--branches' option, 2023-05-06), it was not leak-free. In fact, the test did not even run correctly until 022fbb6 (t5583: fix shebang line, 2023-05-12), but after applying that patch, we see a failure at t5583.8:

    ==2529087==ERROR: LeakSanitizer: detected memory leaks

    Direct leak of 384 byte(s) in 1 object(s) allocated from:
        #0 0x7fb536330986 in __interceptor_realloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:98
        #1 0x55e07606cbf9 in xrealloc wrapper.c:140
        #2 0x55e075fb6cb3 in prio_queue_put prio-queue.c:42
        #3 0x55e075ec81cb in get_reachable_subset commit-reach.c:917
        #4 0x55e075fe9cce in add_missing_tags remote.c:1518
        #5 0x55e075fea1e4 in match_push_refs remote.c:1665
        #6 0x55e076050a8e in transport_push transport.c:1378
        #7 0x55e075e2eb74 in push_with_options builtin/push.c:401
        #8 0x55e075e2edb0 in do_push builtin/push.c:458
        #9 0x55e075e2ff7a in cmd_push builtin/push.c:702
        #10 0x55e075d8aaf0 in run_builtin git.c:452
        #11 0x55e075d8af08 in handle_builtin git.c:706
        #12 0x55e075d8b12c in run_argv git.c:770
        #13 0x55e075d8b6a0 in cmd_main git.c:905
        #14 0x55e075e81f07 in main common-main.c:60
        #15 0x7fb5360ab6c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
        #16 0x7fb5360ab784 in __libc_start_main_impl ../csu/libc-start.c:360
        #17 0x55e075d88f40 in _start (git+0x1ff40) (BuildId: 38ad998b85a535e786129979443630d025ec2453)

    SUMMARY: LeakSanitizer: 384 byte(s) leaked in 1 allocation(s).

This leak was addressed independently via 68b5117 (commit-reach: fix memory leak in get_reachable_subset(), 2023-06-03), which makes t5583 leak-free. But t5583 was not in the tree when 68b5117 was written, and the two only met after the latter was merged back in via 693bde4 (Merge branch 'mh/commit-reach-get-reachable-plug-leak', 2023-06-20). At that point, t5583 was leak-free.

Let's mark it as such accordingly.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
chriscool pushed a commit that referenced this pull request on Jan 10, 2024
It is tempting to think of "files and directories" of the current directory as valid inputs to the add and set subcommands of git sparse-checkout. However, in non-cone mode, they often aren't and using them as potential completions leads to *many* forms of confusion: Issue #1. It provides the *wrong* files and directories. For git sparse-checkout add we always want to add files and directories not currently in our sparse checkout, which means we want file and directories not currently present in the current working tree. Providing the files and directories currently present is thus always wrong. For git sparse-checkout set we have a similar problem except in the subset of cases where we are trying to narrow our checkout to a strict subset of what we already have. That is not a very common scenario, especially since it often does not even happen to be true for the first use of the command; for years we required users to create a sparse-checkout via git sparse-checkout init git sparse-checkout set <args...> (or use a clone option that did the init step for you at clone time). The init command creates a minimal sparse-checkout with just the top-level directory present, meaning the set command has to be used to expand the checkout. Thus, only in a special and perhaps unusual cases would any of the suggestions from normal file and directory completion be appropriate. Issue #2: Suggesting patterns that lead to warnings is unfriendly. If the user specifies any regular file and omits the leading '/', then the sparse-checkout command will warn the user that their command is problematic and suggest they use a leading slash instead. Issue #3: Completion gets confused by leading '/', and provides wrong paths. Users often want to anchor their patterns to the toplevel of the repository, especially when listing individual files. There are a number of reasons for this, but notably even sparse-checkout encourages them to do so (as noted above). However, if users do so (via adding a leading '/' to their pattern), then bash completion will interpret the leading slash not as a request for a path at the toplevel of the repository, but as a request for a path at the root of the filesytem. That means at best that completion cannot help with such paths, and if it does find any completions, they are almost guaranteed to be wrong. Issue #4: Suggesting invalid patterns from subdirectories is unfriendly. There is no per-directory equivalent to .gitignore with sparse-checkouts. There is only a single worktree-global $GIT_DIR/info/sparse-checkout file. As such, paths to files must be specified relative to the toplevel of a repository. Providing suggestions of paths that are relative to the current working directory, as bash completion defaults to, is wrong when the current working directory is not the worktree toplevel directory. Issue #5: Paths with special characters will be interpreted incorrectly The entries in the sparse-checkout file are patterns, not paths. While most paths also qualify as patterns (though even in such cases it would be better for users to not use them directly but prefix them with a leading '/'), there are a variety of special characters that would need special escaping beyond the normal shell escaping: '*', '?', '\', '[', ']', and any leading '#' or '!'. If completion suggests any such paths, users will likely expect them to be treated as an exact path rather than as a pattern that might match some number of files other than 1. 
However, despite the first four issues, we can note that _if_ users are using tab completion, then they are probably trying to specify a path in the index. As such, we transform their argument into a top-level-rooted pattern that matches such a file. For example, if they type: git sparse-checkout add Make<TAB> we could "complete" to git sparse-checkout add /Makefile or, if they ran from the Documentation/technical/ subdirectory: git sparse-checkout add m<TAB> we could "complete" it to: git sparse-checkout add /Documentation/technical/multi-pack-index.txt Note in both cases I use "complete" in quotes, because we actually add characters both before and after the argument in question, so we are kind of abusing "bash completions" to be "bash completions AND beginnings". The fifth issue is a bit stickier, especially when you consider that we not only need to deal with escaping issues because of special meanings of patterns in sparse-checkout & gitignore files, but also that we need to consider escaping issues due to ls-files needing to sometimes quote or escape characters, and because the shell needs to escape some characters. The multiple interacting forms of escaping could get ugly; this patch makes no attempt to do so and simply documents that we decided to not deal with those corner cases for now but at least get the common cases right. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
chriscool pushed a commit that referenced this pull request on May 24, 2024
When reusing objects from a pack, we keep track of a set of one or more `reused_chunk`s, corresponding to sections of one or more object(s) from a source pack that we are reusing. Each chunk contains two pieces of information: - the offset of the first object in the source pack (relative to the beginning of the source pack) - the difference between that offset, and the corresponding offset in the pack we're generating The purpose of keeping track of these is so that we can patch an OFS_DELTAs that cross over a section of the reuse pack that we didn't take. For instance, consider a hypothetical pack as shown below: (chunk #2) __________... / / +--------+---------+-------------------+---------+ ... | <base> | <other> | (unused) | <delta> | ... +--------+---------+-------------------+---------+ \ / \______________/ (chunk #1) Suppose that we are sending objects "base", "other", and "delta", and that the "delta" object is stored as an OFS_DELTA, and that its base is "base". If we don't send any objects in the "(unused)" range, we can't copy the delta'd object directly, since its delta offset includes a range of the pack that we didn't copy, so we have to account for that difference when patching and reassembling the delta. In order to compute this value correctly, we need to know not only where we are in the packfile we're assembling (with `hashfile_total(f)`) but also the position of the first byte of the packfile that we are currently reusing. Currently, this works just fine, since when reusing only a single pack those two values are always identical (because verbatim reuse is the first thing pack-objects does when enabled after writing the pack header). But when reusing multiple packs which have one or more gaps, we'll need to account for these two values diverging. Together, these two allow us to compute the reused chunk's offset difference relative to the start of the reused pack, as desired. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
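A tiny arithmetic sketch of what those two recorded values buy (all numbers invented; the struct is modeled on the description above, not copied from pack-objects): once a chunk's source offset and its offset difference are known, any delta base offset that falls inside the chunk can be translated into the new pack with a single subtraction.

    #include <stdint.h>
    #include <stdio.h>

    struct reused_chunk {
        uint64_t original;   /* offset of the chunk's first object in the source pack */
        int64_t difference;  /* source offset minus offset in the pack being written */
    };

    int main(void)
    {
        /* A chunk copied from source offset 1000 to offset 400 in the new pack. */
        struct reused_chunk chunk = { 1000, 600 };

        uint64_t base_in_source = 1200;  /* delta base that lies inside the chunk */
        uint64_t base_in_output = base_in_source - (uint64_t)chunk.difference;

        printf("base at %llu in the source pack lands at %llu in the new pack\n",
               (unsigned long long)base_in_source,
               (unsigned long long)base_in_output);
        return 0;
    }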
chriscool pushed a commit that referenced this pull request on May 24, 2024
The t5309 script triggers a racy false positive with SANITIZE=leak on a multi-core system. Running with "--stress --run=6" usually fails within 10 seconds or so for me, complaining with something like: + git index-pack --fix-thin --stdin fatal: REF_DELTA at offset 46 already resolved (duplicate base 01d7713666f4de822776c7622c10f1b07de280dc?) ================================================================= ==3904583==ERROR: LeakSanitizer: detected memory leaks Direct leak of 32 byte(s) in 1 object(s) allocated from: #0 0x7fa790d01986 in __interceptor_realloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:98 #1 0x7fa790add769 in __pthread_getattr_np nptl/pthread_getattr_np.c:180 #2 0x7fa790d117c5 in __sanitizer::GetThreadStackTopAndBottom(bool, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:150 #3 0x7fa790d11957 in __sanitizer::GetThreadStackAndTls(bool, unsigned long*, unsigned long*, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:598 #4 0x7fa790d03fe8 in __lsan::ThreadStart(unsigned int, unsigned long long, __sanitizer::ThreadType) ../../../../src/libsanitizer/lsan/lsan_posix.cpp:51 #5 0x7fa790d013fd in __lsan_thread_start_func ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:440 git#6 0x7fa790adc3eb in start_thread nptl/pthread_create.c:444 #7 0x7fa790b5ca5b in clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81 SUMMARY: LeakSanitizer: 32 byte(s) leaked in 1 allocation(s). Aborted What happens is this: 0. We construct a bogus pack with a duplicate object in it and trigger index-pack. 1. We spawn a bunch of worker threads to resolve deltas (on my system it is 16 threads). 2. One of the threads sees the duplicate object and bails by calling exit(), taking down all of the threads. This is expected and is the point of the test. 3. At the time exit() is called, we may still be spawning threads from the main process via pthread_create(). LSan hooks thread creation to update its book-keeping; it has to know where each thread's stack is (so it can find entry points for reachable memory). So it calls pthread_getattr_np() to get information about the new thread. That may allocate memory that must be freed with a matching call to pthread_attr_destroy(). Probably LSan does that immediately, but if you're unlucky enough, the exit() will happen while it's between those two calls, and the allocated pthread_attr_t appears as a leak. This isn't a real leak. It's not even in our code, but rather in the LSan instrumentation code. So we could just ignore it. But the false positive can cause people to waste time tracking it down. It's possibly something that LSan could protect against (e.g., cover the getattr/destroy pair with a mutex, and then in the final post-exit() check for leaks try to take the same mutex). But I don't know enough about LSan to say if that's a reasonable approach or not (or if my analysis is even completely correct). In the meantime, it's pretty easy to avoid the race by making creation of the worker threads "atomic". That is, we'll spawn all of them before letting any of them start to work. That's easy to do because we already have a work_lock() mutex for handing out that work. If the main process takes it, then all of the threads will immediately block until we've finished spawning and released it. This shouldn't make any practical difference for non-LSan runs. 
The thread spawning is quick, and could happen before any worker thread gets scheduled anyway. Probably other spots that use threads are subject to the same issues. But since we have to manually insert locking (and since this really is kind of a hack), let's not bother with them unless somebody experiences a similar racy false-positive in practice. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
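A stand-alone pthreads illustration of the "spawn all workers while holding the work lock" idea (not the index-pack code itself; there the existing work_lock() mutex mentioned above plays the role of the plain mutex below). Build with cc -pthread.

    #include <pthread.h>
    #include <stdio.h>

    #define NR_THREADS 4

    static pthread_mutex_t work_lock = PTHREAD_MUTEX_INITIALIZER;

    static void *worker(void *arg)
    {
        int id = *(int *)arg;

        /* Blocks until the main thread has finished spawning everyone. */
        pthread_mutex_lock(&work_lock);
        pthread_mutex_unlock(&work_lock);

        printf("worker %d starts\n", id);
        return NULL;
    }

    int main(void)
    {
        pthread_t threads[NR_THREADS];
        int ids[NR_THREADS];

        pthread_mutex_lock(&work_lock);   /* make thread creation "atomic" */
        for (int i = 0; i < NR_THREADS; i++) {
            ids[i] = i;
            pthread_create(&threads[i], NULL, worker, &ids[i]);
        }
        pthread_mutex_unlock(&work_lock); /* now let the workers proceed */

        for (int i = 0; i < NR_THREADS; i++)
            pthread_join(threads[i], NULL);
        return 0;
    }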
chriscool pushed a commit that referenced this pull request on Jul 31, 2024
Commit d6a8c58 (midx-write.c: support reading an existing MIDX with `packs_to_include`, 2024-05-29) changed the MIDX generation machinery to support reading from an existing MIDX when writing a new one. Unfortunately, the rest of the MIDX generation machinery is not prepared to deal with such a change. For instance, the function responsible for adding to the object ID fanout table from a MIDX source (midx_fanout_add_midx_fanout()) will gladly add objects from an existing MIDX for some fanout level regardless of whether or not those objects came from packs that are to be included in the subsequent MIDX write. This results in broken pseudo-pack object order (leading to incorrect object traversal results) and segmentation faults, like so (generated by running the added test prior to the changes in midx-write.c): #0 0x000055ee31393f47 in midx_pack_order (ctx=0x7ffdde205c70) at midx-write.c:590 #1 0x000055ee31395a69 in write_midx_internal (object_dir=0x55ee32570440 ".git/objects", packs_to_include=0x7ffdde205e20, packs_to_drop=0x0, preferred_pack_name=0x0, refs_snapshot=0x0, flags=15) at midx-write.c:1171 #2 0x000055ee31395f38 in write_midx_file_only (object_dir=0x55ee32570440 ".git/objects", packs_to_include=0x7ffdde205e20, preferred_pack_name=0x0, refs_snapshot=0x0, flags=15) at midx-write.c:1274 [...] In stack frame #0, the code on midx-write.c:590 is using the new pack ID corresponding to some object which was added from the existing MIDX. Importantly, the pack from which that object was selected in the existing MIDX does not appear in the new MIDX as it was excluded via `--stdin-packs`. In this instance, the pack in question had pack ID "1" in the existing MIDX, but since it was excluded from the new MIDX, we never filled in that entry in the pack_perm table, resulting in: (gdb) p *ctx->pack_perm@2 $1 = {0, 1515870810} Which is what causes the segfault above when we try and read: struct pack_info *pack = &ctx->info[ctx->pack_perm[i]]; if (pack->bitmap_pos == BITMAP_POS_UNKNOWN) pack->bitmap_pos = 0; Fundamentally, we should be able to read information from an existing MIDX when generating a new one. But in practice the midx-write.c code assumes that we won't run into issues like the above with incongruent pack IDs, and often makes those assumptions in extremely subtle and fragile ways. Instead, let's avoid reading from an existing MIDX altogether, and stick with the pre-d6a8c58675 implementation. Harden against any regressions in this area by adding a test which demonstrates these issues. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
chriscool pushed a commit that referenced this pull request on Jul 31, 2024
When performing multi-pack reuse, reuse_partial_packfile_from_bitmap() is responsible for generating an array of bitmapped_pack structs from which to perform reuse. In the multi-pack case, we loop over the MIDXs packs and copy the result of calling `nth_bitmapped_pack()` to construct the list of reusable paths. But we may also want to do pack-reuse over a single pack, either because we only had one pack to perform reuse over (in the case of single-pack bitmaps), or because we explicitly asked to do single pack reuse even with a MIDX[^1]. When this is the case, the array we generate of reusable packs contains only a single element, which is either (a) the pack attached to the single-pack bitmap, or (b) the MIDX's preferred pack. In 795006f (pack-bitmap: gracefully handle missing BTMP chunks, 2024-04-15), we refactored the reuse_partial_packfile_from_bitmap() function and stopped assigning the pack_int_id field when reusing only the MIDX's preferred pack. This results in an uninitialized read down in try_partial_reuse() like so: ==7474==WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x55c5cd191dde in try_partial_reuse pack-bitmap.c:1887:8 #1 0x55c5cd191dde in reuse_partial_packfile_from_bitmap_1 pack-bitmap.c:2001:8 #2 0x55c5cd191dde in reuse_partial_packfile_from_bitmap pack-bitmap.c:2105:3 #3 0x55c5cce0bd0e in get_object_list_from_bitmap builtin/pack-objects.c:4043:3 #4 0x55c5cce0bd0e in get_object_list builtin/pack-objects.c:4156:27 #5 0x55c5cce0bd0e in cmd_pack_objects builtin/pack-objects.c:4596:3 git#6 0x55c5ccc8fac8 in run_builtin git.c:474:11 which happens when try_partial_reuse() tries to call midx_pair_to_pack_pos() when it tries to reject cross-pack deltas. Avoid the uninitialized read by ensuring that the pack_int_id field is set in the single-pack reuse case by setting it to either the MIDX preferred pack's pack_int_id, or '-1', in the case of single-pack bitmaps. In the latter case, we never read the pack_int_id field, so the choice of '-1' is intentional as a "garbage in, garbage out" measure. Guard against further regressions in this area by adding a test which ensures that we do not throw out deltas from the preferred pack as "cross-pack" due to an uninitialized pack_int_id. [^1]: This can happen for a couple of reasons, either because the repository is configured with 'pack.allowPackReuse=(true|single)', or because the MIDX was generated prior to the introduction of the BTMP chunk, which contains information necessary to perform multi-pack reuse. Reported-by: Kyle Lippincott <spectral@google.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
chriscool pushed a commit that referenced this pull request on Jul 31, 2024
Memory sanitizer (msan) is detecting a use of an uninitialized variable (`size`) in `read_attr_from_index`:

    ==2268==WARNING: MemorySanitizer: use-of-uninitialized-value
        #0 0x5651f3416504 in read_attr_from_index git/attr.c:868:11
        #1 0x5651f3415530 in read_attr git/attr.c
        #2 0x5651f3413d74 in bootstrap_attr_stack git/attr.c:968:6
        #3 0x5651f3413d74 in prepare_attr_stack git/attr.c:1004:2
        #4 0x5651f3413d74 in collect_some_attrs git/attr.c:1199:2
        #5 0x5651f3413144 in git_check_attr git/attr.c:1345:2
        #6 0x5651f34728da in convert_attrs git/convert.c:1320:2
        #7 0x5651f3473425 in would_convert_to_git_filter_fd git/convert.c:1373:2
        #8 0x5651f357a35e in index_fd git/object-file.c:2630:34
        #9 0x5651f357aa15 in index_path git/object-file.c:2657:7
        #10 0x5651f35db9d9 in add_to_index git/read-cache.c:766:7
        #11 0x5651f35dc170 in add_file_to_index git/read-cache.c:799:9
        #12 0x5651f321f9b2 in add_files git/builtin/add.c:346:7
        #13 0x5651f321f9b2 in cmd_add git/builtin/add.c:565:18
        #14 0x5651f321d327 in run_builtin git/git.c:474:11
        #15 0x5651f321bc9e in handle_builtin git/git.c:729:3
        #16 0x5651f321a792 in run_argv git/git.c:793:4
        #17 0x5651f321a792 in cmd_main git/git.c:928:19
        #18 0x5651f33dde1f in main git/common-main.c:62:11

The issue exists because `size` is an output parameter from `read_blob_data_from_index`, but it's only modified if `read_blob_data_from_index` returns non-NULL. The read of `size` when calling `read_attr_from_buf` unconditionally may read from an uninitialized value. `read_attr_from_buf` checks that `buf` is non-NULL before reading from `size`, but by then it's already too late: the uninitialized read will have happened already. Furthermore, there's no guarantee that the compiler won't reorder things so that it checks `size` before checking `!buf`.

Make the call to `read_attr_from_buf` conditional on `buf` being non-NULL, ensuring that `size` is not read if it's never set.

Signed-off-by: Kyle Lippincott <spectral@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
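Reduced to a stand-alone example, the bug and the fix look like this (lookup() below is a hypothetical stand-in for read_blob_data_from_index(); it only sets its size output when it returns a buffer):

    #include <stddef.h>
    #include <stdio.h>
    #include <string.h>

    /* Hypothetical producer: sets *size only when it returns non-NULL. */
    static const char *lookup(const char *key, size_t *size)
    {
        if (!strcmp(key, "present")) {
            *size = 5;
            return "hello";
        }
        return NULL;  /* note: *size is left untouched */
    }

    int main(void)
    {
        size_t size;  /* uninitialized, like 'size' in the report above */
        const char *buf = lookup("missing", &size);

        /*
         * Passing 'size' to a callee unconditionally would read the
         * uninitialized value even if the callee checks 'buf' first.
         * Guarding the whole call on 'buf' avoids the read entirely.
         */
        if (buf)
            printf("got %zu bytes\n", size);
        else
            printf("nothing to read\n");
        return 0;
    }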
chriscool pushed a commit that referenced this pull request on Jul 31, 2024
Signed-off-by: Junio C Hamano <gitster@pobox.com>
chriscool pushed a commit that referenced this pull request on Dec 5, 2024
It was recently reported that concurrent reads and writes may cause the reftable backend to segfault. The root cause of this is that we do not properly keep track of reftable readers across reloads.

Suppose that you have a reftable iterator and then decide to reload the stack while iterating through the iterator. When the stack has been rewritten since we have created the iterator, then we would end up discarding a subset of readers that may still be in use by the iterator. The consequence is that we now try to reference deallocated memory, which of course segfaults.

One way to trigger this is in t5616, where some background maintenance jobs have been leaking from one test into another. This leads to stack traces like the following one:

    + git -c protocol.version=0 -C pc1 fetch --filter=blob:limit=29999 --refetch origin
    AddressSanitizer:DEADLYSIGNAL
    =================================================================
    ==657994==ERROR: AddressSanitizer: SEGV on unknown address 0x7fa0f0ec6089 (pc 0x55f23e52ddf9 bp 0x7ffe7bfa1700 sp 0x7ffe7bfa1700 T0)
    ==657994==The signal is caused by a READ memory access.
        #0 0x55f23e52ddf9 in get_var_int reftable/record.c:29
        #1 0x55f23e53295e in reftable_decode_keylen reftable/record.c:170
        #2 0x55f23e532cc0 in reftable_decode_key reftable/record.c:194
        #3 0x55f23e54e72e in block_iter_next reftable/block.c:398
        #4 0x55f23e5573dc in table_iter_next_in_block reftable/reader.c:240
        #5 0x55f23e5573dc in table_iter_next reftable/reader.c:355
        #6 0x55f23e5573dc in table_iter_next reftable/reader.c:339
        #7 0x55f23e551283 in merged_iter_advance_subiter reftable/merged.c:69
        #8 0x55f23e55169e in merged_iter_next_entry reftable/merged.c:123
        #9 0x55f23e55169e in merged_iter_next_void reftable/merged.c:172
        #10 0x55f23e537625 in reftable_iterator_next_ref reftable/generic.c:175
        #11 0x55f23e2cf9c6 in reftable_ref_iterator_advance refs/reftable-backend.c:464
        #12 0x55f23e2d996e in ref_iterator_advance refs/iterator.c:13
        #13 0x55f23e2d996e in do_for_each_ref_iterator refs/iterator.c:452
        #14 0x55f23dca6767 in get_ref_map builtin/fetch.c:623
        #15 0x55f23dca6767 in do_fetch builtin/fetch.c:1659
        #16 0x55f23dca6767 in fetch_one builtin/fetch.c:2133
        #17 0x55f23dca6767 in cmd_fetch builtin/fetch.c:2432
        #18 0x55f23dba7764 in run_builtin git.c:484
        #19 0x55f23dba7764 in handle_builtin git.c:741
        #20 0x55f23dbab61e in run_argv git.c:805
        #21 0x55f23dbab61e in cmd_main git.c:1000
        #22 0x55f23dba4781 in main common-main.c:64
        #23 0x7fa0f063fc89 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
        #24 0x7fa0f063fd44 in __libc_start_main_impl ../csu/libc-start.c:360
        #25 0x55f23dba6ad0 in _start (git+0xadfad0) (BuildId: 803b2b7f59beb03d7849fb8294a8e2145dd4aa27)

While it is somewhat awkward that the maintenance processes survive tests in the first place, it is totally expected that reftables should work alright with concurrent writers. Seemingly they don't.

The only underlying resource that we need to care about in this context is the reftable reader, which is responsible for reading a single table from disk. These readers get discarded immediately (unless reused) when calling `reftable_stack_reload()`, which is wrong. We can only close them once we know that there are no iterators using them anymore.

Prepare for a fix by converting the reftable readers to be refcounted.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
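A generic sketch of the refcounting direction described above (names and layout invented; this is not the reftable API): the reader is closed by whichever side drops the last reference, be it the stack on reload or an iterator that is still walking the table.

    #include <stdio.h>
    #include <stdlib.h>

    struct reader {
        int refcount;
        const char *table_name;
    };

    static struct reader *reader_new(const char *name)
    {
        struct reader *r = malloc(sizeof(*r));
        if (!r)
            return NULL;
        r->refcount = 1;          /* owned by the stack */
        r->table_name = name;
        return r;
    }

    static void reader_incref(struct reader *r)
    {
        r->refcount++;
    }

    static void reader_decref(struct reader *r)
    {
        if (!--r->refcount) {
            printf("closing %s\n", r->table_name);
            free(r);
        }
    }

    int main(void)
    {
        struct reader *r = reader_new("tables/0001.ref");
        if (!r)
            return 1;

        reader_incref(r);  /* an iterator takes its own reference */
        reader_decref(r);  /* stack reload: reader stays alive for the iterator */
        reader_decref(r);  /* iterator finishes: only now is the reader closed */
        return 0;
    }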
chriscool pushed a commit that referenced this pull request on Dec 5, 2024
This one is a little bit more curious. In t6112, we have a test that exercises the `git rev-list --filter` option with invalid filters. We execute git-rev-list(1) via `test_must_fail`, which means that we check for leaks even though Git exits with an error code. This causes the following leak:

    Direct leak of 27 byte(s) in 1 object(s) allocated from:
        #0 0x5555555e6946 in realloc.part.0 lsan_interceptors.cpp.o
        #1 0x5555558fb4b6 in xrealloc wrapper.c:137:8
        #2 0x5555558b6e06 in strbuf_grow strbuf.c:112:2
        #3 0x5555558b7550 in strbuf_add strbuf.c:311:2
        #4 0x5555557c1a88 in strbuf_addstr strbuf.h:310:2
        #5 0x5555557c1d4c in parse_list_objects_filter list-objects-filter-options.c:261:3
        #6 0x555555885ead in handle_revision_pseudo_opt revision.c:2899:3
        #7 0x555555884e20 in setup_revisions revision.c:3014:11
        #8 0x5555556c4b42 in cmd_rev_list builtin/rev-list.c:588:9
        #9 0x5555555ec5e3 in run_builtin git.c:483:11
        #10 0x5555555eb1e4 in handle_builtin git.c:749:13
        #11 0x5555555ec001 in run_argv git.c:819:4
        #12 0x5555555eaf94 in cmd_main git.c:954:19
        #13 0x5555556fd569 in main common-main.c:64:11
        #14 0x7ffff7ca714d in __libc_start_call_main (.../lib/libc.so.6+0x2a14d)
        #15 0x7ffff7ca7208 in __libc_start_main@GLIBC_2.2.5 (.../libc.so.6+0x2a208)
        #16 0x5555555ad064 in _start (git+0x59064)

This leak is valid, as we call `die()` and do not clean up the memory at all. But what's curious is that this is the only leak reported, because we don't clean up any other allocated memory, either, and I have no idea why the leak sanitizer treats this buffer specially.

In any case, we can work around the leak by shuffling things around a bit. Instead of calling `gently_parse_list_objects_filter()` and dying after we have modified the filter spec, we simply do so beforehand. Like this we don't allocate the buffer in the error case, which makes the reported leak go away. It's not pretty, but it manages to make t6112 leak free.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
chriscool pushed a commit that referenced this pull request on Dec 5, 2024
When running a dir-diff command that produces no diff, variables `wt_modified` and `tmp_modified` are used while uninitialized, causing: $ /home/smarchi/src/git/git-difftool --dir-diff master free(): invalid pointer [1] 334004 IOT instruction (core dumped) /home/smarchi/src/git/git-difftool --dir-diff master $ valgrind --track-origins=yes /home/smarchi/src/git/git-difftool --dir-diff master ... Invalid free() / delete / delete[] / realloc() at 0x48478EF: free (vg_replace_malloc.c:989) by 0x422CAC: hashmap_clear_ (hashmap.c:208) by 0x283830: run_dir_diff (difftool.c:667) by 0x284103: cmd_difftool (difftool.c:801) by 0x238E0F: run_builtin (git.c:484) by 0x2392B9: handle_builtin (git.c:750) by 0x2399BC: cmd_main (git.c:921) by 0x356FEF: main (common-main.c:64) Address 0x1ffefff180 is on thread 1's stack in frame #2, created by run_dir_diff (difftool.c:358) ... If taking any `goto finish` path before these variables are initialized, `hashmap_clear_and_free()` operates on uninitialized data, sometimes causing a crash. This regression was introduced in 7f795a1 (builtin/difftool: plug several trivial memory leaks, 2024-09-26). Fix it by initializing those variables with the `HASHMAP_INIT` macro. Add a test comparing the main branch to itself, resulting in no diff. Signed-off-by: Simon Marchi <simon.marchi@efficios.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
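The general rule the fix follows, shown with plain pointers instead of Git's hashmaps (HASHMAP_INIT plays the same role for a hashmap that the NULL initializers play here): everything the shared cleanup label touches must be initialized before the first early exit.

    #include <stdlib.h>
    #include <string.h>

    static int run(int fail_early)
    {
        int ret = -1;
        /* Initialized up front, so the cleanup label is always safe to run. */
        char *a = NULL, *b = NULL;

        if (fail_early)
            goto finish;  /* with uninitialized pointers, free() below would be undefined */

        a = strdup("worktree");
        b = strdup("tmpdir");
        ret = 0;

    finish:
        free(a);
        free(b);
        return ret;
    }

    int main(void)
    {
        run(1);  /* early-exit path: cleanup only sees the NULL initializers */
        run(0);  /* normal path: cleanup frees the real allocations */
        return 0;
    }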
chriscool pushed a commit that referenced this pull request on Dec 5, 2024
When running t5601 with the leak checker enabled we can see a hang in our CI systems. This hang seems to be system-specific, as I cannot reproduce it on my own machine.

As it turns out, the issue is in those testcases that exercise cloning of `~repo`-style paths. All of the testcases that hang eventually end up interpreting "repo" as the username and will call getpwnam(3p) with that username. That should of course be fine, and getpwnam(3p) should just return an error. But instead, the leak sanitizer seems to be recursing while handling a call to `free()` in the NSS modules:

#0 0x00007ffff7fd98d5 in _dl_update_slotinfo (req_modid=1, new_gen=2) at ../elf/dl-tls.c:720
#1 0x00007ffff7fd9ac4 in update_get_addr (ti=0x7ffff7a91d80, gen=<optimized out>) at ../elf/dl-tls.c:916
#2 0x00007ffff7fdc85c in __tls_get_addr () at ../sysdeps/x86_64/tls_get_addr.S:55
#3 0x00007ffff7a27e04 in __lsan::GetAllocatorCache () at ../../../../src/libsanitizer/lsan/lsan_linux.cpp:27
#4 0x00007ffff7a2b33a in __lsan::Deallocate (p=0x0) at ../../../../src/libsanitizer/lsan/lsan_allocator.cpp:127
#5 __lsan::lsan_free (p=0x0) at ../../../../src/libsanitizer/lsan/lsan_allocator.cpp:220
...
#261505 0x00007ffff7fd99f2 in free (ptr=<optimized out>) at ../include/rtld-malloc.h:50
#261506 _dl_update_slotinfo (req_modid=1, new_gen=2) at ../elf/dl-tls.c:822
#261507 0x00007ffff7fd9ac4 in update_get_addr (ti=0x7ffff7a91d80, gen=<optimized out>) at ../elf/dl-tls.c:916
#261508 0x00007ffff7fdc85c in __tls_get_addr () at ../sysdeps/x86_64/tls_get_addr.S:55
#261509 0x00007ffff7a27e04 in __lsan::GetAllocatorCache () at ../../../../src/libsanitizer/lsan/lsan_linux.cpp:27
#261510 0x00007ffff7a2b33a in __lsan::Deallocate (p=0x5020000001e0) at ../../../../src/libsanitizer/lsan/lsan_allocator.cpp:127
#261511 __lsan::lsan_free (p=0x5020000001e0) at ../../../../src/libsanitizer/lsan/lsan_allocator.cpp:220
#261512 0x00007ffff793da25 in module_load (module=0x515000000280) at ./nss/nss_module.c:188
#261513 0x00007ffff793dee5 in __nss_module_load (module=0x515000000280) at ./nss/nss_module.c:302
#261514 __nss_module_get_function (module=0x515000000280, name=name@entry=0x7ffff79b9128 "getpwnam_r") at ./nss/nss_module.c:328
#261515 0x00007ffff793e741 in __GI___nss_lookup_function (fct_name=<optimized out>, ni=<optimized out>) at ./nss/nsswitch.c:137
#261516 __GI___nss_next2 (ni=ni@entry=0x7fffffffa458, fct_name=fct_name@entry=0x7ffff79b9128 "getpwnam_r", fct2_name=fct2_name@entry=0x0, fctp=fctp@entry=0x7fffffffa460, status=status@entry=0, all_values=all_values@entry=0) at ./nss/nsswitch.c:120
#261517 0x00007ffff794c6a7 in __getpwnam_r (name=name@entry=0x501000000060 "repo", resbuf=resbuf@entry=0x7ffff79fb320 <resbuf>, buffer=<optimized out>, buflen=buflen@entry=1024, result=result@entry=0x7fffffffa4b0) at ../nss/getXXbyYY_r.c:343
#261518 0x00007ffff794c4d8 in getpwnam (name=0x501000000060 "repo") at ../nss/getXXbyYY.c:140
#261519 0x00005555557e37ff in getpw_str (username=0x5020000001a1 "repo", len=4) at path.c:613
#261520 0x00005555557e3937 in interpolate_path (path=0x5020000001a0 "~repo", real_home=0) at path.c:654
#261521 0x00005555557e3aea in enter_repo (path=0x501000000040 "~repo", strict=0) at path.c:718
#261522 0x000055555568f0ba in cmd_upload_pack (argc=1, argv=0x502000000100, prefix=0x0, repo=0x0) at builtin/upload-pack.c:57
#261523 0x0000555555575ba8 in run_builtin (p=0x555555a20c98 <commands+3192>, argc=2, argv=0x502000000100, repo=0x555555a53b20 <the_repo>) at git.c:481
#261524 0x0000555555576067 in handle_builtin (args=0x7fffffffaab0) at git.c:742
#261525 0x000055555557678d in cmd_main (argc=2, argv=0x7fffffffac58) at git.c:912
#261526 0x00005555556963cd in main (argc=2, argv=0x7fffffffac58) at common-main.c:64

Note that this stack is more than 260000 function calls deep. Run under the debugger this will eventually segfault, but in our CI systems it seems like this just hangs forever.

I assume that this is a bug either in the leak sanitizer or in glibc, as I cannot reproduce it on my machine. In any case, let's work around the bug for now by marking those tests with the "!SANITIZE_LEAK" prereq.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
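For readers unfamiliar with that code path, the sketch below shows in plain C roughly how a `~user` path ends up in getpwnam(3p). All names here are invented for illustration; this is not Git's actual `interpolate_path()`/`getpw_str()` code, only a minimal stand-in for the lookup that the recursing `free()` above sits beneath.

#include <pwd.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical sketch: expand a "~user" path by asking the passwd
 * database for that user's home directory.
 */
static int expand_tilde_user(const char *path, char *out, size_t outlen)
{
	const char *slash = strchr(path + 1, '/');
	size_t ulen = slash ? (size_t)(slash - path - 1) : strlen(path + 1);
	char user[256];
	struct passwd *pw;

	if (ulen >= sizeof(user))
		return -1;
	memcpy(user, path + 1, ulen);
	user[ulen] = '\0';

	pw = getpwnam(user);	/* for "~repo" this should simply return NULL */
	if (!pw)
		return -1;	/* unknown user: caller keeps the path as-is */

	snprintf(out, outlen, "%s%s", pw->pw_dir, slash ? slash : "");
	return 0;
}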
chriscool
pushed a commit
that referenced
this pull request
Jan 27, 2025
There's a race with LSan when spawning threads and one of the threads calls die(). We worked around one such problem with index-pack in the previous commit, but it exists in git-grep, too. You can see it with:

make SANITIZE=leak THREAD_BARRIER_PTHREAD=YesOnLinux
cd t
./t0003-attributes.sh --stress

which fails pretty quickly with:

==git==4096424==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 32 byte(s) in 1 object(s) allocated from:
#0 0x7f906de14556 in realloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:98
#1 0x7f906dc9d2c1 in __pthread_getattr_np nptl/pthread_getattr_np.c:180
#2 0x7f906de2500d in __sanitizer::GetThreadStackTopAndBottom(bool, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:150
#3 0x7f906de25187 in __sanitizer::GetThreadStackAndTls(bool, unsigned long*, unsigned long*, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:614
#4 0x7f906de17d18 in __lsan::ThreadStart(unsigned int, unsigned long long, __sanitizer::ThreadType) ../../../../src/libsanitizer/lsan/lsan_posix.cpp:53
#5 0x7f906de143a9 in ThreadStartFunc<false> ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:431
#6 0x7f906dc9bf51 in start_thread nptl/pthread_create.c:447
#7 0x7f906dd1a677 in __clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78

As with the previous commit, we can fix this by inserting a barrier that makes sure all threads have finished their setup before continuing. But there's one twist in this case: the thread which calls die() is not one of the worker threads, but the main thread itself! So we need the main thread to wait in the barrier, too, until all threads have gotten to it. And thus we initialize the barrier for num_threads+1, to account for all of the worker threads plus the main one.

If we then test as above, t0003 should run indefinitely.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
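A minimal, self-contained sketch of the barrier pattern described in that commit message (plain pthreads, invented names; not git-grep's actual code): the barrier is sized for num_threads + 1 so the main thread waits alongside the workers until every thread has finished LSan's per-thread setup.

#include <pthread.h>
#include <stdlib.h>

static pthread_barrier_t start_barrier;

static void *worker_fn(void *arg)
{
	(void)arg;
	/* Wait until every thread, including main, has finished setup. */
	pthread_barrier_wait(&start_barrier);
	/* ... actual work would go here ... */
	return NULL;
}

static void run_workers(int num_threads)
{
	pthread_t *threads = calloc(num_threads, sizeof(*threads));

	/* +1 accounts for the main thread waiting in the barrier, too. */
	pthread_barrier_init(&start_barrier, NULL, num_threads + 1);

	for (int i = 0; i < num_threads; i++)
		pthread_create(&threads[i], NULL, worker_fn, NULL);

	/*
	 * Only after this returns is it safe for the main thread to bail
	 * out; by then all workers have completed their thread setup.
	 */
	pthread_barrier_wait(&start_barrier);

	for (int i = 0; i < num_threads; i++)
		pthread_join(threads[i], NULL);

	pthread_barrier_destroy(&start_barrier);
	free(threads);
}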
chriscool
pushed a commit
that referenced
this pull request
Jan 27, 2025
In 1b9e9be (csum-file.c: use unsafe SHA-1 implementation when available, 2024-09-26) we have converted our `struct hashfile` to use the unsafe SHA1 backend, which results in a significant speedup. One needs to be careful with how to use that structure now though because callers need to consistently use either the safe or unsafe variants of SHA1, as otherwise one can easily trigger corruption.

As it turns out, we have one inconsistent usage in our tree because we directly initialize `struct hashfile_checkpoint::ctx` with the safe variant of SHA1, but end up writing to that context with the unsafe ones. This went unnoticed so far because our CI systems do not exercise different hash functions for these two backends, and consequently safe and unsafe variants are equivalent. But when using SHA1DC as safe and OpenSSL as unsafe backend this leads to a crash in t1050:

++ git -c core.compression=0 add large1
AddressSanitizer:DEADLYSIGNAL
=================================================================
==1367==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000040 (pc 0x7ffff7a01a99 bp 0x507000000db0 sp 0x7fffffff5690 T0)
==1367==The signal is caused by a READ memory access.
==1367==Hint: address points to the zero page.
#0 0x7ffff7a01a99 in EVP_MD_CTX_copy_ex (/nix/store/h1ydpxkw9qhjdxjpic1pdc2nirggyy6f-openssl-3.3.2/lib/libcrypto.so.3+0x201a99) (BuildId: 41746a580d39075fc85e8c8065b6c07fb34e97d4)
#1 0x555555ddde56 in openssl_SHA1_Clone ../sha1/openssl.h:40:2
#2 0x555555dce2fc in git_hash_sha1_clone_unsafe ../object-file.c:123:2
#3 0x555555c2d5f8 in hashfile_checkpoint ../csum-file.c:211:2
#4 0x555555b9905d in deflate_blob_to_pack ../bulk-checkin.c:286:4
#5 0x555555b98ae9 in index_blob_bulk_checkin ../bulk-checkin.c:362:15
#6 0x555555ddab62 in index_blob_stream ../object-file.c:2756:9
#7 0x555555dda420 in index_fd ../object-file.c:2778:9
#8 0x555555ddad76 in index_path ../object-file.c:2796:7
#9 0x555555e947f3 in add_to_index ../read-cache.c:771:7
#10 0x555555e954a4 in add_file_to_index ../read-cache.c:804:9
#11 0x5555558b5c39 in add_files ../builtin/add.c:355:7
#12 0x5555558b412e in cmd_add ../builtin/add.c:578:18
#13 0x555555b1f493 in run_builtin ../git.c:480:11
#14 0x555555b1bfef in handle_builtin ../git.c:740:9
#15 0x555555b1e6f4 in run_argv ../git.c:807:4
#16 0x555555b1b87a in cmd_main ../git.c:947:19
#17 0x5555561649e6 in main ../common-main.c:64:11
#18 0x7ffff742a1fb in __libc_start_call_main (/nix/store/65h17wjrrlsj2rj540igylrx7fqcd6vq-glibc-2.40-36/lib/libc.so.6+0x2a1fb) (BuildId: bf320110569c8ec2425e9a0c5e4eb7e97f1fb6e4)
#19 0x7ffff742a2b8 in __libc_start_main@GLIBC_2.2.5 (/nix/store/65h17wjrrlsj2rj540igylrx7fqcd6vq-glibc-2.40-36/lib/libc.so.6+0x2a2b8) (BuildId: bf320110569c8ec2425e9a0c5e4eb7e97f1fb6e4)
#20 0x555555772c84 in _start (git+0x21ec84)

==1367==Register values:
rax = 0x0000511000001080  rbx = 0x0000000000000000  rcx = 0x000000000000000c  rdx = 0x0000000000000000
rdi = 0x0000000000000000  rsi = 0x0000507000000db0  rbp = 0x0000507000000db0  rsp = 0x00007fffffff5690
r8  = 0x0000000000000000  r9  = 0x0000000000000000  r10 = 0x0000000000000000  r11 = 0x00007ffff7a01a30
r12 = 0x0000000000000000  r13 = 0x00007fffffff6b38  r14 = 0x00007ffff7ffd000  r15 = 0x00005555563b9910
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/nix/store/h1ydpxkw9qhjdxjpic1pdc2nirggyy6f-openssl-3.3.2/lib/libcrypto.so.3+0x201a99) (BuildId: 41746a580d39075fc85e8c8065b6c07fb34e97d4) in EVP_MD_CTX_copy_ex
==1367==ABORTING
./test-lib.sh: line 1023: 1367 Aborted git $config add large1
error: last command exited with $?=134
not ok 4 - add with -c core.compression=0

Fix the issue by using the unsafe variant instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
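To illustrate why mixing backends crashes, here is a self-contained toy example with made-up context types; it is not Git's hashfile code, only a sketch of the failure shape. A context set up by the "safe" backend does not contain the state the "unsafe" backend's clone operation expects, so the clone dereferences a pointer that was never initialized, much like the EVP_MD_CTX_copy_ex crash above.

#include <stdlib.h>
#include <string.h>

union hash_ctx {
	struct { unsigned int state[5]; } safe;   /* e.g. SHA1DC-style inline state */
	struct { unsigned char *impl; } unsafe;   /* e.g. OpenSSL-style heap object */
};

static void safe_init(union hash_ctx *ctx)
{
	memset(&ctx->safe, 0, sizeof(ctx->safe)); /* leaves unsafe.impl looking like NULL */
}

static void unsafe_init(union hash_ctx *ctx)
{
	ctx->unsafe.impl = calloc(1, 64);
}

static void unsafe_clone(union hash_ctx *dst, const union hash_ctx *src)
{
	dst->unsafe.impl = calloc(1, 64);
	/* Crashes if `src` was set up by safe_init(): impl is NULL then. */
	memcpy(dst->unsafe.impl, src->unsafe.impl, 64);
}

int main(void)
{
	union hash_ctx checkpoint, clone;

	/*
	 * Buggy version: safe_init(&checkpoint); -- unsafe_clone() would
	 * then dereference NULL, just like the SEGV shown above.
	 */
	unsafe_init(&checkpoint);	/* the fix: one backend throughout */
	unsafe_clone(&clone, &checkpoint);

	free(clone.unsafe.impl);
	free(checkpoint.unsafe.impl);
	return 0;
}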
chriscool
pushed a commit
that referenced
this pull request
Jan 27, 2025
Same as with the preceding commit, git-fast-import(1) is using the safe variant to initialize a hashfile checkpoint. This leads to a segfault when passing the checkpoint into the hashfile subsystem because it would use the unsafe variants instead:

++ git --git-dir=R/.git fast-import --big-file-threshold=1
AddressSanitizer:DEADLYSIGNAL
=================================================================
==577126==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000040 (pc 0x7ffff7a01a99 bp 0x5070000009c0 sp 0x7fffffff5b30 T0)
==577126==The signal is caused by a READ memory access.
==577126==Hint: address points to the zero page.
#0 0x7ffff7a01a99 in EVP_MD_CTX_copy_ex (/nix/store/h1ydpxkw9qhjdxjpic1pdc2nirggyy6f-openssl-3.3.2/lib/libcrypto.so.3+0x201a99) (BuildId: 41746a580d39075fc85e8c8065b6c07fb34e97d4)
#1 0x555555ddde56 in openssl_SHA1_Clone ../sha1/openssl.h:40:2
#2 0x555555dce2fc in git_hash_sha1_clone_unsafe ../object-file.c:123:2
#3 0x555555c2d5f8 in hashfile_checkpoint ../csum-file.c:211:2
#4 0x5555559647d1 in stream_blob ../builtin/fast-import.c:1110:2
#5 0x55555596247b in parse_and_store_blob ../builtin/fast-import.c:2031:3
#6 0x555555967f91 in file_change_m ../builtin/fast-import.c:2408:5
#7 0x55555595d8a2 in parse_new_commit ../builtin/fast-import.c:2768:4
#8 0x55555595bb7a in cmd_fast_import ../builtin/fast-import.c:3614:4
#9 0x555555b1f493 in run_builtin ../git.c:480:11
#10 0x555555b1bfef in handle_builtin ../git.c:740:9
#11 0x555555b1e6f4 in run_argv ../git.c:807:4
#12 0x555555b1b87a in cmd_main ../git.c:947:19
#13 0x5555561649e6 in main ../common-main.c:64:11
#14 0x7ffff742a1fb in __libc_start_call_main (/nix/store/65h17wjrrlsj2rj540igylrx7fqcd6vq-glibc-2.40-36/lib/libc.so.6+0x2a1fb) (BuildId: bf320110569c8ec2425e9a0c5e4eb7e97f1fb6e4)
#15 0x7ffff742a2b8 in __libc_start_main@GLIBC_2.2.5 (/nix/store/65h17wjrrlsj2rj540igylrx7fqcd6vq-glibc-2.40-36/lib/libc.so.6+0x2a2b8) (BuildId: bf320110569c8ec2425e9a0c5e4eb7e97f1fb6e4)
#16 0x555555772c84 in _start (git+0x21ec84)

==577126==Register values:
rax = 0x0000511000000cc0  rbx = 0x0000000000000000  rcx = 0x000000000000000c  rdx = 0x0000000000000000
rdi = 0x0000000000000000  rsi = 0x00005070000009c0  rbp = 0x00005070000009c0  rsp = 0x00007fffffff5b30
r8  = 0x0000000000000000  r9  = 0x0000000000000000  r10 = 0x0000000000000000  r11 = 0x00007ffff7a01a30
r12 = 0x0000000000000000  r13 = 0x00007fffffff6b60  r14 = 0x00007ffff7ffd000  r15 = 0x00005555563b9910
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/nix/store/h1ydpxkw9qhjdxjpic1pdc2nirggyy6f-openssl-3.3.2/lib/libcrypto.so.3+0x201a99) (BuildId: 41746a580d39075fc85e8c8065b6c07fb34e97d4) in EVP_MD_CTX_copy_ex
==577126==ABORTING
./test-lib.sh: line 1039: 577126 Aborted git --git-dir=R/.git fast-import --big-file-threshold=1 < input
error: last command exited with $?=134
not ok 167 - R: blob bigger than threshold

The segfault is only exposed in case the unsafe and safe backends are different from one another.

Fix the issue by initializing the context with the unsafe SHA1 variant.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
chriscool
pushed a commit
that referenced
this pull request
Jan 27, 2025
Our CI jobs sometimes see false positive leaks like this:

=================================================================
==3904583==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 32 byte(s) in 1 object(s) allocated from:
#0 0x7fa790d01986 in __interceptor_realloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:98
#1 0x7fa790add769 in __pthread_getattr_np nptl/pthread_getattr_np.c:180
#2 0x7fa790d117c5 in __sanitizer::GetThreadStackTopAndBottom(bool, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:150
#3 0x7fa790d11957 in __sanitizer::GetThreadStackAndTls(bool, unsigned long*, unsigned long*, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:598
#4 0x7fa790d03fe8 in __lsan::ThreadStart(unsigned int, unsigned long long, __sanitizer::ThreadType) ../../../../src/libsanitizer/lsan/lsan_posix.cpp:51
#5 0x7fa790d013fd in __lsan_thread_start_func ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:440
#6 0x7fa790adc3eb in start_thread nptl/pthread_create.c:444
#7 0x7fa790b5ca5b in clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

This is not a leak in our code, but appears to be a race between one thread calling exit() while another one is in LSan's stack setup code. You can reproduce it easily by running t0003 or t5309 with --stress (these trigger it because of the threading in git-grep and index-pack respectively).

This may be a bug in LSan, but regardless of whether it is eventually fixed, it is useful to work around it so that we stop seeing these false positives. We can recognize it by the mention of the sanitizer functions in the DEDUP_TOKEN line.

With this patch, the scripts mentioned above should run with --stress indefinitely.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
chriscool
pushed a commit
that referenced
this pull request
Feb 18, 2025
When trying to create a Unix socket in a path that exceeds the maximum socket name length we try to first change the directory into the parent folder before creating the socket, to reduce the length of the name. When this fails we error out of `unix_sockaddr_init()` with an error code, which indicates to the caller that the context has not been initialized. Consequently, they don't release that context.

This leads to a memory leak: when we have already populated the context with the original directory that we need to chdir(3p) back into, and the chdir(3p) into the socket's parent directory then fails, we never release the original directory's path. The leak is exposed by t0301, but only when running tests in a directory hierarchy whose path is long enough to make the socket name length exceed the maximum socket name length:

Direct leak of 129 byte(s) in 1 object(s) allocated from:
#0 0x5555555e85c6 in realloc.part.0 lsan_interceptors.cpp.o
#1 0x55555590e3d6 in xrealloc ../wrapper.c:140:8
#2 0x5555558c8fc6 in strbuf_grow ../strbuf.c:114:2
#3 0x5555558cacab in strbuf_getcwd ../strbuf.c:605:3
#4 0x555555923ff6 in unix_sockaddr_init ../unix-socket.c:65:7
#5 0x555555923e42 in unix_stream_connect ../unix-socket.c:84:6
#6 0x55555562a984 in send_request ../builtin/credential-cache.c:46:11
#7 0x55555562a89e in do_cache ../builtin/credential-cache.c:108:6
#8 0x55555562a655 in cmd_credential_cache ../builtin/credential-cache.c:178:3
#9 0x555555700547 in run_builtin ../git.c:480:11
#10 0x5555556ff0e0 in handle_builtin ../git.c:740:9
#11 0x5555556ffee8 in run_argv ../git.c:807:4
#12 0x5555556fee6b in cmd_main ../git.c:947:19
#13 0x55555593f689 in main ../common-main.c:64:11
#14 0x7ffff7a2a1fb in __libc_start_call_main (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a1fb) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0)
#15 0x7ffff7a2a2b8 in __libc_start_main@GLIBC_2.2.5 (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a2b8) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0)
#16 0x5555555ad1d4 in _start (git+0x591d4)

DEDUP_TOKEN: ___interceptor_realloc.part.0--xrealloc--strbuf_grow--strbuf_getcwd--unix_sockaddr_init--unix_stream_connect--send_request--do_cache--cmd_credential_cache--run_builtin--handle_builtin--run_argv--cmd_main--main--__libc_start_call_main--__libc_start_main@GLIBC_2.2.5--_start
SUMMARY: LeakSanitizer: 129 byte(s) leaked in 1 allocation(s).

Fix this leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
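The shape of the fix is the usual "release what you already acquired before bailing out" pattern. A minimal sketch with invented names (not the actual `unix_sockaddr_init()` code, which uses Git's strbuf API) might look like this:

#include <stdlib.h>
#include <unistd.h>

/*
 * Remember the current directory so we can chdir(2) back later, but
 * release it again if the chdir(2) into the socket's parent directory
 * fails.
 */
static int enter_socket_dir(const char *parent_dir, char **saved_cwd_out)
{
	char *saved_cwd = getcwd(NULL, 0);   /* POSIX.1-2008: allocates the buffer */

	if (!saved_cwd)
		return -1;

	if (chdir(parent_dir) < 0) {
		free(saved_cwd);             /* the fix: don't leak on the error path */
		return -1;
	}

	*saved_cwd_out = saved_cwd;          /* caller frees this after chdir'ing back */
	return 0;
}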
chriscool
pushed a commit
that referenced
this pull request
Feb 18, 2025
We don't free the result of `remote_default_branch()`, leading to a memory leak. This leak is exposed by t9211, but only when run with Meson with the `-Db_sanitize=leak` option:

Direct leak of 5 byte(s) in 1 object(s) allocated from:
#0 0x5555555cfb93 in malloc (scalar+0x7bb93)
#1 0x5555556b05c2 in do_xmalloc ../wrapper.c:55:8
#2 0x5555556b06c4 in do_xmallocz ../wrapper.c:89:8
#3 0x5555556b0656 in xmallocz ../wrapper.c:97:9
#4 0x5555556b0728 in xmemdupz ../wrapper.c:113:16
#5 0x5555556b07a7 in xstrndup ../wrapper.c:119:9
#6 0x5555555d3a4b in remote_default_branch ../scalar.c:338:14
#7 0x5555555d20e6 in cmd_clone ../scalar.c:493:28
#8 0x5555555d196b in cmd_main ../scalar.c:992:14
#9 0x5555557c4059 in main ../common-main.c:64:11
#10 0x7ffff7a2a1fb in __libc_start_call_main (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a1fb) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0)
#11 0x7ffff7a2a2b8 in __libc_start_main@GLIBC_2.2.5 (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a2b8) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0)
#12 0x555555592054 in _start (scalar+0x3e054)

DEDUP_TOKEN: __interceptor_malloc--do_xmalloc--do_xmallocz--xmallocz--xmemdupz--xstrndup--remote_default_branch--cmd_clone--cmd_main--main--__libc_start_call_main--__libc_start_main@GLIBC_2.2.5--_start
SUMMARY: LeakSanitizer: 5 byte(s) leaked in 1 allocation(s).

As the `branch` variable may contain a string constant obtained from parsing command line arguments we cannot free the leaking variable directly. Instead, introduce a new `branch_to_free` variable that only ever gets assigned the allocated string and free that one to plug the leak.

It is unclear why the leak isn't flagged when running the test via our Makefile.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
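The ownership pattern described above can be sketched as follows. Names are illustrative rather than scalar.c's actual code; the idea is that only the allocated string is tracked in `branch_to_free` and freed, while a borrowed command-line string is left alone:

#include <stdlib.h>
#include <string.h>

static char *guess_default_branch(void)
{
	return strdup("main");               /* stand-in for remote_default_branch() */
}

static void clone_example(const char *branch_opt)
{
	char *branch_to_free = NULL;
	const char *branch = branch_opt;     /* possibly borrowed from argv */

	if (!branch)
		branch = branch_to_free = guess_default_branch();

	/* ... use `branch` to set up the clone ... */
	(void)branch;

	free(branch_to_free);                /* NULL when `branch` was borrowed, so safe */
}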
chriscool
pushed a commit
that referenced
this pull request
Mar 9, 2025
Submodule merges are, in general, similar to other merges based on oid three-way merge. When a conflict happens, however, Git has two special cases (introduced in 68d03e4) for handling the conflict before yielding it to the user. From the merge-ort and merge-recursive sources:

- "Case #1: a is contained in b or vice versa": both strategies try to perform a fast-forward in the submodules if the commit referred to by the conflicted submodule is a descendant of the other;

- "Case #2: There are one or more merges that contain a and b in the submodule. If there is only one, then present it as a suggestion to the user, but leave it marked unmerged so the user needs to confirm the resolution."

Add a small paragraph to merge-strategies.adoc describing this behavior.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>