Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

repack -ad: fix after fetch --prune in a shallow repository #9

Closed
wants to merge 3 commits into from

Conversation

dscho
Copy link
Member

@dscho dscho commented Jul 13, 2018

Under certain circumstances, commits that were reachable can be made unreachable, e.g. via git fetch --prune. These commits might have been packed already, in which case git repack -adlf will just drop them without giving them the usual grace period before an eventual git prune (via git gc) prunes them.

This is a problem when the shallow file still lists them, which is the reason why git prune edits that file. And with the proposed changes, git repack -ad will now do the same.

Reported by Alejandro Pauly.

Changes since v3:

  • Made the regression test less confusing with regards to tags that might need to be dereferenced.
  • Introduced new global constants PRUNE_SHOW_ONLY and PRUNE_QUICK instead of extending the signature of prune_shallow() with Boolean parameters.
  • While at it, renamed the file-local constant from QUICK_PRUNE to QUICK.
  • Replaced the lookup_commit() && parse_commit() cadence (that wants to test for the existence of a commit) to has_object_file(), for ease of readability, and also to make it more obvious how to add a call to is_promisor_object() if we ever figure out that that would be necessary.

Changes since v2:

  • Fixed a typo in the last commit message.
  • Added an explanation to the last commit message why we do not simply skip non-existing shallow commits at fetch time.
  • Introduced a new, "quick prune" mode where prune_shallow() does not try to drop unreachable commits, but only non-existing ones.
  • Rebased to current master because there were too many merge conflicts for my liking otherwise.

Changes since v1:

  • Also trigger prune_shallow() when --unpack-unreachable=<approxidate> was passed to git repack.
  • No need to trigger prune_shallow() when git repack was called with -k.

Cc: Duy Nguyen pclouds@gmail.com, Jeff King peff@peff.net

@dscho
Copy link
Member Author

dscho commented Jul 13, 2018

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Jul 13, 2018

An error occurred while submitting:

Error: Branch 6f41d1987094bd397b2c8e1d542d2fe4833751d9 is not rebased to upstream/master

@dscho dscho force-pushed the shallow-and-fetch-prune branch from 6f41d19 to b4e01a9 Compare July 13, 2018 20:06
@dscho
Copy link
Member Author

dscho commented Jul 13, 2018

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Jul 13, 2018

Submitted as pull.9.git.gitgitgadget@gmail.com

@dscho dscho force-pushed the shallow-and-fetch-prune branch from b4e01a9 to c7ee6e0 Compare July 14, 2018 22:05
@dscho
Copy link
Member Author

dscho commented Jul 17, 2018

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Jul 17, 2018

Submitted as pull.9.v2.git.gitgitgadget@gmail.com

@dscho dscho force-pushed the shallow-and-fetch-prune branch 3 times, most recently from 8361696 to 1f9ff57 Compare October 22, 2018 20:20
@dscho
Copy link
Member Author

dscho commented Oct 22, 2018

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 22, 2018

Submitted as pull.9.v3.git.gitgitgadget@gmail.com

A `git fetch --prune` can turn previously-reachable objects unreachable,
even commits that are in the `shallow` list. A subsequent `git repack
-ad` will then unceremoniously drop those unreachable commits, and the
`shallow` list will become stale. This means that when we try to fetch
with a larger `--depth` the next time, we may end up with:

	fatal: error in object: unshallow <commit-hash>

Reported by Alejandro Pauly.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@dscho dscho force-pushed the shallow-and-fetch-prune branch from 1f9ff57 to 59ca386 Compare October 24, 2018 08:39
dscho added 2 commits October 24, 2018 17:51
The `prune_shallow()` function wants a full reachability check to be
completed before it goes to work, to ensure that all unreachable entries
are removed from the shallow file.

However, in the upcoming patch we do not even want to go that far. We
really only need to remove entries corresponding to pruned commits, i.e.
to commits that no longer exist.

Let's support that use case.

Rather than extending the signature of `prune_shallow()` to accept
another Boolean, let's turn it into a bit field and declare constants,
for readability.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
`git repack` can drop unreachable commits without further warning,
making the corresponding entries in `.git/shallow` invalid, which causes
serious problems when deepening the branches.

One scenario where unreachable commits are dropped by `git repack` is
when a `git fetch --prune` (or even a `git fetch` when a ref was
force-pushed in the meantime) can make a commit unreachable that was
reachable before.

Therefore it is not safe to assume that a `git repack -adlf` will keep
unreachable commits alone (under the assumption that they had not been
packed in the first place, which is an assumption at least some of Git's
code seems to make).

This is particularly important to keep in mind when looking at the
`.git/shallow` file: if any commits listed in that file become
unreachable, it is not a problem, but if they go missing, it *is* a
problem. One symptom of this problem is that a deepening fetch may now
fail with

	fatal: error in object: unshallow <commit-hash>

To avoid this problem, let's prune the shallow list in `git repack` when
the `-d` option is passed, unless `-A` is passed, too (which would force
the now-unreachable objects to be turned into loose objects instead of
being deleted). Additionally, we also need to take `--keep-reachable`
and `--unpack-unreachable=<date>` into account.

Note: an alternative solution discussed during the review of this patch
was to teach `git fetch` to simply ignore entries in .git/shallow if the
corresponding commits do not exist locally. A quick test, however,
revealed that the .git/shallow file is written during a shallow *clone*,
in which case the commits do not exist, either, but the "shallow" line
*does* need to be sent. Therefore, this approach would be a lot more
finicky than the approach presented by the this patch.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@dscho dscho force-pushed the shallow-and-fetch-prune branch from 59ca386 to 2858bc8 Compare October 24, 2018 15:52
@dscho
Copy link
Member Author

dscho commented Oct 24, 2018

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 24, 2018

Submitted as pull.9.v4.git.gitgitgadget@gmail.com

Copy link
Member Author

dscho commented Nov 1, 2018

This branch is now known as js/shallow-and-fetch-prune.

Copy link
Member Author

dscho commented Nov 1, 2018

This patch series was integrated into pu via git@aa4b1fe.

Copy link
Member Author

dscho commented Nov 1, 2018

This patch series was integrated into next via git@93b7196.

This comment has been minimized.

1 similar comment

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

gitgitgadget bot pushed a commit that referenced this pull request Nov 27, 2023
gitgitgadget bot pushed a commit that referenced this pull request Jan 30, 2024
gitgitgadget bot pushed a commit that referenced this pull request Feb 28, 2024
gitgitgadget bot pushed a commit that referenced this pull request Mar 28, 2024
gitgitgadget bot pushed a commit that referenced this pull request Apr 24, 2024
gitgitgadget bot pushed a commit that referenced this pull request May 23, 2024
gitgitgadget bot pushed a commit that referenced this pull request Jun 17, 2024
Memory sanitizer (msan) is detecting a use of an uninitialized variable
(`size`) in `read_attr_from_index`:

    ==2268==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x5651f3416504 in read_attr_from_index git/attr.c:868:11
    #1 0x5651f3415530 in read_attr git/attr.c
    #2 0x5651f3413d74 in bootstrap_attr_stack git/attr.c:968:6
    #3 0x5651f3413d74 in prepare_attr_stack git/attr.c:1004:2
    #4 0x5651f3413d74 in collect_some_attrs git/attr.c:1199:2
    #5 0x5651f3413144 in git_check_attr git/attr.c:1345:2
    #6 0x5651f34728da in convert_attrs git/convert.c:1320:2
    #7 0x5651f3473425 in would_convert_to_git_filter_fd git/convert.c:1373:2
    #8 0x5651f357a35e in index_fd git/object-file.c:2630:34
    #9 0x5651f357aa15 in index_path git/object-file.c:2657:7
    #10 0x5651f35db9d9 in add_to_index git/read-cache.c:766:7
    #11 0x5651f35dc170 in add_file_to_index git/read-cache.c:799:9
    #12 0x5651f321f9b2 in add_files git/builtin/add.c:346:7
    #13 0x5651f321f9b2 in cmd_add git/builtin/add.c:565:18
    #14 0x5651f321d327 in run_builtin git/git.c:474:11
    #15 0x5651f321bc9e in handle_builtin git/git.c:729:3
    #16 0x5651f321a792 in run_argv git/git.c:793:4
    #17 0x5651f321a792 in cmd_main git/git.c:928:19
    #18 0x5651f33dde1f in main git/common-main.c:62:11

The issue exists because `size` is an output parameter from
`read_blob_data_from_index`, but it's only modified if
`read_blob_data_from_index` returns non-NULL. The read of `size` when
calling `read_attr_from_buf` unconditionally may read from an
uninitialized value. `read_attr_from_buf` checks that `buf` is non-NULL
before reading from `size`, but by then it's already too late: the
uninitialized read will have happened already. Furthermore, there's no
guarantee that the compiler won't reorder things so that it checks
`size` before checking `!buf`.

Make the call to `read_attr_from_buf` conditional on `buf` being
non-NULL, ensuring that `size` is not read if it's never set.

Signed-off-by: Kyle Lippincott <spectral@google.com>
gitgitgadget bot pushed a commit that referenced this pull request Jun 17, 2024
Memory sanitizer (msan) is detecting a use of an uninitialized variable
(`size`) in `read_attr_from_index`:

    ==2268==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x5651f3416504 in read_attr_from_index git/attr.c:868:11
    #1 0x5651f3415530 in read_attr git/attr.c
    #2 0x5651f3413d74 in bootstrap_attr_stack git/attr.c:968:6
    #3 0x5651f3413d74 in prepare_attr_stack git/attr.c:1004:2
    #4 0x5651f3413d74 in collect_some_attrs git/attr.c:1199:2
    #5 0x5651f3413144 in git_check_attr git/attr.c:1345:2
    #6 0x5651f34728da in convert_attrs git/convert.c:1320:2
    #7 0x5651f3473425 in would_convert_to_git_filter_fd git/convert.c:1373:2
    #8 0x5651f357a35e in index_fd git/object-file.c:2630:34
    #9 0x5651f357aa15 in index_path git/object-file.c:2657:7
    #10 0x5651f35db9d9 in add_to_index git/read-cache.c:766:7
    #11 0x5651f35dc170 in add_file_to_index git/read-cache.c:799:9
    #12 0x5651f321f9b2 in add_files git/builtin/add.c:346:7
    #13 0x5651f321f9b2 in cmd_add git/builtin/add.c:565:18
    #14 0x5651f321d327 in run_builtin git/git.c:474:11
    #15 0x5651f321bc9e in handle_builtin git/git.c:729:3
    #16 0x5651f321a792 in run_argv git/git.c:793:4
    #17 0x5651f321a792 in cmd_main git/git.c:928:19
    #18 0x5651f33dde1f in main git/common-main.c:62:11

The issue exists because `size` is an output parameter from
`read_blob_data_from_index`, but it's only modified if
`read_blob_data_from_index` returns non-NULL. The read of `size` when
calling `read_attr_from_buf` unconditionally may read from an
uninitialized value. `read_attr_from_buf` checks that `buf` is non-NULL
before reading from `size`, but by then it's already too late: the
uninitialized read will have happened already. Furthermore, there's no
guarantee that the compiler won't reorder things so that it checks
`size` before checking `!buf`.

Make the call to `read_attr_from_buf` conditional on `buf` being
non-NULL, ensuring that `size` is not read if it's never set.

Signed-off-by: Kyle Lippincott <spectral@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitgitgadget bot pushed a commit that referenced this pull request Jun 29, 2024
gitgitgadget bot pushed a commit that referenced this pull request Jul 29, 2024
gitgitgadget bot pushed a commit that referenced this pull request Aug 19, 2024
It was recently reported that concurrent reads and writes may cause the
reftable backend to segfault. The root cause of this is that we do not
properly keep track of reftable readers across reloads.

Suppose that you have a reftable iterator and then decide to reload the
stack while iterating through the iterator. When the stack has been
rewritten since we have created the iterator, then we would end up
discarding a subset of readers that may still be in use by the iterator.
The consequence is that we now try to reference deallocated memory,
which of course segfaults.

One way to trigger this is in t5616, where some background maintenance
jobs have been leaking from one test into another. This leads to stack
traces like the following one:

  + git -c protocol.version=0 -C pc1 fetch --filter=blob:limit=29999 --refetch origin
  AddressSanitizer:DEADLYSIGNAL
  =================================================================
  ==657994==ERROR: AddressSanitizer: SEGV on unknown address 0x7fa0f0ec6089 (pc 0x55f23e52ddf9 bp
0x7ffe7bfa1700 sp 0x7ffe7bfa1700 T0)
  ==657994==The signal is caused by a READ memory access.
      #0 0x55f23e52ddf9 in get_var_int reftable/record.c:29
      #1 0x55f23e53295e in reftable_decode_keylen reftable/record.c:170
      #2 0x55f23e532cc0 in reftable_decode_key reftable/record.c:194
      #3 0x55f23e54e72e in block_iter_next reftable/block.c:398
      #4 0x55f23e5573dc in table_iter_next_in_block reftable/reader.c:240
      #5 0x55f23e5573dc in table_iter_next reftable/reader.c:355
      #6 0x55f23e5573dc in table_iter_next reftable/reader.c:339
      #7 0x55f23e551283 in merged_iter_advance_subiter reftable/merged.c:69
      #8 0x55f23e55169e in merged_iter_next_entry reftable/merged.c:123
      #9 0x55f23e55169e in merged_iter_next_void reftable/merged.c:172
      #10 0x55f23e537625 in reftable_iterator_next_ref reftable/generic.c:175
      #11 0x55f23e2cf9c6 in reftable_ref_iterator_advance refs/reftable-backend.c:464
      #12 0x55f23e2d996e in ref_iterator_advance refs/iterator.c:13
      #13 0x55f23e2d996e in do_for_each_ref_iterator refs/iterator.c:452
      #14 0x55f23dca6767 in get_ref_map builtin/fetch.c:623
      #15 0x55f23dca6767 in do_fetch builtin/fetch.c:1659
      #16 0x55f23dca6767 in fetch_one builtin/fetch.c:2133
      #17 0x55f23dca6767 in cmd_fetch builtin/fetch.c:2432
      #18 0x55f23dba7764 in run_builtin git.c:484
      #19 0x55f23dba7764 in handle_builtin git.c:741
      #20 0x55f23dbab61e in run_argv git.c:805
      #21 0x55f23dbab61e in cmd_main git.c:1000
      #22 0x55f23dba4781 in main common-main.c:64
      #23 0x7fa0f063fc89 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
      #24 0x7fa0f063fd44 in __libc_start_main_impl ../csu/libc-start.c:360
      #25 0x55f23dba6ad0 in _start (git+0xadfad0) (BuildId: 803b2b7f59beb03d7849fb8294a8e2145dd4aa27)

While it is somewhat awkward that the maintenance processes survive
tests in the first place, it is totally expected that reftables should
work alright with concurrent writers. Seemingly they don't.

The only underlying resource that we need to care about in this context
is the reftable reader, which is responsible for reading a single table
from disk. These readers get discarded immediately (unless reused) when
calling `reftable_stack_reload()`, which is wrong. We can only close
them once we know that there are no iterators using them anymore.

Prepare for a fix by converting the reftable readers to be refcounted.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitgitgadget bot pushed a commit that referenced this pull request Aug 22, 2024
It was recently reported that concurrent reads and writes may cause the
reftable backend to segfault. The root cause of this is that we do not
properly keep track of reftable readers across reloads.

Suppose that you have a reftable iterator and then decide to reload the
stack while iterating through the iterator. When the stack has been
rewritten since we have created the iterator, then we would end up
discarding a subset of readers that may still be in use by the iterator.
The consequence is that we now try to reference deallocated memory,
which of course segfaults.

One way to trigger this is in t5616, where some background maintenance
jobs have been leaking from one test into another. This leads to stack
traces like the following one:

  + git -c protocol.version=0 -C pc1 fetch --filter=blob:limit=29999 --refetch origin
  AddressSanitizer:DEADLYSIGNAL
  =================================================================
  ==657994==ERROR: AddressSanitizer: SEGV on unknown address 0x7fa0f0ec6089 (pc 0x55f23e52ddf9 bp
0x7ffe7bfa1700 sp 0x7ffe7bfa1700 T0)
  ==657994==The signal is caused by a READ memory access.
      #0 0x55f23e52ddf9 in get_var_int reftable/record.c:29
      #1 0x55f23e53295e in reftable_decode_keylen reftable/record.c:170
      #2 0x55f23e532cc0 in reftable_decode_key reftable/record.c:194
      #3 0x55f23e54e72e in block_iter_next reftable/block.c:398
      #4 0x55f23e5573dc in table_iter_next_in_block reftable/reader.c:240
      #5 0x55f23e5573dc in table_iter_next reftable/reader.c:355
      #6 0x55f23e5573dc in table_iter_next reftable/reader.c:339
      #7 0x55f23e551283 in merged_iter_advance_subiter reftable/merged.c:69
      #8 0x55f23e55169e in merged_iter_next_entry reftable/merged.c:123
      #9 0x55f23e55169e in merged_iter_next_void reftable/merged.c:172
      #10 0x55f23e537625 in reftable_iterator_next_ref reftable/generic.c:175
      #11 0x55f23e2cf9c6 in reftable_ref_iterator_advance refs/reftable-backend.c:464
      #12 0x55f23e2d996e in ref_iterator_advance refs/iterator.c:13
      #13 0x55f23e2d996e in do_for_each_ref_iterator refs/iterator.c:452
      #14 0x55f23dca6767 in get_ref_map builtin/fetch.c:623
      #15 0x55f23dca6767 in do_fetch builtin/fetch.c:1659
      #16 0x55f23dca6767 in fetch_one builtin/fetch.c:2133
      #17 0x55f23dca6767 in cmd_fetch builtin/fetch.c:2432
      #18 0x55f23dba7764 in run_builtin git.c:484
      #19 0x55f23dba7764 in handle_builtin git.c:741
      #20 0x55f23dbab61e in run_argv git.c:805
      #21 0x55f23dbab61e in cmd_main git.c:1000
      #22 0x55f23dba4781 in main common-main.c:64
      #23 0x7fa0f063fc89 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
      #24 0x7fa0f063fd44 in __libc_start_main_impl ../csu/libc-start.c:360
      #25 0x55f23dba6ad0 in _start (git+0xadfad0) (BuildId: 803b2b7f59beb03d7849fb8294a8e2145dd4aa27)

While it is somewhat awkward that the maintenance processes survive
tests in the first place, it is totally expected that reftables should
work alright with concurrent writers. Seemingly they don't.

The only underlying resource that we need to care about in this context
is the reftable reader, which is responsible for reading a single table
from disk. These readers get discarded immediately (unless reused) when
calling `reftable_stack_reload()`, which is wrong. We can only close
them once we know that there are no iterators using them anymore.

Prepare for a fix by converting the reftable readers to be refcounted.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitgitgadget bot pushed a commit that referenced this pull request Aug 23, 2024
It was recently reported that concurrent reads and writes may cause the
reftable backend to segfault. The root cause of this is that we do not
properly keep track of reftable readers across reloads.

Suppose that you have a reftable iterator and then decide to reload the
stack while iterating through the iterator. When the stack has been
rewritten since we have created the iterator, then we would end up
discarding a subset of readers that may still be in use by the iterator.
The consequence is that we now try to reference deallocated memory,
which of course segfaults.

One way to trigger this is in t5616, where some background maintenance
jobs have been leaking from one test into another. This leads to stack
traces like the following one:

  + git -c protocol.version=0 -C pc1 fetch --filter=blob:limit=29999 --refetch origin
  AddressSanitizer:DEADLYSIGNAL
  =================================================================
  ==657994==ERROR: AddressSanitizer: SEGV on unknown address 0x7fa0f0ec6089 (pc 0x55f23e52ddf9 bp
0x7ffe7bfa1700 sp 0x7ffe7bfa1700 T0)
  ==657994==The signal is caused by a READ memory access.
      #0 0x55f23e52ddf9 in get_var_int reftable/record.c:29
      #1 0x55f23e53295e in reftable_decode_keylen reftable/record.c:170
      #2 0x55f23e532cc0 in reftable_decode_key reftable/record.c:194
      #3 0x55f23e54e72e in block_iter_next reftable/block.c:398
      #4 0x55f23e5573dc in table_iter_next_in_block reftable/reader.c:240
      #5 0x55f23e5573dc in table_iter_next reftable/reader.c:355
      #6 0x55f23e5573dc in table_iter_next reftable/reader.c:339
      #7 0x55f23e551283 in merged_iter_advance_subiter reftable/merged.c:69
      #8 0x55f23e55169e in merged_iter_next_entry reftable/merged.c:123
      #9 0x55f23e55169e in merged_iter_next_void reftable/merged.c:172
      #10 0x55f23e537625 in reftable_iterator_next_ref reftable/generic.c:175
      #11 0x55f23e2cf9c6 in reftable_ref_iterator_advance refs/reftable-backend.c:464
      #12 0x55f23e2d996e in ref_iterator_advance refs/iterator.c:13
      #13 0x55f23e2d996e in do_for_each_ref_iterator refs/iterator.c:452
      #14 0x55f23dca6767 in get_ref_map builtin/fetch.c:623
      #15 0x55f23dca6767 in do_fetch builtin/fetch.c:1659
      #16 0x55f23dca6767 in fetch_one builtin/fetch.c:2133
      #17 0x55f23dca6767 in cmd_fetch builtin/fetch.c:2432
      #18 0x55f23dba7764 in run_builtin git.c:484
      #19 0x55f23dba7764 in handle_builtin git.c:741
      #20 0x55f23dbab61e in run_argv git.c:805
      #21 0x55f23dbab61e in cmd_main git.c:1000
      #22 0x55f23dba4781 in main common-main.c:64
      #23 0x7fa0f063fc89 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
      #24 0x7fa0f063fd44 in __libc_start_main_impl ../csu/libc-start.c:360
      #25 0x55f23dba6ad0 in _start (git+0xadfad0) (BuildId: 803b2b7f59beb03d7849fb8294a8e2145dd4aa27)

While it is somewhat awkward that the maintenance processes survive
tests in the first place, it is totally expected that reftables should
work alright with concurrent writers. Seemingly they don't.

The only underlying resource that we need to care about in this context
is the reftable reader, which is responsible for reading a single table
from disk. These readers get discarded immediately (unless reused) when
calling `reftable_stack_reload()`, which is wrong. We can only close
them once we know that there are no iterators using them anymore.

Prepare for a fix by converting the reftable readers to be refcounted.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitgitgadget bot pushed a commit that referenced this pull request Aug 23, 2024
gitgitgadget bot pushed a commit that referenced this pull request Sep 23, 2024
gitgitgadget bot pushed a commit that referenced this pull request Oct 5, 2024
The incremental MIDX bitmap work was done prior to 9d4855e
(midx-write: fix leaking buffer, 2024-09-30), and causes test failures
in t5334 in a post-9d4855eef3 world.

The leak looks like:

    Direct leak of 264 byte(s) in 1 object(s) allocated from:
        #0 0x7f6bcd87eaca in calloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:90
        #1 0x55ad1428e8a4 in xcalloc wrapper.c:151
        #2 0x55ad14199e16 in prepare_midx_bitmap_git pack-bitmap.c:742
        #3 0x55ad14199447 in open_midx_bitmap_1 pack-bitmap.c:507
        #4 0x55ad14199cca in open_midx_bitmap pack-bitmap.c:704
        #5 0x55ad14199d44 in open_bitmap pack-bitmap.c:717
        #6 0x55ad14199dc2 in prepare_bitmap_git pack-bitmap.c:733
        #7 0x55ad1419e496 in test_bitmap_walk pack-bitmap.c:2698
        #8 0x55ad14047b0b in cmd_rev_list builtin/rev-list.c:629
        #9 0x55ad13f71cd6 in run_builtin git.c:487
        #10 0x55ad13f72132 in handle_builtin git.c:756
        #11 0x55ad13f72380 in run_argv git.c:826
        #12 0x55ad13f728f4 in cmd_main git.c:961
        #13 0x55ad1407d3ae in main common-main.c:64
        #14 0x7f6bcd5f0c89 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
        #15 0x7f6bcd5f0d44 in __libc_start_main_impl ../csu/libc-start.c:360
        #16 0x55ad13f6ff90 in _start (git+0x1ef90) (BuildId: 3e63cdd415f1d185b21da3035cb48332510dddce)

, and is a result of us not freeing the resources corresponding to the
bitmap's base layer, if one was present.

Rectify that leak by calling the newly-introduced free_bitmap_index()
function on the base layer to ensure that its resources are also freed.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitgitgadget bot pushed a commit that referenced this pull request Oct 18, 2024
This one is a little bit more curious. In t6112, we have a test that
exercises the `git rev-list --filter` option with invalid filters. We
execute git-rev-list(1) via `test_must_fail`, which means that we check
for leaks even though Git exits with an error code. This causes the
following leak:

    Direct leak of 27 byte(s) in 1 object(s) allocated from:
        #0 0x5555555e6946 in realloc.part.0 lsan_interceptors.cpp.o
        #1 0x5555558fb4b6 in xrealloc wrapper.c:137:8
        #2 0x5555558b6e06 in strbuf_grow strbuf.c:112:2
        #3 0x5555558b7550 in strbuf_add strbuf.c:311:2
        #4 0x5555557c1a88 in strbuf_addstr strbuf.h:310:2
        #5 0x5555557c1d4c in parse_list_objects_filter list-objects-filter-options.c:261:3
        #6 0x555555885ead in handle_revision_pseudo_opt revision.c:2899:3
        #7 0x555555884e20 in setup_revisions revision.c:3014:11
        #8 0x5555556c4b42 in cmd_rev_list builtin/rev-list.c:588:9
        #9 0x5555555ec5e3 in run_builtin git.c:483:11
        #10 0x5555555eb1e4 in handle_builtin git.c:749:13
        #11 0x5555555ec001 in run_argv git.c:819:4
        #12 0x5555555eaf94 in cmd_main git.c:954:19
        #13 0x5555556fd569 in main common-main.c:64:11
        #14 0x7ffff7ca714d in __libc_start_call_main (.../lib/libc.so.6+0x2a14d)
        #15 0x7ffff7ca7208 in __libc_start_main@GLIBC_2.2.5 (.../libc.so.6+0x2a208)
        #16 0x5555555ad064 in _start (git+0x59064)

This leak is valid, as we call `die()` and do not clean up the memory at
all. But what's curious is that this is the only leak reported, because
we don't clean up any other allocated memory, either, and I have no idea
why the leak sanitizer treats this buffer specially.

In any case, we can work around the leak by shuffling things around a
bit. Instead of calling `gently_parse_list_objects_filter()` and dying
after we have modified the filter spec, we simply do so beforehand. Like
this we don't allocate the buffer in the error case, which makes the
reported leak go away.

It's not pretty, but it manages to make t6112 leak free.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
gitgitgadget bot pushed a commit that referenced this pull request Oct 21, 2024
This one is a little bit more curious. In t6112, we have a test that
exercises the `git rev-list --filter` option with invalid filters. We
execute git-rev-list(1) via `test_must_fail`, which means that we check
for leaks even though Git exits with an error code. This causes the
following leak:

    Direct leak of 27 byte(s) in 1 object(s) allocated from:
        #0 0x5555555e6946 in realloc.part.0 lsan_interceptors.cpp.o
        #1 0x5555558fb4b6 in xrealloc wrapper.c:137:8
        #2 0x5555558b6e06 in strbuf_grow strbuf.c:112:2
        #3 0x5555558b7550 in strbuf_add strbuf.c:311:2
        #4 0x5555557c1a88 in strbuf_addstr strbuf.h:310:2
        #5 0x5555557c1d4c in parse_list_objects_filter list-objects-filter-options.c:261:3
        #6 0x555555885ead in handle_revision_pseudo_opt revision.c:2899:3
        #7 0x555555884e20 in setup_revisions revision.c:3014:11
        #8 0x5555556c4b42 in cmd_rev_list builtin/rev-list.c:588:9
        #9 0x5555555ec5e3 in run_builtin git.c:483:11
        #10 0x5555555eb1e4 in handle_builtin git.c:749:13
        #11 0x5555555ec001 in run_argv git.c:819:4
        #12 0x5555555eaf94 in cmd_main git.c:954:19
        #13 0x5555556fd569 in main common-main.c:64:11
        #14 0x7ffff7ca714d in __libc_start_call_main (.../lib/libc.so.6+0x2a14d)
        #15 0x7ffff7ca7208 in __libc_start_main@GLIBC_2.2.5 (.../libc.so.6+0x2a208)
        #16 0x5555555ad064 in _start (git+0x59064)

This leak is valid, as we call `die()` and do not clean up the memory at
all. But what's curious is that this is the only leak reported, because
we don't clean up any other allocated memory, either, and I have no idea
why the leak sanitizer treats this buffer specially.

In any case, we can work around the leak by shuffling things around a
bit. Instead of calling `gently_parse_list_objects_filter()` and dying
after we have modified the filter spec, we simply do so beforehand. Like
this we don't allocate the buffer in the error case, which makes the
reported leak go away.

It's not pretty, but it manages to make t6112 leak free.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
gitgitgadget bot pushed a commit that referenced this pull request Oct 22, 2024
gitgitgadget bot pushed a commit that referenced this pull request Nov 5, 2024
This one is a little bit more curious. In t6112, we have a test that
exercises the `git rev-list --filter` option with invalid filters. We
execute git-rev-list(1) via `test_must_fail`, which means that we check
for leaks even though Git exits with an error code. This causes the
following leak:

    Direct leak of 27 byte(s) in 1 object(s) allocated from:
        #0 0x5555555e6946 in realloc.part.0 lsan_interceptors.cpp.o
        #1 0x5555558fb4b6 in xrealloc wrapper.c:137:8
        #2 0x5555558b6e06 in strbuf_grow strbuf.c:112:2
        #3 0x5555558b7550 in strbuf_add strbuf.c:311:2
        #4 0x5555557c1a88 in strbuf_addstr strbuf.h:310:2
        #5 0x5555557c1d4c in parse_list_objects_filter list-objects-filter-options.c:261:3
        #6 0x555555885ead in handle_revision_pseudo_opt revision.c:2899:3
        #7 0x555555884e20 in setup_revisions revision.c:3014:11
        #8 0x5555556c4b42 in cmd_rev_list builtin/rev-list.c:588:9
        #9 0x5555555ec5e3 in run_builtin git.c:483:11
        #10 0x5555555eb1e4 in handle_builtin git.c:749:13
        #11 0x5555555ec001 in run_argv git.c:819:4
        #12 0x5555555eaf94 in cmd_main git.c:954:19
        #13 0x5555556fd569 in main common-main.c:64:11
        #14 0x7ffff7ca714d in __libc_start_call_main (.../lib/libc.so.6+0x2a14d)
        #15 0x7ffff7ca7208 in __libc_start_main@GLIBC_2.2.5 (.../libc.so.6+0x2a208)
        #16 0x5555555ad064 in _start (git+0x59064)

This leak is valid, as we call `die()` and do not clean up the memory at
all. But what's curious is that this is the only leak reported, because
we don't clean up any other allocated memory, either, and I have no idea
why the leak sanitizer treats this buffer specially.

In any case, we can work around the leak by shuffling things around a
bit. Instead of calling `gently_parse_list_objects_filter()` and dying
after we have modified the filter spec, we simply do so beforehand. Like
this we don't allocate the buffer in the error case, which makes the
reported leak go away.

It's not pretty, but it manages to make t6112 leak free.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitgitgadget bot pushed a commit that referenced this pull request Nov 25, 2024
gitgitgadget bot pushed a commit that referenced this pull request Dec 27, 2024
gitgitgadget bot pushed a commit that referenced this pull request Dec 30, 2024
In 1b9e9be (csum-file.c: use unsafe SHA-1 implementation when
available, 2024-09-26) we have converted our `struct hashfile` to use
the unsafe SHA1 backend, which results in a significant speedup. One
needs to be careful with how to use that structure now though because
callers need to consistently use either the safe or unsafe variants of
SHA1, as otherwise one can easily trigger corruption.

As it turns out, we have one inconsistent usage in our tree because we
directly initialize `struct hashfile_checkpoint::ctx` with the safe
variant of SHA1, but end up writing to that context with the unsafe
ones. This went unnoticed so far because our CI systems do not exercise
different hash functions for these two backends, and consequently safe
and unsafe variants are equivalent. But when using SHA1DC as safe and
OpenSSL as unsafe backend this leads to a crash an t1050:

    ++ git -c core.compression=0 add large1
    AddressSanitizer:DEADLYSIGNAL
    =================================================================
    ==1367==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000040 (pc 0x7ffff7a01a99 bp 0x507000000db0 sp 0x7fffffff5690 T0)
    ==1367==The signal is caused by a READ memory access.
    ==1367==Hint: address points to the zero page.
        #0 0x7ffff7a01a99 in EVP_MD_CTX_copy_ex (/nix/store/h1ydpxkw9qhjdxjpic1pdc2nirggyy6f-openssl-3.3.2/lib/libcrypto.so.3+0x201a99) (BuildId: 41746a580d39075fc85e8c8065b6c07fb34e97d4)
        #1 0x555555ddde56 in openssl_SHA1_Clone ../sha1/openssl.h:40:2
        #2 0x555555dce2fc in git_hash_sha1_clone_unsafe ../object-file.c:123:2
        #3 0x555555c2d5f8 in hashfile_checkpoint ../csum-file.c:211:2
        #4 0x555555b9905d in deflate_blob_to_pack ../bulk-checkin.c:286:4
        #5 0x555555b98ae9 in index_blob_bulk_checkin ../bulk-checkin.c:362:15
        #6 0x555555ddab62 in index_blob_stream ../object-file.c:2756:9
        #7 0x555555dda420 in index_fd ../object-file.c:2778:9
        #8 0x555555ddad76 in index_path ../object-file.c:2796:7
        #9 0x555555e947f3 in add_to_index ../read-cache.c:771:7
        #10 0x555555e954a4 in add_file_to_index ../read-cache.c:804:9
        #11 0x5555558b5c39 in add_files ../builtin/add.c:355:7
        #12 0x5555558b412e in cmd_add ../builtin/add.c:578:18
        #13 0x555555b1f493 in run_builtin ../git.c:480:11
        #14 0x555555b1bfef in handle_builtin ../git.c:740:9
        #15 0x555555b1e6f4 in run_argv ../git.c:807:4
        #16 0x555555b1b87a in cmd_main ../git.c:947:19
        #17 0x5555561649e6 in main ../common-main.c:64:11
        #18 0x7ffff742a1fb in __libc_start_call_main (/nix/store/65h17wjrrlsj2rj540igylrx7fqcd6vq-glibc-2.40-36/lib/libc.so.6+0x2a1fb) (BuildId: bf320110569c8ec2425e9a0c5e4eb7e97f1fb6e4)
        #19 0x7ffff742a2b8 in __libc_start_main@GLIBC_2.2.5 (/nix/store/65h17wjrrlsj2rj540igylrx7fqcd6vq-glibc-2.40-36/lib/libc.so.6+0x2a2b8) (BuildId: bf320110569c8ec2425e9a0c5e4eb7e97f1fb6e4)
        #20 0x555555772c84 in _start (git+0x21ec84)

    ==1367==Register values:
    rax = 0x0000511000001080  rbx = 0x0000000000000000  rcx = 0x000000000000000c  rdx = 0x0000000000000000
    rdi = 0x0000000000000000  rsi = 0x0000507000000db0  rbp = 0x0000507000000db0  rsp = 0x00007fffffff5690
     r8 = 0x0000000000000000   r9 = 0x0000000000000000  r10 = 0x0000000000000000  r11 = 0x00007ffff7a01a30
    r12 = 0x0000000000000000  r13 = 0x00007fffffff6b38  r14 = 0x00007ffff7ffd000  r15 = 0x00005555563b9910
    AddressSanitizer can not provide additional info.
    SUMMARY: AddressSanitizer: SEGV (/nix/store/h1ydpxkw9qhjdxjpic1pdc2nirggyy6f-openssl-3.3.2/lib/libcrypto.so.3+0x201a99) (BuildId: 41746a580d39075fc85e8c8065b6c07fb34e97d4) in EVP_MD_CTX_copy_ex
    ==1367==ABORTING
    ./test-lib.sh: line 1023:  1367 Aborted                 git $config add large1
    error: last command exited with $?=134
    not ok 4 - add with -c core.compression=0

Fix the issue by using the unsafe variant instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitgitgadget bot pushed a commit that referenced this pull request Dec 30, 2024
Same as with the preceding commit, git-fast-import(1) is using the safe
variant to initialize a hashfile checkpoint. This leads to a segfault
when passing the checkpoint into the hashfile subsystem because it would
use the unsafe variants instead:

    ++ git --git-dir=R/.git fast-import --big-file-threshold=1
    AddressSanitizer:DEADLYSIGNAL
    =================================================================
    ==577126==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000040 (pc 0x7ffff7a01a99 bp 0x5070000009c0 sp 0x7fffffff5b30 T0)
    ==577126==The signal is caused by a READ memory access.
    ==577126==Hint: address points to the zero page.
        #0 0x7ffff7a01a99 in EVP_MD_CTX_copy_ex (/nix/store/h1ydpxkw9qhjdxjpic1pdc2nirggyy6f-openssl-3.3.2/lib/libcrypto.so.3+0x201a99) (BuildId: 41746a580d39075fc85e8c8065b6c07fb34e97d4)
        #1 0x555555ddde56 in openssl_SHA1_Clone ../sha1/openssl.h:40:2
        #2 0x555555dce2fc in git_hash_sha1_clone_unsafe ../object-file.c:123:2
        #3 0x555555c2d5f8 in hashfile_checkpoint ../csum-file.c:211:2
        #4 0x5555559647d1 in stream_blob ../builtin/fast-import.c:1110:2
        #5 0x55555596247b in parse_and_store_blob ../builtin/fast-import.c:2031:3
        #6 0x555555967f91 in file_change_m ../builtin/fast-import.c:2408:5
        #7 0x55555595d8a2 in parse_new_commit ../builtin/fast-import.c:2768:4
        #8 0x55555595bb7a in cmd_fast_import ../builtin/fast-import.c:3614:4
        #9 0x555555b1f493 in run_builtin ../git.c:480:11
        #10 0x555555b1bfef in handle_builtin ../git.c:740:9
        #11 0x555555b1e6f4 in run_argv ../git.c:807:4
        #12 0x555555b1b87a in cmd_main ../git.c:947:19
        #13 0x5555561649e6 in main ../common-main.c:64:11
        #14 0x7ffff742a1fb in __libc_start_call_main (/nix/store/65h17wjrrlsj2rj540igylrx7fqcd6vq-glibc-2.40-36/lib/libc.so.6+0x2a1fb) (BuildId: bf320110569c8ec2425e9a0c5e4eb7e97f1fb6e4)
        #15 0x7ffff742a2b8 in __libc_start_main@GLIBC_2.2.5 (/nix/store/65h17wjrrlsj2rj540igylrx7fqcd6vq-glibc-2.40-36/lib/libc.so.6+0x2a2b8) (BuildId: bf320110569c8ec2425e9a0c5e4eb7e97f1fb6e4)
        #16 0x555555772c84 in _start (git+0x21ec84)

    ==577126==Register values:
    rax = 0x0000511000000cc0  rbx = 0x0000000000000000  rcx = 0x000000000000000c  rdx = 0x0000000000000000
    rdi = 0x0000000000000000  rsi = 0x00005070000009c0  rbp = 0x00005070000009c0  rsp = 0x00007fffffff5b30
     r8 = 0x0000000000000000   r9 = 0x0000000000000000  r10 = 0x0000000000000000  r11 = 0x00007ffff7a01a30
    r12 = 0x0000000000000000  r13 = 0x00007fffffff6b60  r14 = 0x00007ffff7ffd000  r15 = 0x00005555563b9910
    AddressSanitizer can not provide additional info.
    SUMMARY: AddressSanitizer: SEGV (/nix/store/h1ydpxkw9qhjdxjpic1pdc2nirggyy6f-openssl-3.3.2/lib/libcrypto.so.3+0x201a99) (BuildId: 41746a580d39075fc85e8c8065b6c07fb34e97d4) in EVP_MD_CTX_copy_ex
    ==577126==ABORTING
    ./test-lib.sh: line 1039: 577126 Aborted                 git --git-dir=R/.git fast-import --big-file-threshold=1 < input
    error: last command exited with $?=134
    not ok 167 - R: blob bigger than threshold

The segfault is only exposed in case the unsafe and safe backends are
different from one another.

Fix the issue by initializing the context with the unsafe SHA1 variant.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitgitgadget bot pushed a commit that referenced this pull request Jan 29, 2025
When trying to create a Unix socket in a path that exceeds the maximum
socket name length we try to first change the directory into the parent
folder before creating the socket to reduce the length of the name. When
this fails we error out of `unix_sockaddr_init()` with an error code,
which indicates to the caller that the context has not been initialized.
Consequently, they don't release that context.

This leads to a memory leak: when we have already populated the context
with the original directory that we need to chdir(3p) back into, but
then the chdir(3p) into the socket's parent directory fails, then we
won't release the original directory's path. The leak is exposed by
t0301, but only via Meson with `meson setup -Dsanitize=leak`:

    Direct leak of 129 byte(s) in 1 object(s) allocated from:
        #0 0x5555555e85c6 in realloc.part.0 lsan_interceptors.cpp.o
        #1 0x55555590e3d6 in xrealloc ../wrapper.c:140:8
        #2 0x5555558c8fc6 in strbuf_grow ../strbuf.c:114:2
        #3 0x5555558cacab in strbuf_getcwd ../strbuf.c:605:3
        #4 0x555555923ff6 in unix_sockaddr_init ../unix-socket.c:65:7
        #5 0x555555923e42 in unix_stream_connect ../unix-socket.c:84:6
        #6 0x55555562a984 in send_request ../builtin/credential-cache.c:46:11
        #7 0x55555562a89e in do_cache ../builtin/credential-cache.c:108:6
        #8 0x55555562a655 in cmd_credential_cache ../builtin/credential-cache.c:178:3
        #9 0x555555700547 in run_builtin ../git.c:480:11
        #10 0x5555556ff0e0 in handle_builtin ../git.c:740:9
        #11 0x5555556ffee8 in run_argv ../git.c:807:4
        #12 0x5555556fee6b in cmd_main ../git.c:947:19
        #13 0x55555593f689 in main ../common-main.c:64:11
        #14 0x7ffff7a2a1fb in __libc_start_call_main (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a1fb) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0)
        #15 0x7ffff7a2a2b8 in __libc_start_main@GLIBC_2.2.5 (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a2b8) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0)
        #16 0x5555555ad1d4 in _start (git+0x591d4)

    DEDUP_TOKEN: ___interceptor_realloc.part.0--xrealloc--strbuf_grow--strbuf_getcwd--unix_sockaddr_init--unix_stream_connect--send_request--do_cache--cmd_credential_cache--run_builtin--handle_builtin--run_argv--cmd_main--main--__libc_start_call_main--__libc_start_main@GLIBC_2.2.5--_start
    SUMMARY: LeakSanitizer: 129 byte(s) leaked in 1 allocation(s).

Fix this leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitgitgadget bot pushed a commit that referenced this pull request Jan 29, 2025
We don't free the result of `remote_default_branch()`, leading to a
memory leak. This leak is exposed by t9211, but only when run with Meson
via `meson setup -Dsanitize=leak`:

    Direct leak of 5 byte(s) in 1 object(s) allocated from:
        #0 0x5555555cfb93 in malloc (scalar+0x7bb93)
        #1 0x5555556b05c2 in do_xmalloc ../wrapper.c:55:8
        #2 0x5555556b06c4 in do_xmallocz ../wrapper.c:89:8
        #3 0x5555556b0656 in xmallocz ../wrapper.c:97:9
        #4 0x5555556b0728 in xmemdupz ../wrapper.c:113:16
        #5 0x5555556b07a7 in xstrndup ../wrapper.c:119:9
        #6 0x5555555d3a4b in remote_default_branch ../scalar.c:338:14
        #7 0x5555555d20e6 in cmd_clone ../scalar.c:493:28
        #8 0x5555555d196b in cmd_main ../scalar.c:992:14
        #9 0x5555557c4059 in main ../common-main.c:64:11
        #10 0x7ffff7a2a1fb in __libc_start_call_main (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a1fb) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0)
        #11 0x7ffff7a2a2b8 in __libc_start_main@GLIBC_2.2.5 (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a2b8) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0)
        #12 0x555555592054 in _start (scalar+0x3e054)

    DEDUP_TOKEN: __interceptor_malloc--do_xmalloc--do_xmallocz--xmallocz--xmemdupz--xstrndup--remote_default_branch--cmd_clone--cmd_main--main--__libc_start_call_main--__libc_start_main@GLIBC_2.2.5--_start
    SUMMARY: LeakSanitizer: 5 byte(s) leaked in 1 allocation(s).

As the `branch` variable may contain a string constant obtained from
parsing command line arguments we cannot free the leaking variable
directly. Instead, introduce a new `branch_to_free` variable that only
ever gets assigned the allocated string and free that one to plug the
leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitgitgadget bot pushed a commit that referenced this pull request Jan 30, 2025
When trying to create a Unix socket in a path that exceeds the maximum
socket name length we try to first change the directory into the parent
folder before creating the socket to reduce the length of the name. When
this fails we error out of `unix_sockaddr_init()` with an error code,
which indicates to the caller that the context has not been initialized.
Consequently, they don't release that context.

This leads to a memory leak: when we have already populated the context
with the original directory that we need to chdir(3p) back into, but
then the chdir(3p) into the socket's parent directory fails, then we
won't release the original directory's path. The leak is exposed by
t0301, but only when running tests in a directory hierarchy whose path
is long enough to make the socket name length exceed the maximum socket
name length:

    Direct leak of 129 byte(s) in 1 object(s) allocated from:
        #0 0x5555555e85c6 in realloc.part.0 lsan_interceptors.cpp.o
        #1 0x55555590e3d6 in xrealloc ../wrapper.c:140:8
        #2 0x5555558c8fc6 in strbuf_grow ../strbuf.c:114:2
        #3 0x5555558cacab in strbuf_getcwd ../strbuf.c:605:3
        #4 0x555555923ff6 in unix_sockaddr_init ../unix-socket.c:65:7
        #5 0x555555923e42 in unix_stream_connect ../unix-socket.c:84:6
        #6 0x55555562a984 in send_request ../builtin/credential-cache.c:46:11
        #7 0x55555562a89e in do_cache ../builtin/credential-cache.c:108:6
        #8 0x55555562a655 in cmd_credential_cache ../builtin/credential-cache.c:178:3
        #9 0x555555700547 in run_builtin ../git.c:480:11
        #10 0x5555556ff0e0 in handle_builtin ../git.c:740:9
        #11 0x5555556ffee8 in run_argv ../git.c:807:4
        #12 0x5555556fee6b in cmd_main ../git.c:947:19
        #13 0x55555593f689 in main ../common-main.c:64:11
        #14 0x7ffff7a2a1fb in __libc_start_call_main (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a1fb) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0)
        #15 0x7ffff7a2a2b8 in __libc_start_main@GLIBC_2.2.5 (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a2b8) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0)
        #16 0x5555555ad1d4 in _start (git+0x591d4)

    DEDUP_TOKEN: ___interceptor_realloc.part.0--xrealloc--strbuf_grow--strbuf_getcwd--unix_sockaddr_init--unix_stream_connect--send_request--do_cache--cmd_credential_cache--run_builtin--handle_builtin--run_argv--cmd_main--main--__libc_start_call_main--__libc_start_main@GLIBC_2.2.5--_start
    SUMMARY: LeakSanitizer: 129 byte(s) leaked in 1 allocation(s).

Fix this leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitgitgadget bot pushed a commit that referenced this pull request Jan 30, 2025
We don't free the result of `remote_default_branch()`, leading to a
memory leak. This leak is exposed by t9211, but only when run with Meson
with the `-Db_sanitize=leak` option:

    Direct leak of 5 byte(s) in 1 object(s) allocated from:
        #0 0x5555555cfb93 in malloc (scalar+0x7bb93)
        #1 0x5555556b05c2 in do_xmalloc ../wrapper.c:55:8
        #2 0x5555556b06c4 in do_xmallocz ../wrapper.c:89:8
        #3 0x5555556b0656 in xmallocz ../wrapper.c:97:9
        #4 0x5555556b0728 in xmemdupz ../wrapper.c:113:16
        #5 0x5555556b07a7 in xstrndup ../wrapper.c:119:9
        #6 0x5555555d3a4b in remote_default_branch ../scalar.c:338:14
        #7 0x5555555d20e6 in cmd_clone ../scalar.c:493:28
        #8 0x5555555d196b in cmd_main ../scalar.c:992:14
        #9 0x5555557c4059 in main ../common-main.c:64:11
        #10 0x7ffff7a2a1fb in __libc_start_call_main (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a1fb) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0)
        #11 0x7ffff7a2a2b8 in __libc_start_main@GLIBC_2.2.5 (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a2b8) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0)
        #12 0x555555592054 in _start (scalar+0x3e054)

    DEDUP_TOKEN: __interceptor_malloc--do_xmalloc--do_xmallocz--xmallocz--xmemdupz--xstrndup--remote_default_branch--cmd_clone--cmd_main--main--__libc_start_call_main--__libc_start_main@GLIBC_2.2.5--_start
    SUMMARY: LeakSanitizer: 5 byte(s) leaked in 1 allocation(s).

As the `branch` variable may contain a string constant obtained from
parsing command line arguments we cannot free the leaking variable
directly. Instead, introduce a new `branch_to_free` variable that only
ever gets assigned the allocated string and free that one to plug the
leak.

It is unclear why the leak isn't flagged when running the test via our
Makefile.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitgitgadget bot pushed a commit that referenced this pull request Jan 31, 2025
gitgitgadget bot pushed a commit that referenced this pull request Feb 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant