Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement batch mode on the core.fsyncmethod infrastructure #1

Closed
wants to merge 0 commits into from

Conversation

neerajsi-msft
Copy link
Owner

Detailed description TBD.

@neerajsi-msft neerajsi-msft force-pushed the ns/core-fsync branch 3 times, most recently from afaaf12 to 2207950 Compare December 8, 2021 00:15
@neerajsi-msft neerajsi-msft force-pushed the ns/core-fsync branch 2 times, most recently from 7f2e1ce to 9d21e3f Compare January 18, 2022 21:55
neerajsi-msft pushed a commit that referenced this pull request Mar 9, 2022
Add "fast_unwind_on_malloc=0" to LSAN_OPTIONS to get more meaningful
stack traces from LSAN. This isn't required under ASAN which will emit
traces such as this one for a leak in "t/t0006-date.sh":

    $ ASAN_OPTIONS=detect_leaks=1 ./t0006-date.sh -vixd
    [...]
    Direct leak of 3 byte(s) in 1 object(s) allocated from:
        #0 0x488b94 in strdup (t/helper/test-tool+0x488b94)
        #1 0x9444a4 in xstrdup wrapper.c:29:14
        microsoft#2 0x5995fa in parse_date_format date.c:991:24
        microsoft#3 0x4d2056 in show_dates t/helper/test-date.c:39:2
        microsoft#4 0x4d174a in cmd__date t/helper/test-date.c:116:3
        microsoft#5 0x4cce89 in cmd_main t/helper/test-tool.c:127:11
        microsoft#6 0x4cd1e3 in main common-main.c:52:11
        microsoft#7 0x7fef3c695e49 in __libc_start_main csu/../csu/libc-start.c:314:16
        microsoft#8 0x422b09 in _start (t/helper/test-tool+0x422b09)

    SUMMARY: AddressSanitizer: 3 byte(s) leaked in 1 allocation(s).
    Aborted

Whereas LSAN would emit this instead:

    $ ./t0006-date.sh -vixd
    [...]
    Direct leak of 3 byte(s) in 1 object(s) allocated from:
        #0 0x4323b8 in malloc (t/helper/test-tool+0x4323b8)
        #1 0x7f2be1d614aa in strdup string/strdup.c:42:15

    SUMMARY: LeakSanitizer: 3 byte(s) leaked in 1 allocation(s).
    Aborted

Now we'll instead git this sensible stack trace under
LSAN. I.e. almost the same one (but starting with "malloc", as is
usual for LSAN) as under ASAN:

    Direct leak of 3 byte(s) in 1 object(s) allocated from:
        #0 0x4323b8 in malloc (t/helper/test-tool+0x4323b8)
        #1 0x7f012af5c4aa in strdup string/strdup.c:42:15
        microsoft#2 0x5cb164 in xstrdup wrapper.c:29:14
        microsoft#3 0x495ee9 in parse_date_format date.c:991:24
        microsoft#4 0x453aac in show_dates t/helper/test-date.c:39:2
        microsoft#5 0x453782 in cmd__date t/helper/test-date.c:116:3
        microsoft#6 0x451d95 in cmd_main t/helper/test-tool.c:127:11
        microsoft#7 0x451f1e in main common-main.c:52:11
        microsoft#8 0x7f012aef5e49 in __libc_start_main csu/../csu/libc-start.c:314:16
        microsoft#9 0x42e0a9 in _start (t/helper/test-tool+0x42e0a9)

    SUMMARY: LeakSanitizer: 3 byte(s) leaked in 1 allocation(s).
    Aborted

As the option name suggests this does make things slower, e.g. for
t0001-init.sh we're around 10% slower:

    $ hyperfine -L v 0,1 'LSAN_OPTIONS=fast_unwind_on_malloc={v} make T=t0001-init.sh' -r 3
    Benchmark 1: LSAN_OPTIONS=fast_unwind_on_malloc=0 make T=t0001-init.sh
      Time (mean ± σ):      2.135 s ±  0.015 s    [User: 1.951 s, System: 0.554 s]
      Range (min … max):    2.122 s …  2.152 s    3 runs

    Benchmark 2: LSAN_OPTIONS=fast_unwind_on_malloc=1 make T=t0001-init.sh
      Time (mean ± σ):      1.981 s ±  0.055 s    [User: 1.769 s, System: 0.488 s]
      Range (min … max):    1.941 s …  2.044 s    3 runs

    Summary
      'LSAN_OPTIONS=fast_unwind_on_malloc=1 make T=t0001-init.sh' ran
        1.08 ± 0.03 times faster than 'LSAN_OPTIONS=fast_unwind_on_malloc=0 make T=t0001-init.sh'

I think that's more than worth it to get the more meaningful stack
traces, we can always provide LSAN_OPTIONS=fast_unwind_on_malloc=0 for
one-off "fast" runs.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
neerajsi-msft pushed a commit that referenced this pull request Mar 20, 2022
Add a failing test which demonstrates a regression in
a18d66c ("diff.c: free "buf" in diff_words_flush()", 2022-03-04),
the regression is discussed in detail in the subsequent commit. With
it running `git show --word-diff --color-moved` with SANITIZE=address
would emit:

	==31191==ERROR: AddressSanitizer: attempting double-free on 0x617000021100 in thread T0:
	    #0 0x49f0a2 in free (git+0x49f0a2)
	    #1 0x9b0e4d in diff_words_flush diff.c:2153:3
	    microsoft#2 0x9aed5d in fn_out_consume diff.c:2354:3
	    microsoft#3 0xe092ab in consume_one xdiff-interface.c:43:9
	    microsoft#4 0xe072eb in xdiff_outf xdiff-interface.c:76:10
	    microsoft#5 0xec7014 in xdl_emit_diffrec xdiff/xutils.c:53:6
	    [...]

	0x617000021100 is located 0 bytes inside of 768-byte region [0x617000021100,0x617000021400)
	freed by thread T0 here:
	    #0 0x49f0a2 in free (git+0x49f0a2)
	    [...(same stacktrace)...]

	previously allocated by thread T0 here:
	    #0 0x49f603 in __interceptor_realloc (git+0x49f603)
	    #1 0xde4da4 in xrealloc wrapper.c:126:8
	    microsoft#2 0x995dc5 in append_emitted_diff_symbol diff.c:794:2
	    microsoft#3 0x96c44a in emit_diff_symbol diff.c:1527:3
	    [...]

This was not caught by the test suite because we test `diff
--word-diff --color-moved` only so far.

Therefore, add a test for `show`, too.

Signed-off-by: Michael J Gruber <git@grubix.eu>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
neerajsi-msft pushed a commit that referenced this pull request Mar 29, 2022
In the preceding [1] (pack-objects: move revs out of
get_object_list(), 2022-03-22) the "repo_init_revisions()" was moved
to cmd_pack_objects() so that it unconditionally took place for all
invocations of "git pack-objects".

We'd thus start leaking memory, which is easily reproduced in
e.g. git.git by feeding e83c516 (Initial revision of "git", the
information manager from hell, 2005-04-07) to "git pack-objects";

    $ echo e83c516 | ./git pack-objects initial
    [...]
	==19130==ERROR: LeakSanitizer: detected memory leaks

	Direct leak of 7120 byte(s) in 1 object(s) allocated from:
	    #0 0x455308 in __interceptor_malloc (/home/avar/g/git/git+0x455308)
	    #1 0x75b399 in do_xmalloc /home/avar/g/git/wrapper.c:41:8
	    microsoft#2 0x75b356 in xmalloc /home/avar/g/git/wrapper.c:62:9
	    microsoft#3 0x5d7609 in prep_parse_options /home/avar/g/git/diff.c:5647:2
	    microsoft#4 0x5d415a in repo_diff_setup /home/avar/g/git/diff.c:4621:2
	    microsoft#5 0x6dffbb in repo_init_revisions /home/avar/g/git/revision.c:1853:2
	    microsoft#6 0x4f599d in cmd_pack_objects /home/avar/g/git/builtin/pack-objects.c:3980:2
	    microsoft#7 0x4592ca in run_builtin /home/avar/g/git/git.c:465:11
	    microsoft#8 0x457d81 in handle_builtin /home/avar/g/git/git.c:718:3
	    microsoft#9 0x458ca5 in run_argv /home/avar/g/git/git.c:785:4
	    microsoft#10 0x457b40 in cmd_main /home/avar/g/git/git.c:916:19
	    microsoft#11 0x562259 in main /home/avar/g/git/common-main.c:56:11
	    microsoft#12 0x7fce792ac7ec in __libc_start_main csu/../csu/libc-start.c:332:16
	    microsoft#13 0x4300f9 in _start (/home/avar/g/git/git+0x4300f9)

	SUMMARY: LeakSanitizer: 7120 byte(s) leaked in 1 allocation(s).
	Aborted

Narrowly fixing that commit would have been easy, just add call
repo_init_revisions() right before get_object_list(), which is
effectively what was done before that commit.

But an unstated constraint when setting it up early is that it was
needed for the subsequent [2] (pack-objects: parse --filter directly
into revs.filter, 2022-03-22), i.e. we might have a --filter
command-line option, and need to either have the "struct rev_info"
setup when we encounter that option, or later.

Let's just change the control flow so that we'll instead set up the
"struct rev_info" only when we need it. Doing so leads to a bit more
verbosity, but it's a lot clearer what we're doing and why.

An earlier version of this commit[3] went behind
opt_parse_list_objects_filter()'s back by faking up a "struct option"
before calling it. Let's avoid that and instead create a blessed API
for this pattern.

We could furthermore combine the two get_object_list() invocations
here by having repo_init_revisions() invoked on &pfd.revs, but I think
clearly separating the two makes the flow clearer. Likewise
redundantly but explicitly (i.e. redundant v.s. a "{ 0 }") "0" to
"have_revs" early in cmd_pack_objects().

While we're at it add parentheses around the arguments to the OPT_*
macros in in list-objects-filter-options.h, as we need to change those
lines anyway. It doesn't matter in this case, but is good general
practice.

1. https://lore.kernel.org/git/619b757d98465dbc4995bdc11a5282fbfcbd3daa.1647970119.git.gitgitgadget@gmail.com
2. https://lore.kernel.org/git/97de926904988b89b5663bd4c59c011a1723a8f5.1647970119.git.gitgitgadget@gmail.com
3. https://lore.kernel.org/git/patch-1.1-193534b0f07-20220325T121715Z-avarab@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant