Balc Stubs in gold #1

Closed

Conversation

djtodoro
Collaborator

@djtodoro djtodoro commented Aug 9, 2023

The patch on the LLVM side: MediaTek-Labs/llvm-project#25

@djtodoro
Collaborator Author

djtodoro commented Aug 9, 2023

Enabled by default; to disable, use
  --no-relax-balc-trampolines
@djtodoro djtodoro force-pushed the mips/umips/gold_v7 branch from 770c6a5 to 2ef4ca2 Compare August 9, 2023 10:18
@djtodoro djtodoro changed the base branch from mips/umips/gold_v7 to mtk/gold_v7 August 10, 2023 08:36
@djtodoro djtodoro closed this Aug 10, 2023
farazs-github pushed a commit that referenced this pull request Apr 11, 2024
When -fsanitize=address,undefined is used to build, the mmap configure
check failed with

=================================================================
==231796==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 4096 byte(s) in 1 object(s) allocated from:
    #0 0x7cdd3d0defdf in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:69
    #1 0x5750c7f6d72b in main /home/alan/build/gas-san/all/bfd/conftest.c:239

Direct leak of 4096 byte(s) in 1 object(s) allocated from:
    #0 0x7cdd3d0defdf in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:69
    #1 0x5750c7f6d2e1 in main /home/alan/build/gas-san/all/bfd/conftest.c:190

SUMMARY: AddressSanitizer: 8192 byte(s) leaked in 2 allocation(s).

Define GCC_AC_FUNC_MMAP with export ASAN_OPTIONS=detect_leaks=0 to avoid
the sanitizer configure check failure.

config/

	* mmap.m4 (GCC_AC_FUNC_MMAP): New.
	* no-executables.m4 (AC_FUNC_MMAP): Renamed to GCC_AC_FUNC_MMAP.
	Change AC_FUNC_MMAP to GCC_AC_FUNC_MMAP.

libiberty/

	* Makefile.in (aclocal_deps): Add $(srcdir)/../config/mmap.m4.
	* acinclude.m4: Change AC_FUNC_MMAP to GCC_AC_FUNC_MMAP.
	* aclocal.m4: Regenerated.
	* configure: Likewise.

zlib/

	* acinclude.m4: Include ../config/mmap.m4.
	* Makefile.in: Regenerated.
	* configure: Likewise.
farazs-github pushed a commit that referenced this pull request Apr 11, 2024
When -fsanitize=address,undefined is used to build, the mmap configure
check failed with

=================================================================
==231796==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 4096 byte(s) in 1 object(s) allocated from:
    #0 0x7cdd3d0defdf in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:69
    #1 0x5750c7f6d72b in main /home/alan/build/gas-san/all/bfd/conftest.c:239

Direct leak of 4096 byte(s) in 1 object(s) allocated from:
    #0 0x7cdd3d0defdf in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:69
    #1 0x5750c7f6d2e1 in main /home/alan/build/gas-san/all/bfd/conftest.c:190

SUMMARY: AddressSanitizer: 8192 byte(s) leaked in 2 allocation(s).

Replace AC_FUNC_MMAP with GCC_AC_FUNC_MMAP to avoid the sanitizer
configure check failure.

bfd/

	* configure.ac: Replace AC_FUNC_MMAP with GCC_AC_FUNC_MMAP.
	* Makefile.in: Regenerated.
	* aclocal.m4: Likewise.
	* configure: Likewise.

binutils/

	* configure.ac: Replace AC_FUNC_MMAP with GCC_AC_FUNC_MMAP.
	* Makefile.in: Regenerated.
	* aclocal.m4: Likewise.
	* configure: Likewise.

ld/

	* configure.ac: Replace AC_FUNC_MMAP with GCC_AC_FUNC_MMAP.
	* Makefile.in: Regenerated.
	* aclocal.m4: Likewise.
	* configure: Likewise.

libctf/

	* configure.ac: Replace AC_FUNC_MMAP with GCC_AC_FUNC_MMAP.
	* Makefile.in: Regenerated.
	* aclocal.m4: Likewise.
	* configure: Likewise.

libsframe/

	* configure.ac: Replace AC_FUNC_MMAP with GCC_AC_FUNC_MMAP.
	* Makefile.in: Regenerated.
	* aclocal.m4: Likewise.
	* configure: Likewise.
farazs-github pushed a commit that referenced this pull request Apr 25, 2024
After installing glibc debuginfo, I ran into:
...
FAIL: gdb.threads/threadcrash.exp: test_live_inferior: \
  $thread_count == [llength $test_list]
...

This happens because the clause:
...
	-re "^\r\n${hs}main$hs$eol" {
...
which is intended to match only:
...
 #1  <hex> in main () at threadcrash.c:423^M
...
also matches "remaining" in:
...
 #1  <hex> in __GI___nanosleep (requested_time=<hex>, remaining=<hex>) at \
   nanosleep.c:27^M
...

Fix this by checking for "in main" instead.
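
The over-match is easy to reproduce outside the testsuite. The sketch below is illustrative only: `frame_matches` is a hypothetical helper, and the bare patterns "main" and "in main" stand in for the expect clause, whose `hs` and `eol` definitions are not shown in this message.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of the GDB testsuite): returns true if
// `pattern` matches anywhere in `line`, as an unanchored regexp would.
bool frame_matches(const std::string &line, const std::string &pattern)
{
  return std::regex_search(line, std::regex(pattern));
}

// Simplified frame lines modelled on the two cases quoted above.
const std::string main_frame =
  "#1  0x1234 in main () at threadcrash.c:423";
const std::string nanosleep_frame =
  "#1  0x1234 in __GI___nanosleep (requested_time=0x1, remaining=0x2) "
  "at nanosleep.c:27";
```

With these inputs, the pattern "main" matches both lines (it is found inside "remaining"), while "in main" matches only the frame for main.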

Tested on x86_64-linux.
farazs-github pushed a commit that referenced this pull request Apr 28, 2024
When running test-case gdb.server/connect-with-no-symbol-file.exp on
aarch64-linux (specifically, an opensuse leap 15.5 container on a
fedora asahi 39 system), I run into:
...
(gdb) detach^M
Detaching from program: target:connect-with-no-symbol-file, process 185104^M
Ending remote debugging.^M
terminate called after throwing an instance of 'gdb_exception_error'^M
...

The detailed backtrace of the corefile is:
...
 (gdb) bt
 #0  0x0000ffff75504f54 in raise () from /lib64/libpthread.so.0
 #1  0x00000000007a86b4 in handle_fatal_signal (sig=6)
     at gdb/event-top.c:926
 #2  <signal handler called>
 #3  0x0000ffff74b977b4 in raise () from /lib64/libc.so.6
 #4  0x0000ffff74b98c18 in abort () from /lib64/libc.so.6
 #5  0x0000ffff74ea26f4 in __gnu_cxx::__verbose_terminate_handler() ()
    from /usr/lib64/libstdc++.so.6
 #6  0x0000ffff74ea011c in ?? () from /usr/lib64/libstdc++.so.6
 #7  0x0000ffff74ea0180 in std::terminate() () from /usr/lib64/libstdc++.so.6
 #8  0x0000ffff74ea0464 in __cxa_throw () from /usr/lib64/libstdc++.so.6
 #9  0x0000000001548870 in throw_it (reason=RETURN_ERROR,
     error=TARGET_CLOSE_ERROR, fmt=0x16c7810 "Remote connection closed", ap=...)
     at gdbsupport/common-exceptions.cc:203
 #10 0x0000000001548920 in throw_verror (error=TARGET_CLOSE_ERROR,
     fmt=0x16c7810 "Remote connection closed", ap=...)
     at gdbsupport/common-exceptions.cc:211
 #11 0x0000000001548a00 in throw_error (error=TARGET_CLOSE_ERROR,
     fmt=0x16c7810 "Remote connection closed")
     at gdbsupport/common-exceptions.cc:226
 #12 0x0000000000ac8f2c in remote_target::readchar (this=0x233d3d90, timeout=2)
     at gdb/remote.c:9856
 #13 0x0000000000ac9f04 in remote_target::getpkt (this=0x233d3d90,
     buf=0x233d40a8, forever=false, is_notif=0x0) at gdb/remote.c:10326
 #14 0x0000000000acf3d0 in remote_target::remote_hostio_send_command
     (this=0x233d3d90, command_bytes=13, which_packet=17,
     remote_errno=0xfffff1a3cf38, attachment=0xfffff1a3ce88,
     attachment_len=0xfffff1a3ce90) at gdb/remote.c:12567
 #15 0x0000000000ad03bc in remote_target::fileio_fstat (this=0x233d3d90, fd=3,
     st=0xfffff1a3d020, remote_errno=0xfffff1a3cf38)
     at gdb/remote.c:12979
 #16 0x0000000000c39878 in target_fileio_fstat (fd=0, sb=0xfffff1a3d020,
     target_errno=0xfffff1a3cf38) at gdb/target.c:3315
 #17 0x00000000007eee5c in target_fileio_stream::stat (this=0x233d4400,
     abfd=0x2323fc40, sb=0xfffff1a3d020) at gdb/gdb_bfd.c:467
 #18 0x00000000007f012c in <lambda(bfd*, void*, stat*)>::operator()(bfd *,
     void *, stat *) const (__closure=0x0, abfd=0x2323fc40, stream=0x233d4400,
     sb=0xfffff1a3d020) at gdb/gdb_bfd.c:955
 #19 0x00000000007f015c in <lambda(bfd*, void*, stat*)>::_FUN(bfd *, void *,
     stat *) () at gdb/gdb_bfd.c:956
 #20 0x0000000000f9b838 in opncls_bstat (abfd=0x2323fc40, sb=0xfffff1a3d020)
     at bfd/opncls.c:665
 #21 0x0000000000f90adc in bfd_stat (abfd=0x2323fc40, statbuf=0xfffff1a3d020)
     at bfd/bfdio.c:431
 #22 0x000000000065fe20 in reopen_exec_file () at gdb/corefile.c:52
 #23 0x0000000000c3a3e8 in generic_mourn_inferior ()
     at gdb/target.c:3642
 #24 0x0000000000abf3f0 in remote_unpush_target (target=0x233d3d90)
     at gdb/remote.c:6067
 #25 0x0000000000aca8b0 in remote_target::mourn_inferior (this=0x233d3d90)
     at gdb/remote.c:10587
 #26 0x0000000000c387cc in target_mourn_inferior (
     ptid=<error reading variable: Cannot access memory at address 0x2d310>)
     at gdb/target.c:2738
 #27 0x0000000000abfff0 in remote_target::remote_detach_1 (this=0x233d3d90,
     inf=0x22fce540, from_tty=1) at gdb/remote.c:6421
 #28 0x0000000000ac0094 in remote_target::detach (this=0x233d3d90,
     inf=0x22fce540, from_tty=1) at gdb/remote.c:6436
 #29 0x0000000000c37c3c in target_detach (inf=0x22fce540, from_tty=1)
     at gdb/target.c:2526
 #30 0x0000000000860424 in detach_command (args=0x0, from_tty=1)
    at gdb/infcmd.c:2817
 #31 0x000000000060b594 in do_simple_func (args=0x0, from_tty=1, c=0x231431a0)
     at gdb/cli/cli-decode.c:94
 #32 0x00000000006108c8 in cmd_func (cmd=0x231431a0, args=0x0, from_tty=1)
     at gdb/cli/cli-decode.c:2741
 #33 0x0000000000c65a94 in execute_command (p=0x232e52f6 "", from_tty=1)
     at gdb/top.c:570
 #34 0x00000000007a7d2c in command_handler (command=0x232e52f0 "")
     at gdb/event-top.c:566
 #35 0x00000000007a8290 in command_line_handler (rl=...)
     at gdb/event-top.c:802
 #36 0x0000000000c9092c in tui_command_line_handler (rl=...)
     at gdb/tui/tui-interp.c:103
 #37 0x00000000007a750c in gdb_rl_callback_handler (rl=0x23385330 "detach")
     at gdb/event-top.c:258
 #38 0x0000000000d910f4 in rl_callback_read_char ()
     at readline/readline/callback.c:290
 #39 0x00000000007a7338 in gdb_rl_callback_read_char_wrapper_noexcept ()
     at gdb/event-top.c:194
 #40 0x00000000007a73f0 in gdb_rl_callback_read_char_wrapper
     (client_data=0x22fbf640) at gdb/event-top.c:233
 #41 0x0000000000cbee1c in stdin_event_handler (error=0, client_data=0x22fbf640)
     at gdb/ui.c:154
 #42 0x000000000154ed60 in handle_file_event (file_ptr=0x232be730, ready_mask=1)
     at gdbsupport/event-loop.cc:572
 #43 0x000000000154f21c in gdb_wait_for_event (block=1)
     at gdbsupport/event-loop.cc:693
 #44 0x000000000154dec4 in gdb_do_one_event (mstimeout=-1)
    at gdbsupport/event-loop.cc:263
 #45 0x0000000000910f98 in start_event_loop () at gdb/main.c:400
 #46 0x0000000000911130 in captured_command_loop () at gdb/main.c:464
 #47 0x0000000000912b5c in captured_main (data=0xfffff1a3db58)
     at gdb/main.c:1338
 #48 0x0000000000912bf4 in gdb_main (args=0xfffff1a3db58)
     at gdb/main.c:1357
 #49 0x00000000004170f4 in main (argc=10, argv=0xfffff1a3dcc8)
     at gdb/gdb.c:38
 (gdb)
...

The abort happens because a C++ exception escapes to C code, specifically
opncls_bstat in bfd/opncls.c.  Compiling with -fexceptions works around this.

Fix this by catching the exception just before it escapes, in stat_trampoline
and likewise in a few similar spots.

Add a new template catch_exceptions to do so in a consistent way.
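
The general shape of such a wrapper can be sketched as follows. This is a minimal illustration, not GDB's actual catch_exceptions template: the names `guarded_call`, `stat_callback`, `c_style_caller` and `stat_trampoline_sketch` are all hypothetical, and returning -1 on failure is an assumed convention.

```cpp
#include <stdexcept>

// Hypothetical wrapper: calls fn() and returns its result.  If fn throws,
// the exception is swallowed here and error_value is returned instead, so
// nothing propagates into a caller that cannot unwind C++ exceptions.
template <typename R, typename Fn>
R guarded_call(R error_value, Fn &&fn) noexcept
{
  try
    {
      return fn();
    }
  catch (...)
    {
      return error_value;
    }
}

// Callback signature of the kind a C library would accept.
using stat_callback = int (*)(void *stream);

// Stand-in for the C code that invokes the callback.
int c_style_caller(stat_callback cb, void *stream)
{
  return cb(stream);
}

// A callback whose body may throw; the wrapper keeps the exception inside.
int stat_trampoline_sketch(void *stream)
{
  return guarded_call<int>(-1, [&]() -> int {
    if (stream == nullptr)
      throw std::runtime_error("Remote connection closed");
    return 0;
  });
}
```

Putting the try/catch in one shared template keeps every trampoline's error path identical, which is presumably the consistency the commit is after.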

Tested on aarch64-linux.

Approved-by: Pedro Alves <pedro@palves.net>

PR remote/31577
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31577
farazs-github pushed a commit that referenced this pull request May 5, 2024
If threads are disabled, either by --disable-threading explicitly, or by
missing std::thread support, you get the following ASAN error when
loading symbols:

==7310==ERROR: AddressSanitizer: heap-use-after-free on address 0x614000002128 at pc 0x00000098794a bp 0x7ffe37e6af70 sp 0x7ffe37e6af68
READ of size 1 at 0x614000002128 thread T0
    #0 0x987949 in index_cache_store_context::store() const ../../gdb/dwarf2/index-cache.c:163
    #1 0x943467 in cooked_index_worker::write_to_cache(cooked_index const*, deferred_warnings*) const ../../gdb/dwarf2/cooked-index.c:601
    #2 0x1705e39 in std::function<void ()>::operator()() const /gcc/9/include/c++/9.2.0/bits/std_function.h:690
    #3 0x1705e39 in gdb::task_group::impl::~impl() ../../gdbsupport/task-group.cc:38

0x614000002128 is located 232 bytes inside of 408-byte region [0x614000002040,0x6140000021d8)
freed by thread T0 here:
    #0 0x7fd75ccf8ea5 in operator delete(void*, unsigned long) ../../.././libsanitizer/asan/asan_new_delete.cc:177
    #1 0x9462e5 in cooked_index::index_for_writing() ../../gdb/dwarf2/cooked-index.h:689
    #2 0x9462e5 in operator() ../../gdb/dwarf2/cooked-index.c:657
    #3 0x9462e5 in _M_invoke /gcc/9/include/c++/9.2.0/bits/std_function.h:300

It's happening because cooked_index_worker::wait always returns true in
this case, which tells cooked_index::wait it can delete the m_state
cooked_index_worker member, but cooked_index_worker::write_to_cache tries
to access it immediately afterwards.

Fixed by making cooked_index_worker::wait only return true if desired_state
is CACHE_DONE, same as if threading was enabled, so m_state will not be
prematurely deleted.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31694
Approved-By: Tom Tromey <tom@tromey.com>
farazs-github pushed a commit that referenced this pull request Jun 26, 2024
…ames

If a DIE contains a linkage name that cannot be demangled and a source
language name (DW_AT_name) exists, then we want to display that name
instead of the non-demanglable linkage name.

dwarf2_physname returns the linkage name when the linkage name cannot be
demangled.  Before this patch, we always set the returned physname as the
demangled name.  This patch instead compares physname with the linkage
name: if they are equal and DW_AT_name exists, DW_AT_name is set as the
demangled name; otherwise the returned physname is used, as before.
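
The selection rule described above can be sketched as a small function. This is an illustration under assumed names, not the code in GDB's DWARF reader: `choose_display_name` and its parameters are hypothetical, and a null `dw_at_name` stands for a DIE without DW_AT_name.

```cpp
#include <string>

// Hypothetical helper: picks the name to display for a DIE.
//   physname     - what the physname lookup returned
//   linkage_name - the raw linkage name from the DIE
//   dw_at_name   - the DW_AT_name value, or nullptr if the DIE has none
std::string choose_display_name(const std::string &physname,
                                const std::string &linkage_name,
                                const char *dw_at_name)
{
  // physname equal to the linkage name means demangling failed; prefer
  // the source-level name when one is available.
  if (physname == linkage_name && dw_at_name != nullptr)
    return dw_at_name;

  // Otherwise keep the old behaviour.
  return physname;
}
```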

For the reproducer, using the test source file added in this change:
"gdb/testsuite/gdb.dwarf2/dw2-wrong-mangled-name.c"

Here is an example of the DWARF where a wrong linkage name is emitted by the
compiler for the "func_demangled_test" function:

subprogram {
    {MACRO_AT_range {func_demangled_test}}
    {linkage_name "_FUNC_WRONG_MANGLED__"}
    {name "func_demangled_test"}
    {external 1 flag}
}
subprogram {
    {MACRO_AT_range {main}}
    {external 1 flag}
    {name main}
    {main_subprogram 1 flag}
}

Before this change, for a function whose DIE has both DW_AT_name and
DW_AT_linkage_name but with a wrong linkage name, the backtrace
command shows the following:

(gdb) b func_demangled_test
(gdb) r
Breakpoint 1, 0x0000555555555131 in _FUNC_WRONG_MANGLED__ ()
(gdb) backtrace
\#0  0x0000555555555131 in  _FUNC_WRONG_MANGLED__ ()
\#1  0x000055555555514a in main ()

After the change, GDB shows the name given by DW_AT_name:

(gdb) b func_demangled_test
(gdb) r
Breakpoint 1, 0x0000555555555131 in func_demangled_test ()
(gdb) backtrace
\#0  0x0000555555555131 in func_demangled_test ()
\#1  0x000055555555514a in main ()

A new test is added to verify this change.

Approved-By: Tom Tromey <tom@tromey.com>
farazs-github pushed a commit that referenced this pull request Jul 19, 2024
Similar to the x86_64 testcases, some .s files contain the corresponding
CFI directives.  This helps in validating the synthesized CFI by running
those tests with and without the --scfi=experimental command line
option.

GAS issues some diagnostics, enabled by default, with
--scfi=experimental.  The diagnostics have been added with an intent to
help users correct inadvertent errors in their hand-written asm.  An
error is issued when GAS finds that input asm is not amenable to
accurate CFI synthesis.  The existing scfi-diag-*.s tests in the
gas/testsuite/gas/scfi/x86_64 directory test some SCFI diagnostics
already:

      - (#1) "Warning: SCFI: Asymetrical register restore"
      - (#2) "Error: SCFI: usage of REG_FP as scratch not supported"
      - (#3) "Error: SCFI: unsupported stack manipulation pattern"
      - (#4) "Error: untraceable control flow for func 'XXX'"

In the newly added aarch64 testsuite, further tests for additional
diagnostics have been added:
 - scfi-diag-1.s in this patch highlights an aarch64-specific diagnostic:
   (#5) "Warning: SCFI: ignored probable save/restore op with reg offset"

Additionally, some testcases are added to showcase the (currently)
unsupported patterns, e.g., scfi-unsupported-1.s
        mov     x16, 4384
        sub     sp, sp, x16

gas/testsuite/:
	* gas/scfi/README: Update comment to include aarch64.
	* gas/scfi/aarch64/scfi-aarch64.exp: New file.
	* gas/scfi/aarch64/ginsn-arith-1.l: New test.
	* gas/scfi/aarch64/ginsn-arith-1.s: New test.
	* gas/scfi/aarch64/ginsn-cofi-1.l: New test.
	* gas/scfi/aarch64/ginsn-cofi-1.s: New test.
	* gas/scfi/aarch64/ginsn-ldst-1.l: New test.
	* gas/scfi/aarch64/ginsn-ldst-1.s: New test.
	* gas/scfi/aarch64/scfi-callee-saved-fp-1.d: New test.
	* gas/scfi/aarch64/scfi-callee-saved-fp-1.l: New test.
	* gas/scfi/aarch64/scfi-callee-saved-fp-1.s: New test.
	* gas/scfi/aarch64/scfi-callee-saved-fp-2.d: New test.
	* gas/scfi/aarch64/scfi-callee-saved-fp-2.l: New test.
	* gas/scfi/aarch64/scfi-callee-saved-fp-2.s: New test.
	* gas/scfi/aarch64/scfi-cb-1.d: New test.
	* gas/scfi/aarch64/scfi-cb-1.l: New test.
	* gas/scfi/aarch64/scfi-cb-1.s: New test.
	* gas/scfi/aarch64/scfi-cfg-1.d: New test.
	* gas/scfi/aarch64/scfi-cfg-1.l: New test.
	* gas/scfi/aarch64/scfi-cfg-1.s: New test.
	* gas/scfi/aarch64/scfi-cfg-2.d: New test.
	* gas/scfi/aarch64/scfi-cfg-2.l: New test.
	* gas/scfi/aarch64/scfi-cfg-2.s: New test.
	* gas/scfi/aarch64/scfi-cfg-3.d: New test.
	* gas/scfi/aarch64/scfi-cfg-3.l: New test.
	* gas/scfi/aarch64/scfi-cfg-3.s: New test.
	* gas/scfi/aarch64/scfi-cfg-4.l: New test.
	* gas/scfi/aarch64/scfi-cfg-4.s: New test.
	* gas/scfi/aarch64/scfi-cond-br-1.d: New test.
	* gas/scfi/aarch64/scfi-cond-br-1.l: New test.
	* gas/scfi/aarch64/scfi-cond-br-1.s: New test.
	* gas/scfi/aarch64/scfi-diag-1.l: New test.
	* gas/scfi/aarch64/scfi-diag-1.s: New test.
	* gas/scfi/aarch64/scfi-diag-2.l: New test.
	* gas/scfi/aarch64/scfi-diag-2.s: New test.
	* gas/scfi/aarch64/scfi-diag-3.l: New test.
	* gas/scfi/aarch64/scfi-diag-3.s: New test.
	* gas/scfi/aarch64/scfi-ldrp-1.d: New test.
	* gas/scfi/aarch64/scfi-ldrp-1.l: New test.
	* gas/scfi/aarch64/scfi-ldrp-1.s: New test.
	* gas/scfi/aarch64/scfi-ldrp-2.d: New test.
	* gas/scfi/aarch64/scfi-ldrp-2.l: New test.
	* gas/scfi/aarch64/scfi-ldrp-2.s: New test.
	* gas/scfi/aarch64/scfi-ldstnap-1.d: New test.
	* gas/scfi/aarch64/scfi-ldstnap-1.l: New test.
	* gas/scfi/aarch64/scfi-ldstnap-1.s: New test.
	* gas/scfi/aarch64/scfi-strp-1.d: New test.
	* gas/scfi/aarch64/scfi-strp-1.l: New test.
	* gas/scfi/aarch64/scfi-strp-1.s: New test.
	* gas/scfi/aarch64/scfi-strp-2.d: New test.
	* gas/scfi/aarch64/scfi-strp-2.l: New test.
	* gas/scfi/aarch64/scfi-strp-2.s: New test.
	* gas/scfi/aarch64/scfi-unsupported-1.l: New test.
	* gas/scfi/aarch64/scfi-unsupported-1.s: New test.
	* gas/scfi/aarch64/scfi-unsupported-2.l: New test.
	* gas/scfi/aarch64/scfi-unsupported-2.s: New test.
farazs-github pushed a commit that referenced this pull request Jul 24, 2024
On arm-linux, I run into:
...
PASS: gdb.ada/mi_task_arg.exp: mi runto task_switch.break_me
Expecting: ^(-stack-list-arguments 1[^M
]+)?(\^done,stack-args=\[frame={level="0",args=\[\]},frame={level="1",args=\[{name="<_task>",value="0x[0-9A-Fa-f]+"}(,{name="<_taskL>",value="[0-9]+"})?\]},frame={level="2",args=\[({name="self_id",value="(0x[0-9A-Fa-f]+|<optimized out>)"})?\]},.*[^M
]+[(]gdb[)] ^M
[ ]*)
-stack-list-arguments 1^M
^done,stack-args=[frame={level="0",args=[]},frame={level="1",args=[{name="<_task>",value="0x40bc48"}]},frame={level="2",args=[]}]^M
(gdb) ^M
FAIL: gdb.ada/mi_task_arg.exp: -stack-list-arguments 1 (unexpected output)
...

The problem is that the test-case expects a level 3 frame, but there is none.

This can be reproduced using cli bt:
...
 $ gdb -q -batch outputs/gdb.ada/mi_task_arg/task_switch \
   -ex "b task_switch.break_me" \
   -ex run \
   -ex bt
 Breakpoint 1 at 0x34b4: file task_switch.adb, line 57.

 Thread 3 "my_caller" hit Breakpoint 1, task_switch.break_me () \
   at task_switch.adb:57
 57	      null;
 #0  task_switch.break_me () at task_switch.adb:57
 #1  0x00403424 in task_switch.caller (<_task>=0x40bc48) at task_switch.adb:51
 #2  0xf7f95a08 in ?? () from /lib/arm-linux-gnueabihf/libgnarl-12.so
 Backtrace stopped: previous frame identical to this frame (corrupt stack?)
...

The purpose of the test-case is printing the frame at level 1, so I don't
think we should bother about the presence of the frame at level 3.

Fix this by allowing the backtrace to stop at level 2.

Tested on arm-linux.

Approved-By: Luis Machado <luis.machado@arm.com>
Approved-By: Andrew Burgess <aburgess@redhat.com>
farazs-github pushed a commit that referenced this pull request Jul 31, 2024
Since commit b1da98a ("gdb: remove use of alloca in
new_macro_definition"), if cached_argv is empty, we call macro_bcache
with a nullptr data.  This ends up caught by UBSan deep down in the
bcache code:

    $ ./gdb -nx -q --data-directory=data-directory  /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.base/macscp/macscp -readnow
    Reading symbols from /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.base/macscp/macscp...
    Expanding full symbols from /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.base/macscp/macscp...
    /home/smarchi/src/binutils-gdb/gdb/bcache.c:195:12: runtime error: null pointer passed as argument 2, which is declared to never be null

The backtrace:

    #1  0x00007ffff619a05d in __ubsan::__ubsan_handle_nonnull_arg_abort (Data=<optimized out>) at ../../../../src/libsanitizer/ubsan/ubsan_handlers.cpp:750
    #2  0x000055556337fba2 in gdb::bcache::insert (this=0x62d0000c8458, addr=0x0, length=0, added=0x0) at /home/smarchi/src/binutils-gdb/gdb/bcache.c:195
    #3  0x0000555564b49222 in gdb::bcache::insert<char const*, void> (this=0x62d0000c8458, addr=0x0, length=0, added=0x0) at /home/smarchi/src/binutils-gdb/gdb/bcache.h:158
    #4  0x0000555564b481fa in macro_bcache<char const*> (t=0x62100007ae70, addr=0x0, len=0) at /home/smarchi/src/binutils-gdb/gdb/macrotab.c:117
    #5  0x0000555564b42b4a in new_macro_definition (t=0x62100007ae70, kind=macro_function_like, special_kind=macro_ordinary, argv=std::__debug::vector of length 0, capacity 0, replacement=0x62a00003af3a "__builtin_va_arg_pack ()") at /home/smarchi/src/binutils-gdb/gdb/macrotab.c:573
    #6  0x0000555564b44674 in macro_define_internal (source=0x6210000ab9e0, line=469, name=0x7fffffffa710 "__va_arg_pack", kind=macro_function_like, special_kind=macro_ordinary, argv=std::__debug::vector of length 0, capacity 0, replacement=0x62a00003af3a "__builtin_va_arg_pack ()") at /home/smarchi/src/binutils-gdb/gdb/macrotab.c:777
    #7  0x0000555564b44ae2 in macro_define_function (source=0x6210000ab9e0, line=469, name=0x7fffffffa710 "__va_arg_pack", argv=std::__debug::vector of length 0, capacity 0, replacement=0x62a00003af3a "__builtin_va_arg_pack ()") at /home/smarchi/src/binutils-gdb/gdb/macrotab.c:816
    #8  0x0000555563f62fc8 in parse_macro_definition (file=0x6210000ab9e0, line=469, body=0x62a00003af2a "__va_arg_pack() __builtin_va_arg_pack ()") at /home/smarchi/src/binutils-gdb/gdb/dwarf2/macro.c:203

This can be reproduced by running gdb.base/macscp.exp.  Avoid calling
macro_bcache if the macro doesn't have any arguments.
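
The same kind of guard can be shown in isolation. The sketch below assumes the sanitizer report comes from a memcpy-style call, whose source pointer is declared non-null even when the length is zero; `append_bytes` is a hypothetical helper and not part of the bcache code.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: appends `len` bytes starting at `src` to `out`.
// An empty input is handled before memcpy is reached, so a null `src`
// with `len == 0` never gets passed to it.
void append_bytes(std::vector<unsigned char> &out, const void *src,
                  std::size_t len)
{
  if (len == 0)
    return;

  const std::size_t old_size = out.size();
  out.resize(old_size + len);
  std::memcpy(out.data() + old_size, src, len);
}
```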

Change-Id: I33b5a7c3b3a93d5adba98983fcaae9c8522c383d
farazs-github pushed a commit that referenced this pull request Aug 2, 2024
Some flavors of indirect call and jmp instructions were not being
handled earlier, leading to a GAS error (#1):
  (#1) "Error: SCFI: unhandled op 0xff may cause incorrect CFI"

Not handling jmp/call (direct or indirect) ops is an error (as shown
above) because SCFI needs an accurate CFG to synthesize CFI correctly.
Recall that the presence of indirect jmp/call, however, does make the
CFG ineligible for SCFI. In other words, generating the ginsns for them
now, will eventually cause SCFI to bail out later with an error (#2)
anyway:
  (#2) "Error: untraceable control flow for func 'XXX'"

The first error (#1) gives the impression of missing functionality in
GAS.  So, it seems cleaner to synthesize a GINSN_TYPE_JUMP /
GINSN_TYPE_CALL now in the backend, and let SCFI machinery complain with
the error as expected.

The handling for these indirect jmp/call instructions is similar, so
reuse the code by carving out a function for the same.

Adjust the testcase to include the now handled jmp/call instructions as
well.

gas/
	* config/tc-i386-ginsn.c (x86_ginsn_indirect_branch): New
	function.
	(x86_ginsn_new): Refactor out functionality to above.

gas/testsuite/
	* gas/scfi/x86_64/ginsn-cofi-1.l: Adjust the output.
	* gas/scfi/x86_64/ginsn-cofi-1.s: Add further varieties of
	jmp/call opcodes.
farazs-github pushed a commit that referenced this pull request Sep 5, 2024
With test-case gdb.dwarf2/dw2-lines.exp on arm-linux, I run into:
...
(gdb) break bar_label^M
Breakpoint 2 at 0x4004f6: file dw2-lines.c, line 29.^M
(gdb) continue^M
Continuing.^M
^M
Breakpoint 2, bar () at dw2-lines.c:29^M
29        foo (2);^M
(gdb) PASS: $exp: cv=2: cdw=32: lv=2: ldw=32: continue to breakpoint: foo \(1\)
...

The pass is incorrect because the continue lands at line 29 with "foo (2)"
instead of line 27 with "foo (1)".

A minimal version is:
...
$ gdb -q -batch dw2-lines.cv-2-cdw-32-lv-2-ldw-32 -ex "b bar_label"
Breakpoint 1 at 0x4f6: file dw2-lines.c, line 29.
...
where:
...
000004ec <bar>:
 4ec:	b580      	push	{r7, lr}
 4ee:	af00      	add	r7, sp, #0

000004f0 <bar_label>:
 4f0:	2001      	movs	r0, #1
 4f2:	f7ff fff1 	bl	4d8 <foo>

000004f6 <bar_label_2>:
 4f6:	2002      	movs	r0, #2
 4f8:	f7ff ffee 	bl	4d8 <foo>
...

So, how does this happen?  In short:
- skip_prologue_sal calls arm_skip_prologue with pc == 0x4ec,
- thumb_analyze_prologue returns 0x4f2
  (overshooting by 1 insn, PR tdep/31981), and
- skip_prologue_sal decides that we're mid-line, and updates to 0x4f6.

However, this is a test-case about .debug_line info, so why didn't arm_skip_prologue
use the line info to skip the prologue?

The answer is that the line info starts at bar_label, not at bar.

Fixing that allows us to work around PR tdep/31981.

Likewise in gdb.dwarf2/dw2-line-number-zero.exp.

Instead, add a new test-case gdb.arch/skip-prologue.exp that is dedicated to
checking quality of architecture-specific prologue analysis, without being
written in an architecture-specific way.

It fails on arm-linux for both marm and mthumb:
...
FAIL: gdb.arch/skip-prologue.exp: f2: $bp_addr == $prologue_end_addr (skipped too much)
FAIL: gdb.arch/skip-prologue.exp: f4: $bp_addr == $prologue_end_addr (skipped too much)
...
and passes for:
- x86_64-linux for {m64,m32}x{-fno-PIE/-no-pie,-fPIE/-pie}
- aarch64-linux.

Tested on arm-linux.
farazs-github pushed a commit that referenced this pull request Sep 9, 2024
The commit:

  commit c6b4867
  Date:   Thu Mar 30 19:21:22 2023 +0100

      gdb: parse pending breakpoint thread/task immediately

Introduced a bug where the value of a temporary variable was being
used after it had gone out of scope.  This was picked up by the
address sanitizer and would result in this error:

  (gdb) maintenance selftest create_breakpoint_parse_arg_string
  Running selftest create_breakpoint_parse_arg_string.
  =================================================================
  ==2265825==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7fbb08046511 at pc 0x000001632230 bp 0x7fff7c2fb770 sp 0x7fff7c2fb768
  READ of size 1 at 0x7fbb08046511 thread T0
      #0 0x163222f in create_breakpoint_parse_arg_string(char const*, std::unique_ptr<char, gdb::xfree_deleter<char> >*, int*, int*, int*, std::unique_ptr<char, gdb::xfree_deleter<char> >*, bool*) ../../src/gdb/break-cond-parse.c:496
      #1 0x1633026 in test ../../src/gdb/break-cond-parse.c:582
      #2 0x163391b in create_breakpoint_parse_arg_string_tests ../../src/gdb/break-cond-parse.c:649
      #3 0x12cfebc in void std::__invoke_impl<void, void (*&)()>(std::__invoke_other, void (*&)()) /usr/include/c++/13/bits/invoke.h:61
      #4 0x12cc8ee in std::enable_if<is_invocable_r_v<void, void (*&)()>, void>::type std::__invoke_r<void, void (*&)()>(void (*&)()) /usr/include/c++/13/bits/invoke.h:111
      #5 0x12c81e5 in std::_Function_handler<void (), void (*)()>::_M_invoke(std::_Any_data const&) /usr/include/c++/13/bits/std_function.h:290
      #6 0x18bb51d in std::function<void ()>::operator()() const /usr/include/c++/13/bits/std_function.h:591
      #7 0x4193ef9 in selftests::run_tests(gdb::array_view<char const* const>, bool) ../../src/gdbsupport/selftest.cc:100
      #8 0x21c2206 in maintenance_selftest ../../src/gdb/maint.c:1172
      ... etc ...

The problem was caused by three lines like this one:

  thread_info *thr
    = parse_thread_id (std::string (t.get_value ()).c_str (), &tmptok);

After parsing the thread-id TMPTOK would be left pointing into the
temporary string which had been created on this line.  When on the
next line we did this:

  gdb_assert (*tmptok == '\0');

The value of *TMPTOK is undefined.

Fix this by creating the std::string earlier in the scope.  Now the
contents of the string will remain valid when we check *TMPTOK.  The
address sanitizer issue is now resolved.
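
The lifetime issue and its fix can be shown in a self-contained form. This is only a sketch: `parse_id` is a hypothetical stand-in for parse_thread_id built on strtol, and the token is passed as a pointer plus length to mimic a non-owning view that has to be copied before parsing.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical stand-in for a parser that reports where parsing stopped.
long parse_id(const char *text, const char **end)
{
  char *stop = nullptr;
  const long value = std::strtol(text, &stop, 10);
  *end = stop;
  return value;
}

// Returns true if the whole token [data, data + length) was consumed.
bool token_fully_parsed(const char *data, std::size_t length)
{
  // Broken pattern (shown as a comment only): the temporary string dies
  // at the end of the full expression, so tmptok dangles afterwards.
  //   parse_id(std::string(data, length).c_str(), &tmptok);
  //   ... *tmptok ...   // stack-use-after-scope

  // Fixed pattern: a named string outlives every use of tmptok.
  const std::string owned(data, length);
  const char *tmptok = nullptr;
  parse_id(owned.c_str(), &tmptok);
  return *tmptok == '\0';
}
```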
farazs-github pushed a commit that referenced this pull request Sep 13, 2024
The binary provided with bug 32165 [1] has 36139 ELF sections.  GDB
crashes on it with (note that my GDB is built with -D_GLIBCXX_DEBUG=1):

    $ ./gdb  -nx -q --data-directory=data-directory ./vmlinux
    Reading symbols from ./vmlinux...
    (No debugging symbols found in ./vmlinux)
    (gdb) info func
    /usr/include/c++/14.2.1/debug/vector:508:
    In function:
        std::debug::vector<_Tp, _Allocator>::reference std::debug::vector<_Tp,
        _Allocator>::operator[](size_type) [with _Tp = long unsigned int;
        _Allocator = std::allocator<long unsigned int>; reference = long
        unsigned int&; size_type = long unsigned int]

    Error: attempt to subscript container with out-of-bounds index -29445, but
    container only holds 36110 elements.

    Objects involved in the operation:
        sequence "this" @ 0x514000007340 {
          type = std::debug::vector<unsigned long, std::allocator<unsigned long> >;
        }

The crash occurs here:

    #3  0x00007ffff5e334c3 in __GI_abort () at abort.c:79
    #4  0x00007ffff689afc4 in __gnu_debug::_Error_formatter::_M_error (this=<optimized out>) at /usr/src/debug/gcc/gcc/libstdc++-v3/src/c++11/debug.cc:1320
    #5  0x0000555561119a16 in std::__debug::vector<unsigned long, std::allocator<unsigned long> >::operator[] (this=0x514000007340, __n=18446744073709522171)
        at /usr/include/c++/14.2.1/debug/vector:508
    #6  0x0000555562e288e8 in minimal_symbol::value_address (this=0x5190000bb698, objfile=0x514000007240) at /home/smarchi/src/binutils-gdb/gdb/symtab.c:517
    #7  0x0000555562e5a131 in global_symbol_searcher::expand_symtabs (this=0x7ffff0f5c340, objfile=0x514000007240, preg=std::optional [no contained value])
        at /home/smarchi/src/binutils-gdb/gdb/symtab.c:4983
    #8  0x0000555562e5d2ed in global_symbol_searcher::search (this=0x7ffff0f5c340) at /home/smarchi/src/binutils-gdb/gdb/symtab.c:5189
    #9  0x0000555562e5ffa4 in symtab_symbol_info (quiet=false, exclude_minsyms=false, regexp=0x0, kind=FUNCTION_DOMAIN, t_regexp=0x0, from_tty=1)
        at /home/smarchi/src/binutils-gdb/gdb/symtab.c:5361
    #10 0x0000555562e6131b in info_functions_command (args=0x0, from_tty=1) at /home/smarchi/src/binutils-gdb/gdb/symtab.c:5525

That is, at this line of `minimal_symbol::value_address`, where
`objfile->section_offsets` is an `std::vector`:

    return (CORE_ADDR (this->unrelocated_address ())
	    + objfile->section_offsets[this->section_index ()]);

A section index of -29445 is suspicious.  The minimal_symbol at play
here is:

    (top-gdb) p m_name
    $1 = 0x521001de10af "_sinittext"

So I restarted debugging, breaking on:

   (top-gdb) b general_symbol_info::set_section_index if $_streq("_sinittext", m_name)

And I see that weird -29445 value:

    (top-gdb) frame
    #0  general_symbol_info::set_section_index (this=0x525000082390, idx=-29445) at /home/smarchi/src/binutils-gdb/gdb/symtab.h:611
    611       { m_section = idx; }

But going up one frame, the section index is 36091:

    (top-gdb) frame
    #1  0x0000555562426526 in minimal_symbol_reader::record_full (this=0x7ffff0ead560, name="_sinittext", copy_name=false,
        address=-2111475712, ms_type=mst_text, section=36091) at /home/smarchi/src/binutils-gdb/gdb/minsyms.c:1228
    1228      msymbol->set_section_index (section);

It seems like the problem is just that the type used for the section
index (short) is not big enough.  Change from short to int.  If somebody
insists, we could even go long long / int64_t, but I doubt it's
necessary.
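
The arithmetic behind the two values can be checked in isolation.  The
sketch below is not GDB code: `narrow_section_index` and
`as_vector_index` are hypothetical helpers that only mimic storing the
index in a 16-bit signed field and then using it as a vector subscript,
assuming a 16-bit `short` and a 64-bit size type.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: mimic storing a section index in a 16-bit signed
// field, as a `short` member would on common platforms.  Out-of-range
// values wrap modulo 2^16 (guaranteed since C++20, and what mainstream
// compilers did before that).
std::int16_t narrow_section_index(int section)
{
  assert(section >= 0);   // section indices start out non-negative
  return static_cast<std::int16_t>(section);
}

// Hypothetical helper: mimic using the stored value as a vector
// subscript, where it is converted to a 64-bit unsigned size type.
std::uint64_t as_vector_index(std::int16_t stored)
{
  return static_cast<std::uint64_t>(stored);
}
```

With these helpers, `narrow_section_index (36091)` yields -29445, and
`as_vector_index (-29445)` yields 18446744073709522171, which matches
the `idx` and `__n` values in the backtraces above.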

With that fixed, I get:

    (gdb) info func
    All defined functions:

    Non-debugging symbols:
    0xffffffff81000000  _stext
    0xffffffff82257000  _sinittext
    0xffffffff822b4ebb  _einittext

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=32165

Change-Id: Icb1c3de9474ff5adef7e0bbbf5e0b67b279dee04
Reviewed-By: Tom de Vries <tdevries@suse.de>
Reviewed-by: Keith Seitz <keiths@redhat.com>
farazs-github pushed a commit that referenced this pull request Oct 18, 2024
When building gdb with gcc 12 and -fsanitize=thread while re-enabling
background dwarf reading by setting dwarf_synchronous to false, I run into:
...
(gdb) file amd64-watchpoint-downgrade
Reading symbols from amd64-watchpoint-downgrade...
(gdb) watch global_var
==================
WARNING: ThreadSanitizer: data race (pid=20124)
  Read of size 8 at 0x7b80000500d8 by main thread:
    #0 cooked_index_entry::full_name(obstack*, bool) const cooked-index.c:220
    #1 cooked_index::get_main_name(obstack*, language*) const cooked-index.c:735
    #2 cooked_index_worker::wait(cooked_state, bool) cooked-index.c:559
    #3 cooked_index::wait(cooked_state, bool) cooked-index.c:631
    #4 cooked_index_functions::wait(objfile*, bool) cooked-index.h:729
    #5 cooked_index_functions::compute_main_name(objfile*) cooked-index.h:806
    #6 objfile::compute_main_name() symfile-debug.c:461
    #7 find_main_name symtab.c:6503
    #8 main_language() symtab.c:6608
    #9 set_initial_language_callback symfile.c:1634
    #10 get_current_language() language.c:96
    ...

  Previous write of size 8 at 0x7b80000500d8 by thread T1:
    #0 cooked_index_shard::finalize(parent_map_map const*) \
         dwarf2/cooked-index.c:409
    #1 operator() cooked-index.c:663
    ...

  ...

SUMMARY: ThreadSanitizer: data race cooked-index.c:220 in \
  cooked_index_entry::full_name(obstack*, bool) const
==================
Hardware watchpoint 1: global_var
(gdb) PASS: gdb.arch/amd64-watchpoint-downgrade.exp: watch global_var
...

This was also reported in PR31715.

This is due to gcc PR110799 [1], which generates wrong code with
-fhoist-adjacent-loads and causes a false positive for
-fsanitize=thread.

Work around the gcc PR by forcing -fno-hoist-adjacent-loads for gcc <= 13
and -fsanitize=thread.

Tested in that same configuration on x86_64-linux.  Remaining ThreadSanitizer
problems are the ones reported in PR31626 (gdb.rust/dwindex.exp) and
PR32247 (gdb.trace/basic-libipa.exp).

PR gdb/31715
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31715

Tested-By: Bernd Edlinger <bernd.edlinger@hotmail.de>
Approved-By: Tom Tromey <tom@tromey.com>

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110799
farazs-github pushed a commit that referenced this pull request Oct 29, 2024
When calling a function with double arguments, I get this asan error:

==7920==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x0053131ece38 at pc 0x7ff79697a68f bp 0x0053131ec790 sp 0x0053131ebf40
READ of size 16 at 0x0053131ece38 thread T0
    #0 0x7ff79697a68e in MemcmpInterceptorCommon(void*, int (*)(void const*, void const*, unsigned long long), void const*, void const*, unsigned long long) C:/gcc/src/gcc-14.2.0/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:814
    #1 0x7ff79697aebd in memcmp C:/gcc/src/gcc-14.2.0/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:845
    #2 0x7ff79697aebd in memcmp C:/gcc/src/gcc-14.2.0/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:840
    #3 0x7ff7927e237f in regcache::raw_write(int, gdb::array_view<unsigned char const>) C:/gdb/src/gdb.git/gdb/regcache.c:874
    #4 0x7ff7927e3c85 in regcache::cooked_write(int, gdb::array_view<unsigned char const>) C:/gdb/src/gdb.git/gdb/regcache.c:914
    #5 0x7ff7927e5d89 in regcache::cooked_write(int, unsigned char const*) C:/gdb/src/gdb.git/gdb/regcache.c:933
    #6 0x7ff7911d5965 in amd64_windows_store_arg_in_reg C:/gdb/src/gdb.git/gdb/amd64-windows-tdep.c:216

Address 0x0053131ece38 is located in stack of thread T0 at offset 40 in frame
    #0 0x7ff7911d565f in amd64_windows_store_arg_in_reg C:/gdb/src/gdb.git/gdb/amd64-windows-tdep.c:208

  This frame has 4 object(s):
    [32, 40) 'buf' (line 211) <== Memory access at offset 40 overflows this variable

It's because the first 4 double arguments are passed via XMM registers,
and they need a buffer of 16 bytes, even if we only use 8 bytes of them.
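
A minimal sketch of the idea, not the actual GDB change: size the
temporary buffer for the whole register and copy the shorter value into
it.  The names `xmm_size` and `make_xmm_buffer` are hypothetical, and
the 16-byte width is the one stated above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

// Register width assumed for this sketch: an XMM register is 16 bytes.
constexpr std::size_t xmm_size = 16;

// Hypothetical helper: copy LEN bytes of an argument into the low part
// of a zero-filled, register-sized buffer, so that a consumer reading
// the full 16 bytes never reads past the end of the buffer.
std::array<unsigned char, xmm_size>
make_xmm_buffer(const unsigned char *data, std::size_t len)
{
  assert(len <= xmm_size);
  std::array<unsigned char, xmm_size> buf{};   // all bytes start as zero
  std::memcpy(buf.data(), data, len);
  return buf;
}
```

An 8-byte double then occupies bytes 0 to 7 of the result, and bytes 8
to 15 stay zero, so a 16-byte read is within bounds.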

Approved-By: Tom Tromey <tom@tromey.com>
farazs-github pushed a commit that referenced this pull request Nov 1, 2024
On Windows gcore is not implemented, and if you try it, you get a
heap-use-after-free error:

(gdb) gcore C:/gdb/build64/gdb-git-python3/gdb/testsuite/outputs/gdb.base/gcore-buffer-overflow/gcore-buffer-overflow.test
warning: cannot close "=================================================================
==10108==ERROR: AddressSanitizer: heap-use-after-free on address 0x1259ea503110 at pc 0x7ff6806e3936 bp 0x0062e01ed990 sp 0x0062e01ed140
READ of size 111 at 0x1259ea503110 thread T0
    #0 0x7ff6806e3935 in strlen C:/gcc/src/gcc-14.2.0/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:391
    #1 0x7ff6807169c4 in __pformat_puts C:/gcc/src/mingw-w64-v12.0.0/mingw-w64-crt/stdio/mingw_pformat.c:558
    #2 0x7ff6807186c1 in __mingw_pformat C:/gcc/src/mingw-w64-v12.0.0/mingw-w64-crt/stdio/mingw_pformat.c:2514
    #3 0x7ff680713614 in __mingw_vsnprintf C:/gcc/src/mingw-w64-v12.0.0/mingw-w64-crt/stdio/mingw_vsnprintf.c:41
    #4 0x7ff67f34419f in vsnprintf(char*, unsigned long long, char const*, char*) C:/msys64/mingw64/x86_64-w64-mingw32/include/stdio.h:484
    #5 0x7ff67f34419f in string_vprintf[abi:cxx11](char const*, char*) C:/gdb/src/gdb.git/gdbsupport/common-utils.cc:106
    #6 0x7ff67b37b739 in cli_ui_out::do_message(ui_file_style const&, char const*, char*) C:/gdb/src/gdb.git/gdb/cli-out.c:227
    #7 0x7ff67ce3d030 in ui_out::call_do_message(ui_file_style const&, char const*, ...) C:/gdb/src/gdb.git/gdb/ui-out.c:571
    #8 0x7ff67ce4255a in ui_out::vmessage(ui_file_style const&, char const*, char*) C:/gdb/src/gdb.git/gdb/ui-out.c:740
    #9 0x7ff67ce2c873 in ui_file::vprintf(char const*, char*) C:/gdb/src/gdb.git/gdb/ui-file.c:73
    #10 0x7ff67ce7f83d in gdb_vprintf(ui_file*, char const*, char*) C:/gdb/src/gdb.git/gdb/utils.c:1881
    #11 0x7ff67ce7f83d in vwarning(char const*, char*) C:/gdb/src/gdb.git/gdb/utils.c:181
    #12 0x7ff67f3530eb in warning(char const*, ...) C:/gdb/src/gdb.git/gdbsupport/errors.cc:33
    #13 0x7ff67baed27f in gdb_bfd_close_warning C:/gdb/src/gdb.git/gdb/gdb_bfd.c:437
    #14 0x7ff67baed27f in gdb_bfd_close_or_warn C:/gdb/src/gdb.git/gdb/gdb_bfd.c:646
    #15 0x7ff67baed27f in gdb_bfd_unref(bfd*) C:/gdb/src/gdb.git/gdb/gdb_bfd.c:739
    #16 0x7ff68094b6f2 in gdb_bfd_ref_policy::decref(bfd*) C:/gdb/src/gdb.git/gdb/gdb_bfd.h:82
    #17 0x7ff68094b6f2 in gdb::ref_ptr<bfd, gdb_bfd_ref_policy>::~ref_ptr() C:/gdb/src/gdb.git/gdbsupport/gdb_ref_ptr.h:91
    #18 0x7ff67badf4d2 in gcore_command C:/gdb/src/gdb.git/gdb/gcore.c:176

0x1259ea503110 is located 16 bytes inside of 4064-byte region [0x1259ea503100,0x1259ea5040e0)
freed by thread T0 here:
    #0 0x7ff6806b1687 in free C:/gcc/src/gcc-14.2.0/libsanitizer/asan/asan_malloc_win.cpp:90
    #1 0x7ff67f2ae807 in objalloc_free C:/gdb/src/gdb.git/libiberty/objalloc.c:187
    #2 0x7ff67d7f56e3 in _bfd_free_cached_info C:/gdb/src/gdb.git/bfd/opncls.c:247
    #3 0x7ff67d7f2782 in _bfd_delete_bfd C:/gdb/src/gdb.git/bfd/opncls.c:180
    #4 0x7ff67d7f5df9 in bfd_close_all_done C:/gdb/src/gdb.git/bfd/opncls.c:960
    #5 0x7ff67d7f62ec in bfd_close C:/gdb/src/gdb.git/bfd/opncls.c:925
    #6 0x7ff67baecd27 in gdb_bfd_close_or_warn C:/gdb/src/gdb.git/gdb/gdb_bfd.c:643
    #7 0x7ff67baecd27 in gdb_bfd_unref(bfd*) C:/gdb/src/gdb.git/gdb/gdb_bfd.c:739
    #8 0x7ff68094b6f2 in gdb_bfd_ref_policy::decref(bfd*) C:/gdb/src/gdb.git/gdb/gdb_bfd.h:82
    #9 0x7ff68094b6f2 in gdb::ref_ptr<bfd, gdb_bfd_ref_policy>::~ref_ptr() C:/gdb/src/gdb.git/gdbsupport/gdb_ref_ptr.h:91
    #10 0x7ff67badf4d2 in gcore_command C:/gdb/src/gdb.git/gdb/gcore.c:176

It happens because gdb_bfd_close_or_warn uses a bfd-internal name for
the failing-close warning after the close has finished, when the name
has already been freed:

static int
gdb_bfd_close_or_warn (struct bfd *abfd)
{
  int ret;
  const char *name = bfd_get_filename (abfd);

  for (asection *sect : gdb_bfd_sections (abfd))
    free_one_bfd_section (sect);

  ret = bfd_close (abfd);

  if (!ret)
    gdb_bfd_close_warning (name,
			   bfd_errmsg (bfd_get_error ()));

  return ret;
}

Fixed by making a copy of the name for the warning.
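
The pattern can be sketched outside of GDB as follows.  This is an
illustration only: `fake_handle`, `fake_close` and `close_or_warn` are
hypothetical stand-ins, not BFD or GDB functions, and the close is made
to fail unconditionally so that the warning path is taken.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a handle that owns its file name; the name
// is freed together with the handle.
struct fake_handle
{
  std::string filename;
};

// Hypothetical close routine: destroys the handle and reports failure.
bool fake_close(std::unique_ptr<fake_handle> handle)
{
  assert(handle != nullptr);
  handle.reset();   // the handle's filename storage is gone after this
  return false;     // pretend the close failed
}

// Copy the name before closing, then build the warning from the copy.
std::string close_or_warn(std::unique_ptr<fake_handle> handle)
{
  std::string name = handle->filename;   // owned copy, outlives the handle
  if (!fake_close(std::move(handle)))
    return "cannot close \"" + name + "\"";
  return std::string();
}
```

Because `name` is an owned copy, building the message after the close
does not touch memory that belonged to the handle.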

Approved-By: Andrew Burgess <aburgess@redhat.com>
farazs-github pushed a commit that referenced this pull request Dec 5, 2024
After the commit:

  commit b9de07a
  Date:   Thu Oct 10 11:37:34 2024 +0100

      gdb: fix handling of DW_AT_entry_pc of inlined subroutines

GDB's buildbot CI testing highlighted this assertion failure:

  (gdb) c
  Continuing.
  ../../binutils-gdb/gdb/block.h:203: internal-error: set_entry_pc: Assertion `start >= this->start () && start < this->end ()' failed.
  A problem internal to GDB has been detected,
  further debugging may prove unreliable.
  ----- Backtrace -----
  FAIL: gdb.base/break-probes.exp: run til our library loads (GDB internal error)

This assertion was in the new function set_entry_pc and is asserting
that the default_entry_pc() value is within the block's start/end
range.

The default_entry_pc() is the value GDB will use as the entry-pc if
the DWARF doesn't specifically override the entry-pc.  This value is
calculated as:

  1. The start address of the first sub-range within the block, if the
  block has more than 1 range, or

  2. The low address (from DW_AT_low_pc) for the block.

If the block only has a single range then this means the block was
defined with low/high pc attributes (case #2 above).  These low/high
pc values are what block::start() and block::end() return.  This means
that by definition, if the block is continuous, the above assert
cannot trigger as 'start', the default_entry_pc() would be equivalent
to block::start().
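
The rule and the assertion can be modelled with a small sketch.  It is a
simplification under stated assumptions, not GDB's `block` class:
`sketch_block`, its fields and `entry_pc_in_bounds` are hypothetical
names, and `default_entry_pc` here only follows the two cases listed
above.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

using addr_t = std::uint64_t;

// Hypothetical, simplified block: a low/high pair plus optional
// sub-ranges (empty when the block is contiguous).
struct sketch_block
{
  addr_t start;
  addr_t end;
  std::vector<std::pair<addr_t, addr_t>> ranges;
};

// Case 1: more than one range, use the start of the first sub-range.
// Case 2: otherwise use the block's low address.
addr_t default_entry_pc(const sketch_block &b)
{
  if (b.ranges.size() > 1)
    return b.ranges.front().first;
  return b.start;
}

// The condition checked by the failing assertion.
bool entry_pc_in_bounds(const sketch_block &b, addr_t pc)
{
  assert(b.start <= b.end);
  return pc >= b.start && pc < b.end;
}
```

For a non-empty contiguous block the second function always accepts the
result of the first, since both are derived from `start`.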

This means that, for the assert to trigger, the block must have
multiple ranges, and the first address of the first range is not
within the block's low/high address range.  This seems wrong.

I inspected the state at the time the assert triggered and discovered
the block's start() address.  Then I removed the assert and restarted
GDB.  I was now able to inspect the blocks at the offending address:

  (gdb) maintenance info blocks 0x7ffff7dddaa4
  Blocks at 0x7ffff7dddaa4:
    from objfile: [(objfile *) 0x44a37f0] /lib64/ld-linux-x86-64.so.2
 [(block *) 0x46b30c0] 0x7ffff7ddd5a0..0x7ffff7dde8a6
    entry pc: 0x7ffff7ddd5a0
    is global block
    symbol count: 4
    is contiguous
  [(block *) 0x46b3020] 0x7ffff7ddd5a0..0x7ffff7dde8a6
    entry pc: 0x7ffff7ddd5a0
    is static block
    symbol count: 9
    is contiguous
  [(block *) 0x46b2f70] 0x7ffff7ddda00..0x7ffff7dddac3
    entry pc: 0x7ffff7ddda00
    function: __GI__dl_find_dso_for_object
    symbol count: 4
    is contiguous
  [(block *) 0x46b2e10] 0x7ffff7dddaa4..0x7ffff7dddac3
    entry pc: 0x7ffff7dddaa4
    inline function: __GI__dl_find_dso_for_object
    symbol count: 5
    is contiguous
  [(block *) 0x46b2a40] 0x7ffff7dddaa4..0x7ffff7dddac3
    entry pc: 0x7ffff7dddaa4
    symbol count: 1
    is contiguous
  [(block *) 0x46b2970] 0x7ffff7dddaa4..0x7ffff7dddac3
    entry pc: 0x7ffff7dddaa4
    symbol count: 2
    address ranges:
      0x7ffff7ddda0e..0x7ffff7ddda77
      0x7ffff7ddda90..0x7ffff7ddda96

I've left everything in for context, but the only really interesting
bit is the very last block; its low/high range is:

  0x7ffff7dddaa4..0x7ffff7dddac3

but it has separate ranges:

  0x7ffff7ddda0e..0x7ffff7ddda77
  0x7ffff7ddda90..0x7ffff7ddda96

which are all outside the low/high range.  This is what triggers the
assert.  But why does that block exist at all?

What I believe is happening is that we're running into a bug in older
versions of GCC.  The buildbot failure was with an 8.5 gcc, and Tom de
Vries also reported seeing failures when using version 7 and 8 gcc,
but not with gcc 9 and onward.

Looking at the DWARF I can see that the problematic block is created
from this DIE:

  <4><15efb>: Abbrev Number: 83 (DW_TAG_lexical_block)
     <15efc>   DW_AT_abstract_origin: <0x15e9f>
     <15efe>   DW_AT_low_pc      : 0x7ffff7dddaa4
     <15f06>   DW_AT_high_pc     : 31

which links via DW_AT_abstract_origin to:

  <2><15e9f>: Abbrev Number: 80 (DW_TAG_lexical_block)
     <15ea0>   DW_AT_ranges      : 0x38e0
     <15ea4>   DW_AT_sibling     : <0x15eca>

And so we can see that <15efb> has got both low/high pc attributes and
a ranges attribute.

If I widen my checking to parents of DIE <15efb> then I see that they
also have DW_AT_abstract_origin, however, there is something
interesting going on, the parent DIEs are linking to a different DIE
tree than <15efb>.

What I believe is happening is this: we have an abstract instance
tree, rooted at a DW_TAG_subprogram, which contains all the
blocks, variables, parameters, etc., that you would expect.  As this is
an abstract instance, there are no low/high pc attributes and no
ranges attributes in this tree.  This makes sense.

Now elsewhere we have a DW_TAG_subprogram (not
DW_TAG_inlined_subroutine) which links via
DW_AT_abstract_origin to the abstract DW_TAG_subprogram.  This case is
documented in the DWARF 5 spec in section 3.3.8.3, and describes an
Out-of-Line Instance of an Inlined Subroutine.  Within this out-of-line
instance many of the DIEs correctly link back, using
DW_AT_abstract_origin, to the abstract instance tree.  This tree also
includes the DIE <15e9f>, which is the DIE our problem DIE references.

Now, to really confuse things, within this out-of-line instance we
have a DW_TAG_inlined_subroutine, which is another instance of the
same abstract instance tree!  This would seem to indicate a recursive
call to the inline function, and the compiler, for some reason, needed
to instantiate an out of line instance of this function.

And it is within this nested, inlined subroutine, that the problem DIE
exists.  The problem DIE is referencing the corresponding DIE within
the out of line instance tree, but I am convinced this must be a (long
fixed) GCC bug, and that the problem DIE should be referencing the DIE
within the abstract instance tree.

I'm aware that the above is pretty confusing.  The actual DWARF would
be around 200 lines long, so I'd like to avoid dumping it in here.
But here's my attempt at representing what's going on in a minimal
example.  The numbers down the side represent the section offset, not
the nesting level, and I've removed any attributes that are not
relevant:

  <1> DW_TAG_subprogram
  <2>   DW_TAG_lexical_block
  <3> DW_TAG_subprogram
        DW_AT_abstract_origin <1>
  <4>   DW_TAG_lexical_block
          DW_AT_ranges ...
  <5>   DW_TAG_inlined_subroutine
          DW_AT_abstract_origin <1>
  <6>     DW_TAG_lexical_block
            DW_AT_abstract_origin <4>
            DW_AT_low_pc ...
            DW_AT_high_pc ...

The lexical block at <6> is linking to <4> when it should be linking
to <2>.

There is one additional thing that we might wonder about, which is,
when calculating the low/high pc range for a block, why does GDB not
make use of the range information and expand the range beyond the
defined low/high values?

The answer to this is in dwarf_get_pc_bounds_ranges_or_highlow_pc in
dwarf2/read.c.  This is where the low/high bounds are calculated.  What
we see is that GDB first checks for a low/high attribute pair, and if
that is present, this defines the address range for the block.  Only
if there is no DW_AT_low_pc do we check for the DW_AT_ranges, and use
that to define the extent of the block.  And this makes sense, section
3.5 of the DWARF-5 spec says:

  The lexical block entry may have either a DW_AT_low_pc and DW_AT_high_pc
  pair of attributes or a DW_AT_ranges attribute whose values encode the
  contiguous or non-contiguous address ranges, respectively, of the machine
  instructions generated for the lexical block...

Section 3.5 is specifically about lexical blocks, but the same
wording, about it being either low/high OR ranges is repeated for
other DW_TAG_ types.

So this explains why GDB doesn't use the ranges to expand the problem
block's ranges; as the first DIE has low/high addresses, these are
used, and the ranges are not consulted.

It is only later in dwarf2_record_block_ranges that we create a range
based off the low/high pc, and then also process the ranges data, this
allows the problem block to exist with ranges that are outside the
low/high range.

To solve this I considered a number of options:

1. Prevent loading certain attributes from an abstract instance.

Section 3.3.8.1 of the DWARF-5 spec talks about which attributes are
appropriate to place in an abstract instance.  Any attribute that
might vary between instances should not appear in an abstract
instance.  DW_AT_ranges is included as an example in the
non-exhaustive list of attributes that should not appear in an
abstract instance.

Currently in dwarf2_attr (dwarf2/read.c), when we see a
DW_AT_abstract_origin attribute, we always follow this to try and find
the attribute we are looking for.  But we could change this function
so that we prevent this following for attributes that we know should
not be looked up in an abstract instance.  This would solve the
problem in this case by preventing us finding the DW_AT_ranges in the
incorrect abstract instance.
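
A rough sketch of what option 1 could look like, on a toy data model
rather than GDB's DIE structures.  Everything here is hypothetical:
`sketch_die`, `is_per_instance_attr` and `find_attr` are illustrative
names, attributes are keyed by name for readability, and the list of
per-instance attributes contains only DW_AT_ranges, the one example
named above.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical, simplified DIE: named attributes plus an optional link
// to the DIE named by DW_AT_abstract_origin.
struct sketch_die
{
  std::map<std::string, long> attrs;
  const sketch_die *abstract_origin = nullptr;
};

// Attributes that describe one concrete instance and so should not be
// inherited from an abstract instance.  Illustrative list only; a real
// list would be longer.
bool is_per_instance_attr(const std::string &name)
{
  return name == "DW_AT_ranges";
}

// Look up NAME on DIE.  Follow the abstract origin link only for
// attributes that may legitimately live in an abstract instance.
const long *find_attr(const sketch_die &die, const std::string &name)
{
  assert(!name.empty());
  for (const sketch_die *cur = &die; cur != nullptr;
       cur = cur->abstract_origin)
    {
      auto it = cur->attrs.find(name);
      if (it != cur->attrs.end())
        return &it->second;
      if (is_per_instance_attr(name))
        return nullptr;   // do not inherit this attribute
    }
  return nullptr;
}
```

With this lookup, a DIE that has only low/high pc attributes no longer
picks up DW_AT_ranges from whatever DIE its origin link points at.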

2. Filter the ranges.

Having established a block's low/high address range in
dwarf_get_pc_bounds_ranges_or_highlow_pc, we could allow
dwarf2_record_block_ranges to parse the ranges, but we could reject
any range that extends outside the block's defined start and end
addresses.

For well-behaved DWARF where we have either low/high or ranges, then
the block's start/end are defined from the range data, and so, by
definition, every range would be acceptable.

But in our problem case we would reject all of the invalid ranges.

This is my least favourite solution as it feels like rejecting the
ranges is tackling the problem too late on.
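
For comparison, option 2 amounts to a filter like the one sketched
below.  This is not code from GDB; `filter_ranges` and the `addr_range`
alias are hypothetical, and ranges are treated as half-open intervals.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

using addr_t = std::uint64_t;
using addr_range = std::pair<addr_t, addr_t>;   // half-open [first, second)

// Keep only the ranges that lie entirely inside [start, end).
std::vector<addr_range>
filter_ranges(addr_t start, addr_t end, const std::vector<addr_range> &ranges)
{
  assert(start <= end);
  std::vector<addr_range> kept;
  for (const addr_range &r : ranges)
    if (r.first >= start && r.second <= end)
      kept.push_back(r);
  return kept;
}
```

Applied to the problem block above, both sub-ranges fall below the
0x7ffff7dddaa4 start address, so the filter would return an empty list.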

3. Don't try to parse ranges when we have low/high attributes.

This option involves updating dwarf2_record_block_ranges to match the
behaviour of dwarf_get_pc_bounds_ranges_or_highlow_pc, and, I believe,
to match the DWARF spec: don't try to read range data from
DW_AT_ranges if we have low/high pc attributes.

In our case this solves the issue because the problematic DIE has the
low/high attributes, and it then links to the wrong DIE which happens
to have DW_AT_ranges.  With this change in place we don't even look
for the DW_AT_ranges.

If the problem were reversed, and the initial DIE had DW_AT_ranges,
but the incorrectly referenced DIE had the low/high pc attributes,
we would pick up the wrong addresses, but this wouldn't trigger any
asserts.  The reason is that dwarf_get_pc_bounds_ranges_or_highlow_pc
would also find the low/high addresses from the incorrectly referenced
DIE, and so we would just end up with a block which had the wrong
address ranges, but the block would be self consistent, which is
different to the problem we hit here.
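
The behaviour chosen for option 3 can be sketched on a toy model.  The
names below (`sketch_die`, `block_ranges`) are hypothetical and this is
not the code from the patch; `ranges` stands for whatever DW_AT_ranges
data is visible for the DIE, including data reached through
DW_AT_abstract_origin.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

using addr_t = std::uint64_t;
using addr_range = std::pair<addr_t, addr_t>;   // half-open [first, second)

// Hypothetical, simplified view of the attributes visible for one DIE.
struct sketch_die
{
  bool has_low_high;
  addr_t low;
  addr_t high;
  std::vector<addr_range> ranges;
};

// If low/high pc attributes are present they alone define the block;
// the range data is consulted only when they are absent.
std::vector<addr_range> block_ranges(const sketch_die &die)
{
  if (die.has_low_high)
    {
      assert(die.low <= die.high);
      std::vector<addr_range> only;
      only.emplace_back(die.low, die.high);
      return only;
    }
  return die.ranges;
}
```

For the problem DIE this returns the single range
0x7ffff7dddaa4..0x7ffff7dddac3 and ignores the two stray sub-ranges, so
the block stays self consistent.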

In the end, in this commit I went with solution #3; having
dwarf_get_pc_bounds_ranges_or_highlow_pc and
dwarf2_record_block_ranges be consistent seems sensible.  However, I
do wonder if in the future we might want to explore solution #1 as an
additional safety feature.

With this patch in place I'm able to run the gdb.base/break-probes.exp
without seeing the assert that CI testing highlighted.  I see no
regressions when testing on x86-64 GNU/Linux with gcc 9.3.1.

Note: the diff in this commit looks big, but it's really just me
indenting the code.

Approved-By: Tom Tromey <tom@tromey.com>
farazs-github pushed a commit that referenced this pull request Dec 14, 2024
When building gdb with -fsanitize=thread and running test-case
gdb.base/bg-exec-sigint-bp-cond.exp, I run into:
...
==================^M
WARNING: ThreadSanitizer: signal handler spoils errno (pid=25422)^M
    #0 handler_wrapper gdb/posix-hdep.c:66^M
    #1 decltype ({parm#2}({parm#3}...)) gdb::handle_eintr<>() \
         gdbsupport/eintr.h:67^M
    #2 gdb::waitpid(int, int*, int) gdbsupport/eintr.h:78^M
    #3 run_under_shell gdb/cli/cli-cmds.c:926^M
...

Likewise in:
- tui_sigwinch_handler with test-case gdb.python/tui-window.exp, and
- handle_sighup with test-case gdb.base/quit-live.exp.

Fix this by saving the original errno, and restoring it before returning [1].
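
The save/restore pattern looks roughly like this.  It is a sketch, not
one of the GDB handlers named above: `sketch_handler` and
`do_handler_work` are hypothetical, and the "work" simply overwrites
errno to stand in for a failing call made inside the handler.

```cpp
#include <cerrno>
#include <csignal>

// Hypothetical stand-in for handler work that may clobber errno.
static void do_handler_work()
{
  errno = ENOENT;   // simulate a failing call inside the handler
}

// Signal handler that saves errno on entry and restores it on exit, so
// the interrupted code still sees the errno value it had before.
extern "C" void sketch_handler(int sig)
{
  int saved_errno = errno;
  (void) sig;       // the signal number is not needed in this sketch
  do_handler_work();
  errno = saved_errno;
}
```

If the interrupted code had just received EINTR from a system call, it
still sees EINTR after the handler returns.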

Tested on x86_64-linux.

Approved-By: Tom Tromey <tom@tromey.com>

[1] https://www.gnu.org/software/libc/manual/html_node/POSIX-Safety-Concepts.html
farazs-github pushed a commit that referenced this pull request Dec 21, 2024
This commit adds support for a `gstack' command which Fedora has
been carrying for many years. gstack is a natural counterpart to
the gcore command. Whereas gcore dumps a core file, gstack prints
stack traces of a running process.

There are many improvements over Fedora's version of this script.
The dependency on procfs is gone; gstack will run anywhere gdb
runs. The only runtime dependencies are bash and awk.

The script includes suggestions from gdb/32325 to include
versioning and help. [If this approach to gdb/32325 is acceptable,
I could propagate the solution to gcore/gdb-add-index.]

I've rewritten the documentation, integrating it into the User Manual.
The manpage is now output using this one source.

Example run (on x86_64 Fedora 40)

$ gstack --help
Usage: gstack [-h|--help] [-v|--version] PID
Print a stack trace of a running program

  -h, --help         Print this message then exit.
  -v, --version      Print version information then exit.
$ gstack -v
GNU gstack (GDB) 16.0.50.20241119-git
$ gstack 12345678
Process 12345678 not found.
$ gstack $(pidof emacs)
Thread 6 (Thread 0x7fd5ec1c06c0 (LWP 2491423) "pool-spawner"):
#0  0x00007fd6015ca3dd in syscall () at /lib64/libc.so.6
#1  0x00007fd60b31eccd in g_cond_wait () at /lib64/libglib-2.0.so.0
#2  0x00007fd60b28a61b in g_async_queue_pop_intern_unlocked () at /lib64/libglib-2.0.so.0
#3  0x00007fd60b2f1a03 in g_thread_pool_spawn_thread () at /lib64/libglib-2.0.so.0
#4  0x00007fd60b2f0813 in g_thread_proxy () at /lib64/libglib-2.0.so.0
#5  0x00007fd6015486d7 in start_thread () at /lib64/libc.so.6
#6  0x00007fd6015cc60c in clone3 () at /lib64/libc.so.6
#7  0x0000000000000000 in ??? ()

Thread 5 (Thread 0x7fd5eb9bf6c0 (LWP 2491424) "gmain"):
#0  0x00007fd6015be87d in poll () at /lib64/libc.so.6
#1  0x0000000000000001 in ??? ()
#2  0xffffffff00000001 in ??? ()
#3  0x0000000000000001 in ??? ()
#4  0x000000002104cfd0 in ??? ()
#5  0x00007fd5eb9be320 in ??? ()
#6  0x00007fd60b321c34 in g_main_context_iterate_unlocked.isra () at /lib64/libglib-2.0.so.0

Thread 4 (Thread 0x7fd5eb1be6c0 (LWP 2491425) "gdbus"):
#0  0x00007fd6015be87d in poll () at /lib64/libc.so.6
#1  0x0000000020f9b558 in ??? ()
#2  0xffffffff00000003 in ??? ()
#3  0x0000000000000003 in ??? ()
#4  0x00007fd5d8000b90 in ??? ()
#5  0x00007fd5eb1bd320 in ??? ()
#6  0x00007fd60b321c34 in g_main_context_iterate_unlocked.isra () at /lib64/libglib-2.0.so.0

Thread 3 (Thread 0x7fd5ea9bd6c0 (LWP 2491426) "emacs"):
#0  0x00007fd6015ca3dd in syscall () at /lib64/libc.so.6
#1  0x00007fd60b31eccd in g_cond_wait () at /lib64/libglib-2.0.so.0
#2  0x00007fd60b28a61b in g_async_queue_pop_intern_unlocked () at /lib64/libglib-2.0.so.0
#3  0x00007fd60b28a67c in g_async_queue_pop () at /lib64/libglib-2.0.so.0
#4  0x00007fd603f4d0d9 in fc_thread_func () at /lib64/libpangoft2-1.0.so.0
#5  0x00007fd60b2f0813 in g_thread_proxy () at /lib64/libglib-2.0.so.0
#6  0x00007fd6015486d7 in start_thread () at /lib64/libc.so.6
#7  0x00007fd6015cc60c in clone3 () at /lib64/libc.so.6
#8  0x0000000000000000 in ??? ()

Thread 2 (Thread 0x7fd5e9e6d6c0 (LWP 2491427) "dconf worker"):
#0  0x00007fd6015be87d in poll () at /lib64/libc.so.6
#1  0x0000000000000001 in ??? ()
#2  0xffffffff00000001 in ??? ()
#3  0x0000000000000001 in ??? ()
#4  0x00007fd5cc000b90 in ??? ()
#5  0x00007fd5e9e6c320 in ??? ()
#6  0x00007fd60b321c34 in g_main_context_iterate_unlocked.isra () at /lib64/libglib-2.0.so.0

Thread 1 (Thread 0x7fd5fcc45280 (LWP 2491417) "emacs"):
#0  0x00007fd6015c9197 in pselect () at /lib64/libc.so.6
#1  0x0000000000000000 in ??? ()

Since this is essentially a complete rewrite of the original
script and documentation, I've chosen to only keep a 2024 copyright date.

Reviewed-By: Eli Zaretskii <eliz@gnu.org>
Approved-By: Tom Tromey <tom@tromey.com>