Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

finish path parsing #9

Closed
tromey opened this issue Apr 14, 2016 · 2 comments
Closed

finish path parsing #9

tromey opened this issue Apr 14, 2016 · 2 comments
Labels
Milestone

Comments

@tromey
Copy link
Owner

tromey commented Apr 14, 2016

Currently path parsing doesn't handle generics or some other type constructs. It also needs to differentiate between expression and type context.

@tromey tromey added the rust label Apr 14, 2016
@tromey tromey added this to the upstream milestone Apr 14, 2016
@tromey
Copy link
Owner Author

tromey commented Apr 19, 2016

I have the start of a patch for this. I did most of the parser bits so far.

@tromey
Copy link
Owner Author

tromey commented Apr 26, 2016

Fixed.

@tromey tromey closed this as completed Apr 26, 2016
Manishearth pushed a commit to Manishearth/gdb that referenced this issue Jun 14, 2016
Nowadays, read_memory may throw NOT_AVAILABLE_ERROR (it is done by
patch http://sourceware.org/ml/gdb-patches/2013-08/msg00625.html)
however, read_stack and read_code still throws MEMORY_ERROR only.  This
causes PR 19947, that is prologue unwinder is unable unwind because
code memory isn't available, but MEMORY_ERROR is thrown, while unwinder
catches NOT_AVAILABLE_ERROR.

 #0  memory_error (err=err@entry=TARGET_XFER_E_IO, memaddr=memaddr@entry=140737349781158) at /home/yao/SourceCode/gnu/gdb/git/gdb/corefile.c:217
 tromey#1  0x000000000065f5ba in read_code (memaddr=memaddr@entry=140737349781158, myaddr=myaddr@entry=0x7fffffffd7b0 "\340\023<\001", len=len@entry=1)
     at /home/yao/SourceCode/gnu/gdb/git/gdb/corefile.c:288
 tromey#2  0x000000000065f7b5 in read_code_unsigned_integer (memaddr=memaddr@entry=140737349781158, len=len@entry=1, byte_order=byte_order@entry=BFD_ENDIAN_LITTLE)
     at /home/yao/SourceCode/gnu/gdb/git/gdb/corefile.c:363
 tromey#3  0x00000000004717e0 in amd64_analyze_prologue (gdbarch=gdbarch@entry=0x13c13e0, pc=140737349781158, current_pc=140737349781165, cache=cache@entry=0xda0cb0)
     at /home/yao/SourceCode/gnu/gdb/git/gdb/amd64-tdep.c:2267
 tromey#4  0x0000000000471f6d in amd64_frame_cache_1 (cache=0xda0cb0, this_frame=0xda0bf0) at /home/yao/SourceCode/gnu/gdb/git/gdb/amd64-tdep.c:2437
 tromey#5  amd64_frame_cache (this_frame=0xda0bf0, this_cache=<optimised out>) at /home/yao/SourceCode/gnu/gdb/git/gdb/amd64-tdep.c:2508
 tromey#6  0x000000000047214d in amd64_frame_this_id (this_frame=<optimised out>, this_cache=<optimised out>, this_id=0xda0c50)
     at /home/yao/SourceCode/gnu/gdb/git/gdb/amd64-tdep.c:2541
 tromey#7  0x00000000006b94c4 in compute_frame_id (fi=0xda0bf0) at /home/yao/SourceCode/gnu/gdb/git/gdb/frame.c:481
 tromey#8  get_prev_frame_if_no_cycle (this_frame=this_frame@entry=0xda0b20) at /home/yao/SourceCode/gnu/gdb/git/gdb/frame.c:1809
 tromey#9  0x00000000006bb6c9 in get_prev_frame_always_1 (this_frame=0xda0b20) at /home/yao/SourceCode/gnu/gdb/git/gdb/frame.c:1983
 tromey#10 get_prev_frame_always (this_frame=this_frame@entry=0xda0b20) at /home/yao/SourceCode/gnu/gdb/git/gdb/frame.c:1999
 tromey#11 0x00000000006bbe11 in get_prev_frame (this_frame=this_frame@entry=0xda0b20) at /home/yao/SourceCode/gnu/gdb/git/gdb/frame.c:2241
 tromey#12 0x00000000006bc13c in unwind_to_current_frame (ui_out=<optimised out>, args=args@entry=0xda0b20) at /home/yao/SourceCode/gnu/gdb/git/gdb/frame.c:1485

The fix is to let read_stack and read_code throw NOT_AVAILABLE_ERROR too,
in order to align with read_memory.

gdb:

2016-05-04  Yao Qi  <yao.qi@linaro.org>

	PR gdb/19947
	* corefile.c (read_memory): Rename it to ...
	(read_memory_object): ... it.  Add parameter object.
	(read_memory): Call read_memory_object.
	(read_stack): Likewise.
	(read_code): Likewise.
Manishearth pushed a commit to Manishearth/gdb that referenced this issue Jun 14, 2016
When GDB attaches to a process, it looks at the /proc/PID/task/ dir
for all clone threads of that process, and attaches to each of them.

Usually, if there is more than one clone thread, it means the program
is multi threaded and linked with pthreads.  Thus when GDB soon after
attaching finds and loads a libthread_db matching the process, it'll
add a thread to the thread list for each of the initially found
lower-level LWPs.

If, however, GDB fails to find/load a matching libthread_db, nothing
is adding the LWPs to the thread list.  And because of that, "detach"
hits an internal error:

  (gdb) PASS: gdb.threads/clone-attach-detach.exp: fg attach 1: attach
  info threads
    Id   Target Id         Frame
  * 1    LWP 6891 "clone-attach-de" 0x00007f87e5fd0790 in __nanosleep_nocancel () at ../sysdeps/unix/syscall-template.S:84
  (gdb) FAIL: gdb.threads/clone-attach-detach.exp: fg attach 1: info threads shows two LWPs
  detach
  .../src/gdb/thread.c:1010: internal-error: is_executing: Assertion `tp' failed.
  A problem internal to GDB has been detected,
  further debugging may prove unreliable.
  Quit this debugging session? (y or n)
  FAIL: gdb.threads/clone-attach-detach.exp: fg attach 1: detach (GDB internal error)

From here:

  ...
  tromey#8  0x00000000007ba7cc in internal_error (file=0x98ea68 ".../src/gdb/thread.c", line=1010, fmt=0x98ea30 "%s: Assertion `%s' failed.")
      at .../src/gdb/common/errors.c:55
  tromey#9  0x000000000064bb83 in is_executing (ptid=...) at .../src/gdb/thread.c:1010
  tromey#10 0x00000000004c23bb in get_pending_status (lp=0x12c5cc0, status=0x7fffffffdc0c) at .../src/gdb/linux-nat.c:1235
  tromey#11 0x00000000004c2738 in detach_callback (lp=0x12c5cc0, data=0x0) at .../src/gdb/linux-nat.c:1317
  tromey#12 0x00000000004c1a2a in iterate_over_lwps (filter=..., callback=0x4c2599 <detach_callback>, data=0x0) at .../src/gdb/linux-nat.c:899
  tromey#13 0x00000000004c295c in linux_nat_detach (ops=0xe7bd30, args=0x0, from_tty=1) at .../src/gdb/linux-nat.c:1358
  tromey#14 0x000000000068284d in delegate_detach (self=0xe7bd30, arg1=0x0, arg2=1) at .../src/gdb/target-delegates.c:34
  tromey#15 0x0000000000694141 in target_detach (args=0x0, from_tty=1) at .../src/gdb/target.c:2241
  tromey#16 0x0000000000630582 in detach_command (args=0x0, from_tty=1) at .../src/gdb/infcmd.c:2975
  ...

Tested on x86-64 Fedora 23.  Also confirmed the test passes against
gdbserver with "maint set target-non-stop".

gdb/ChangeLog:
2016-05-24  Pedro Alves  <palves@redhat.com>

	PR gdb/19828
	* linux-nat.c (attach_proc_task_lwp_callback): Mark the lwp
	resumed, and add the thread to GDB's thread list.

testsuite/ChangeLog:
2016-05-24  Pedro Alves  <palves@redhat.com>

	PR gdb/19828
	* gdb.threads/clone-attach-detach.c: New file.
	* gdb.threads/clone-attach-detach.exp: New file.
Manishearth pushed a commit to Manishearth/gdb that referenced this issue Jun 14, 2016
I see the following GDBserver internal error in two cases,

 gdb/gdbserver/linux-low.c:1922: A problem internal to GDBserver has been detected.
 unsuspend LWP 17200, suspended=-1

 1. step over a breakpoint on fork/vfork syscall instruction,
 2. step over a breakpoint on clone syscall instruction and child
    threads hits a breakpoint,

the stack backtrace is

 #0  internal_error (file=file@entry=0x44c4c0 "gdb/gdbserver/linux-low.c", line=line@entry=1922,
    fmt=fmt@entry=0x44c7d0 "unsuspend LWP %ld, suspended=%d\n") at gdb/gdbserver/../common/errors.c:51
 tromey#1  0x0000000000424014 in lwp_suspended_decr (lwp=<optimised out>, lwp=<optimised out>) at gdb/gdbserver/linux-low.c:1922
 tromey#2  0x000000000042403a in unsuspend_one_lwp (entry=<optimised out>, except=0x66e8c0) at gdb/gdbserver/linux-low.c:2885
 tromey#3  0x0000000000405f45 in find_inferior (list=<optimised out>, func=func@entry=0x424020 <unsuspend_one_lwp>, arg=arg@entry=0x66e8c0)
    at gdb/gdbserver/inferiors.c:243
 tromey#4  0x00000000004297de in unsuspend_all_lwps (except=0x66e8c0) at gdb/gdbserver/linux-low.c:2895
 tromey#5  linux_wait_1 (ptid=..., ourstatus=ourstatus@entry=0x665ec0 <last_status>, target_options=target_options@entry=0)
    at gdb/gdbserver/linux-low.c:3632
 tromey#6  0x000000000042a764 in linux_wait (ptid=..., ourstatus=0x665ec0 <last_status>, target_options=0)
    at gdb/gdbserver/linux-low.c:3770
 tromey#7  0x0000000000411163 in mywait (ptid=..., ourstatus=ourstatus@entry=0x665ec0 <last_status>, options=options@entry=0, connected_wait=connected_wait@entry=1)
    at gdb/gdbserver/target.c:214
 tromey#8  0x000000000040b1f2 in resume (actions=0x66f800, num_actions=1) at gdb/gdbserver/server.c:2757
 tromey#9  0x000000000040f660 in handle_v_cont (own_buf=0x66a630 "vCont;c:p45e9.-1") at gdb/gdbserver/server.c:2719

when GDBserver steps over a thread, other threads have been suspended,
the "stepping" thread may create new thread, but GDBserver doesn't set
it suspend count to 1.  When GDBserver unsuspend threads, the child's
suspend count goes to -1, and the assert is triggered.  In fact, GDBserver
has already taken care of suspend count of new thread when GDBserver is
suspending all threads except the one GDBserver wants to step over by
https://sourceware.org/ml/gdb-patches/2015-07/msg00946.html

+	  /* If we're suspending all threads, leave this one suspended
+	     too.  */
+	  if (stopping_threads == STOPPING_AND_SUSPENDING_THREADS)
+	    {
+	      if (debug_threads)
+		debug_printf ("HEW: leaving child suspended\n");
+	      child_lwp->suspended = 1;
+	    }

but that is not enough, because new thread is still can be spawned in
the thread which is being stepped over.  This patch extends the
condition that GDBserver set child's suspend count to one if it is
suspending threads or stepping over the thread.

gdb/gdbserver:

2016-03-03  Yao Qi  <yao.qi@linaro.org>

	PR server/19736
	* linux-low.c (handle_extended_wait): Set child suspended
	if event_lwp->bp_reinsert isn't zero.
Manishearth pushed a commit to Manishearth/gdb that referenced this issue Jun 14, 2016
Fix this GDB crash:

  $ gdb -ex "set architecture mips:10000"
  Segmentation fault (core dumped)

Backtrace:

  Program received signal SIGSEGV, Segmentation fault.
  0x0000000000495b1b in mips_gdbarch_init (info=..., arches=0x0) at /home/pedro/gdb/mygit/cxx-convertion/src/gdb/mips-tdep.c:8436
  8436              if (bfd_get_flavour (info.abfd) == bfd_target_elf_flavour
  (top-gdb) bt
  #0  0x0000000000495b1b in mips_gdbarch_init (info=..., arches=0x0) at .../src/gdb/mips-tdep.c:8436
  tromey#1  0x00000000007348a6 in gdbarch_find_by_info (info=...) at .../src/gdb/gdbarch.c:5155
  tromey#2  0x000000000073563c in gdbarch_update_p (info=...) at .../src/gdb/arch-utils.c:522
  tromey#3  0x0000000000735585 in set_architecture (ignore_args=0x0, from_tty=1, c=0x26bc870) at .../src/gdb/arch-utils.c:496
  tromey#4  0x00000000005f29fd in do_sfunc (c=0x26bc870, args=0x0, from_tty=1) at .../src/gdb/cli/cli-decode.c:121
  tromey#5  0x00000000005fd3f3 in do_set_command (arg=0x7fffffffdcdd "mips:10000", from_tty=1, c=0x26bc870) at .../src/gdb/cli/cli-setshow.c:455
  tromey#6  0x0000000000836157 in execute_command (p=0x7fffffffdcdd "mips:10000", from_tty=1) at .../src/gdb/top.c:460
  tromey#7  0x000000000071abfb in catch_command_errors (command=0x835f6b <execute_command>, arg=0x7fffffffdccc "set architecture mips:10000", from_tty=1)
      at .../src/gdb/main.c:368
  tromey#8  0x000000000071bf4f in captured_main (data=0x7fffffffd750) at .../src/gdb/main.c:1132
  tromey#9  0x0000000000716737 in catch_errors (func=0x71af44 <captured_main>, func_args=0x7fffffffd750, errstring=0x106b9a1 "", mask=RETURN_MASK_ALL)
      at .../src/gdb/exceptions.c:240
  tromey#10 0x000000000071bfe6 in gdb_main (args=0x7fffffffd750) at .../src/gdb/main.c:1164
  tromey#11 0x000000000040a6ad in main (argc=4, argv=0x7fffffffd858) at .../src/gdb/gdb.c:32
  (top-gdb)

We already check whether info.abfd is NULL before all other
bfd_get_flavour calls in the same function.  Just this one case was
missing.

(This was exposed by a WIP test that tries all "set architecture ARCH"
values.)

gdb/ChangeLog:
2016-03-07  Pedro Alves  <palves@redhat.com>

	* mips-tdep.c (mips_gdbarch_init): Check whether info.abfd is NULL
	before calling bfd_get_flavour.
Manishearth pushed a commit to Manishearth/gdb that referenced this issue Jun 25, 2016
This patch adds some sanity check that reinsert breakpoints must be
there when doing step-over on software single step target.  The check
triggers an assert when running forking-threads-plus-breakpoint.exp
on arm-linux target,

 gdb/gdbserver/linux-low.c:4714: A problem internal to GDBserver has been detected.^M
 int finish_step_over(lwp_info*): Assertion `has_reinsert_breakpoints ()' failed.

the error happens when GDBserver has already resumed a thread of
process A for step-over (and wait for it hitting reinsert breakpoint),
but receives detach request for process B from GDB, which is shown in
the backtrace below,

 (gdb) bt
 tromey#2  0x000228aa in finish_step_over (lwp=0x12bbd98) at /home/yao/SourceCode/gnu/gdb/git/gdb/gdbserver/linux-low.c:4703
 tromey#3  0x00025a50 in finish_step_over (lwp=0x12bbd98) at /home/yao/SourceCode/gnu/gdb/git/gdb/gdbserver/linux-low.c:4749
 tromey#4  complete_ongoing_step_over () at /home/yao/SourceCode/gnu/gdb/git/gdb/gdbserver/linux-low.c:4760
 tromey#5  linux_detach (pid=25228) at /home/yao/SourceCode/gnu/gdb/git/gdb/gdbserver/linux-low.c:1503
 tromey#6  0x00012bae in process_serial_event () at /home/yao/SourceCode/gnu/gdb/git/gdb/gdbserver/server.c:3974
 tromey#7  handle_serial_event (err=<optimized out>, client_data=<optimized out>) at /home/yao/SourceCode/gnu/gdb/git/gdb/gdbserver/server.c:4347
 tromey#8  0x00016d68 in handle_file_event (event_file_desc=<optimized out>) at /home/yao/SourceCode/gnu/gdb/git/gdb/gdbserver/event-loop.c:429
 tromey#9  0x000173ea in process_event () at /home/yao/SourceCode/gnu/gdb/git/gdb/gdbserver/event-loop.c:184
 tromey#10 start_event_loop () at /home/yao/SourceCode/gnu/gdb/git/gdb/gdbserver/event-loop.c:547
 tromey#11 0x0000aa2c in captured_main (argv=<optimized out>, argc=<optimized out>) at /home/yao/SourceCode/gnu/gdb/git/gdb/gdbserver/server.c:3719
 tromey#12 main (argc=<optimized out>, argv=<optimized out>) at /home/yao/SourceCode/gnu/gdb/git/gdb/gdbserver/server.c:3804

the sanity check tries to find the reinsert breakpoint from process B,
but nothing is found.  It is wrong, we need to search in process A,
since we started step-over of a thread of process A.

 (gdb) p lwp->thread->entry.id
 $3 = {pid = 25120, lwp = 25131, tid = 0}
 (gdb) p current_thread->entry.id
 $4 = {pid = 25228, lwp = 25228, tid = 0}

This patch switched current_thread to the thread we are doing step-over
in finish_step_over.

gdb/gdbserver:

2016-06-17  Yao Qi  <yao.qi@linaro.org>

	* linux-low.c (maybe_hw_step): New function.
	(linux_resume_one_lwp_throw): Call maybe_hw_step.
	(finish_step_over): Switch current_thread to lwp temporarily,
	and assert has_reinsert_breakpoints returns true.
	(proceed_one_lwp): Call maybe_hw_step.
	* mem-break.c (has_reinsert_breakpoints): New function.
	* mem-break.h (has_reinsert_breakpoints): Declare.
Manishearth pushed a commit to Manishearth/gdb that referenced this issue Aug 31, 2016
This patch fixes a problem that problem triggers if you start an
inferior, e.g., with the "start" command, in a UI created with the
new-ui command, and then run a foreground execution command in the
main UI.  Once the program stops for the latter command, typing in the
main UI no longer echoes back to the user.

The problem revolves around this:

- gdb_has_a_terminal computes its result lazily, on first call.

  that is what saves gdb's initial main UI terminal state (the UI
  associated with stdin):

          our_terminal_info.ttystate = serial_get_tty_state (stdin_serial);

  This is the state that target_terminal_ours() restores.

- In this scenario, the gdb_has_a_terminal function happens to be
  first ever called from within the target_terminal_init call in
  startup_inferior:

      (top-gdb) bt
      #0  gdb_has_a_terminal () at src/gdb/inflow.c:157
      tromey#1  0x000000000079db22 in child_terminal_init_with_pgrp () at src/gdb/inflow.c:217
       [...]
      tromey#4  0x000000000065bacb in target_terminal_init () at src/gdb/target.c:456
      tromey#5  0x00000000004676d2 in startup_inferior () at src/gdb/fork-child.c:531
       [...]
      tromey#7  0x000000000046b168 in linux_nat_create_inferior () at src/gdb/linux-nat.c:1112
       [...]
      tromey#9  0x00000000005f20c9 in start_command (args=0x0, from_tty=1) at src/gdb/infcmd.c:657

If the command to start the inferior is issued on the main UI, then
readline will have deprepped the terminal when we reach the above, and
the problem doesn't appear.

If however the command is issued on a non-main UI, then when we reach
that gdb_has_a_terminal call, the main UI's terminal state is still
set to whatever readline has sets it to in rl_prep_terminal, which
happens to have echo disabled.  Later, when the following synchronous
execution command finishes, we'll call target_terminal_ours to restore
gdb's the main UI's terminal settings, and that restores the terminal
state with echo disabled...

Conceptually, the fix is to move the gdb_has_a_terminal call earlier,
to someplace during GDB initialization, before readline/ncurses have
had a chance to change terminal settings.  Turns out that
"set_initial_gdb_ttystate" is exactly such a place.

I say conceptually, because the fix actually inlines the
gdb_has_a_terminal part that saves the terminal state in
set_initial_gdb_ttystate and then simplifies gdb_has_a_terminal, since
there's no point in making gdb_has_a_terminal do lazy computation.

gdb/ChangeLog:
2016-08-23  Pedro Alves  <palves@redhat.com>

	PR gdb/20494
	* inflow.c (our_terminal_info, initial_gdb_ttystate): Update
	comments.
	(enum gdb_has_a_terminal_flag_enum, gdb_has_a_terminal_flag):
	Delete.
	(set_initial_gdb_ttystate): Record our_terminal_info here too,
	instead of ...
	(gdb_has_a_terminal): ... here.  Reimplement in terms of
	initial_gdb_ttystate.  Make static.
	* terminal.h (gdb_has_a_terminal): Delete declaration.
	(set_initial_gdb_ttystate): Add comment.
	* top.c (show_interactive_mode): Use input_interactive_p instead
	of gdb_has_a_terminal.

gdb/testsuite/ChangeLog:
2016-08-23  Pedro Alves  <palves@redhat.com>

	PR gdb/20494
	* gdb.base/new-ui-echo.c: New file.
	* gdb.base/new-ui-echo.exp: New file.
tromey pushed a commit that referenced this issue Dec 2, 2016
Most of the time, the trace should be in one piece.  This case is handled fine
by GDB.  In some cases, however, there may be gaps in the trace.  They result
from trace decode errors or from overflows.

A gap in the trace means we lost an unknown amount of trace.  Gaps can be very
small, such as a few instructions in the same function, or they can be rather
big.  We may, for example, lose a few function calls or returns.  The trace may
continue in a different function and we likely don't know how we got there.

Even though we can't say how the program executed across a gap, higher levels
may not be impacted too much by it.  Let's assume we have functions a-e and a
trace that looks roughly like this:

  a
   \
    b                    b
     \                  /
      c   <gap>        c
                      /
                 d   d
                  \ /
                   e

Even though we can't say for sure, it is likely that b and c are the same
function instance before and after the gap.  This patch is trying to connect
the c and b function segments across the gap.

This will add a to the back trace of b on the right hand side.  The changes are
reflected in GDB's internal representation of the trace and will improve:

  - the output of "record function-call-history /c"
  - the output of "backtrace" in replay mode
  - source stepping in replay mode
    will be improved indirectly via the improved back trace

I don't have an automated test for this patch; decode errors will be fixed and
overflows occur sporadically and are quite rare.  I tested it by hacking GDB to
provoke a decode error and on the expected gap in the gdb.btrace/dlopen.exp
test.

The issue is that we can't predict where we will be able to re-sync in case of
errors.  For the expected decode error in gdb.btrace/dlopen.exp, for example, we
may be able to re-sync somewhere in dlclose, in test, in main, or not at all.

Here's one example run of gdb.btrace/dlopen.exp with and without this patch.

    (gdb) info record
    Active record target: record-btrace
    Recording format: Intel Processor Trace.
    Buffer size: 16kB.
    warning: Non-contiguous trace at instruction 66608 (offset = 0xa83, pc = 0xb7fdcc31).
    warning: Non-contiguous trace at instruction 66652 (offset = 0xa9b, pc = 0xb7fdcc31).
    warning: Non-contiguous trace at instruction 66770 (offset = 0xacb, pc = 0xb7fdcc31).
    warning: Non-contiguous trace at instruction 66966 (offset = 0xb60, pc = 0xb7ff5ee4).
    warning: Non-contiguous trace at instruction 66994 (offset = 0xb74, pc = 0xb7ff5f24).
    warning: Non-contiguous trace at instruction 67334 (offset = 0xbac, pc = 0xb7ff5e6d).
    warning: Non-contiguous trace at instruction 69022 (offset = 0xc04, pc = 0xb7ff60b3).
    warning: Non-contiguous trace at instruction 69116 (offset = 0xc1c, pc = 0xb7ff60b3).
    warning: Non-contiguous trace at instruction 69504 (offset = 0xc74, pc = 0xb7ff605d).
    warning: Non-contiguous trace at instruction 83648 (offset = 0xecc, pc = 0xb7ff6134).
    warning: Decode error (-13) at instruction 83876 (offset = 0xf48, pc = 0xb7fd6380): no memory mapped at this address.
    warning: Non-contiguous trace at instruction 83876 (offset = 0x11b7, pc = 0xb7ff1c70).
    Recorded 83948 instructions in 912 functions (12 gaps) for thread 1 (process 12996).
    (gdb) record instruction-history 83876, +2
    83876   => 0xb7fec46f <call_init.part.0+95>:    call   *%eax
    [decode error (-13): no memory mapped at this address]
    [disabled]
    83877      0xb7ff1c70 <_dl_close_worker.part.0+1584>:   nop

Without the patch, the trace is disconnected and the backtrace is short:

    (gdb) record goto 83876
    #0  0xb7fec46f in call_init.part () from /lib/ld-linux.so.2
    (gdb) backtrace
    #0  0xb7fec46f in call_init.part () from /lib/ld-linux.so.2
    #1  0xb7fec5d0 in _dl_init () from /lib/ld-linux.so.2
    #2  0xb7ff0fe3 in dl_open_worker () from /lib/ld-linux.so.2
    Backtrace stopped: not enough registers or memory available to unwind further
    (gdb) record goto 83877
    #0  0xb7ff1c70 in _dl_close_worker.part.0 () from /lib/ld-linux.so.2
    (gdb) backtrace
    #0  0xb7ff1c70 in _dl_close_worker.part.0 () from /lib/ld-linux.so.2
    #1  0xb7ff287a in _dl_close () from /lib/ld-linux.so.2
    #2  0xb7fc3d5d in dlclose_doit () from /lib/libdl.so.2
    #3  0xb7fec354 in _dl_catch_error () from /lib/ld-linux.so.2
    #4  0xb7fc43dd in _dlerror_run () from /lib/libdl.so.2
    #5  0xb7fc3d98 in dlclose () from /lib/libdl.so.2
    #6  0x0804860a in test ()
    #7  0x08048628 in main ()

With the patch, GDB is able to connect the trace pieces and we get a full
backtrace.

    (gdb) record goto 83876
    #0  0xb7fec46f in call_init.part () from /lib/ld-linux.so.2
    (gdb) backtrace
    #0  0xb7fec46f in call_init.part () from /lib/ld-linux.so.2
    #1  0xb7fec5d0 in _dl_init () from /lib/ld-linux.so.2
    #2  0xb7ff0fe3 in dl_open_worker () from /lib/ld-linux.so.2
    #3  0xb7fec354 in _dl_catch_error () from /lib/ld-linux.so.2
    #4  0xb7ff02e2 in _dl_open () from /lib/ld-linux.so.2
    #5  0xb7fc3c65 in dlopen_doit () from /lib/libdl.so.2
    #6  0xb7fec354 in _dl_catch_error () from /lib/ld-linux.so.2
    #7  0xb7fc43dd in _dlerror_run () from /lib/libdl.so.2
    #8  0xb7fc3d0e in dlopen@@GLIBC_2.1 () from /lib/libdl.so.2
    #9  0xb7ff28ee in _dl_runtime_resolve () from /lib/ld-linux.so.2
    #10 0x0804841c in ?? ()
    #11 0x08048470 in dlopen@plt ()
    #12 0x080485a3 in test ()
    #13 0x08048628 in main ()
    (gdb) record goto 83877
    #0  0xb7ff1c70 in _dl_close_worker.part.0 () from /lib/ld-linux.so.2
    (gdb) backtrace
    #0  0xb7ff1c70 in _dl_close_worker.part.0 () from /lib/ld-linux.so.2
    #1  0xb7ff287a in _dl_close () from /lib/ld-linux.so.2
    #2  0xb7fc3d5d in dlclose_doit () from /lib/libdl.so.2
    #3  0xb7fec354 in _dl_catch_error () from /lib/ld-linux.so.2
    #4  0xb7fc43dd in _dlerror_run () from /lib/libdl.so.2
    #5  0xb7fc3d98 in dlclose () from /lib/libdl.so.2
    #6  0x0804860a in test ()
    #7  0x08048628 in main ()

It worked nicely in this case but it may, of course, also lead to weird
connections; it is a heuristic, after all.

It works best when the gap is small and the trace pieces are long.

gdb/
	* btrace.c (bfun_s): New typedef.
	(ftrace_update_caller): Print caller in debug dump.
	(ftrace_get_caller, ftrace_match_backtrace, ftrace_fixup_level)
	(ftrace_compute_global_level_offset, ftrace_connect_bfun)
	(ftrace_connect_backtrace, ftrace_bridge_gap, btrace_bridge_gaps): New.
	(btrace_compute_ftrace_bts): Pass vector of gaps.  Collect gaps.
	(btrace_compute_ftrace_pt): Likewise.
	(btrace_compute_ftrace): Split into this, ...
	(btrace_compute_ftrace_1): ... this, and ...
	(btrace_finalize_ftrace): ... this.  Call btrace_bridge_gaps.
tromey pushed a commit that referenced this issue Mar 7, 2017
I build GDB for all targets enabled.  When I "set architecture rl78",
GDB crashes,

(gdb) set architecture rl78

Program received signal SIGSEGV, Segmentation fault.
append_flags_type_flag (type=0x20cc0e0, bitpos=bitpos@entry=0, name=name@entry=0x11dba3f "CY") at ../../binutils-gdb/gdb/gdbtypes.c:4926
4926				   name);
(gdb) bt 10
 #0  append_flags_type_flag (type=0x20cc0e0, bitpos=bitpos@entry=0, name=name@entry=0x11dba3f "CY") at ../../binutils-gdb/gdb/gdbtypes.c:4926
 #1  0x00000000004aaca8 in rl78_gdbarch_init (info=..., arches=<optimized out>) at ../../binutils-gdb/gdb/rl78-tdep.c:1410
 #2  0x00000000006b05a4 in gdbarch_find_by_info (info=...) at ../../binutils-gdb/gdb/gdbarch.c:5269
 #3  0x000000000060eee4 in gdbarch_update_p (info=...) at ../../binutils-gdb/gdb/arch-utils.c:557
 #4  0x000000000060f8a8 in set_architecture (ignore_args=<optimized out>, from_tty=1, c=<optimized out>) at ../../binutils-gdb/gdb/arch-utils.c:531
 #5  0x0000000000593d0b in do_set_command (arg=<optimized out>, arg@entry=0x20be851 "rl78", from_tty=from_tty@entry=1, c=c@entry=0x20b1540)
    at ../../binutils-gdb/gdb/cli/cli-setshow.c:455
 #6  0x00000000007665c3 in execute_command (p=<optimized out>, p@entry=0x20be840 "set architecture rl78", from_tty=1) at ../../binutils-gdb/gdb/top.c:666
 #7  0x00000000006935f4 in command_handler (command=0x20be840 "set architecture rl78") at ../../binutils-gdb/gdb/event-top.c:577
 #8  0x00000000006938d8 in command_line_handler (rl=<optimized out>) at ../../binutils-gdb/gdb/event-top.c:767
 #9  0x0000000000692c2c in gdb_rl_callback_handler (rl=0x20be890 "") at ../../binutils-gdb/gdb/event-top.c:200

The cause is that we want to access some builtin types in gdbarch init, but
it is not initialized yet.  I fix it by creating the type when it is to be
used.  We've already done this in sparc, sparc64 and m68k.

gdb:

2016-12-09  Yao Qi  <yao.qi@linaro.org>

	PR tdep/20953
	* rl78-tdep.c (rl78_psw_type): New function.
	(rl78_register_type): Call rl78_psw_type.
	(rl78_gdbarch_init): Move code to rl78_psw_type.

gdb/testsuite:

2016-12-09  Yao Qi  <yao.qi@linaro.org>

	* gdb.base/all-architectures.exp.in: Remove kfail for rl78.
tromey pushed a commit that referenced this issue Mar 7, 2017
I build GDB with all targets enabled, and "set architecture rx",
GDB crashes,

(gdb) set architecture rx

Program received signal SIGSEGV, Segmentation fault.
append_flags_type_flag (type=0x20cc360, bitpos=bitpos@entry=0, name=name@entry=0xd27529 "C") at ../../binutils-gdb/gdb/gdbtypes.c:4926
4926				   name);
(gdb) bt 10
 #0  append_flags_type_flag (type=0x20cc360, bitpos=bitpos@entry=0, name=name@entry=0xd27529 "C") at ../../binutils-gdb/gdb/gdbtypes.c:4926
 #1  0x00000000004ce725 in rx_gdbarch_init (info=..., arches=<optimized out>) at ../../binutils-gdb/gdb/rx-tdep.c:1051
 #2  0x00000000006b05a4 in gdbarch_find_by_info (info=...) at ../../binutils-gdb/gdb/gdbarch.c:5269
 #3  0x000000000060eee4 in gdbarch_update_p (info=...) at ../../binutils-gdb/gdb/arch-utils.c:557
 #4  0x000000000060f8a8 in set_architecture (ignore_args=<optimized out>, from_tty=1, c=<optimized out>) at ../../binutils-gdb/gdb/arch-utils.c:531
 #5  0x0000000000593d0b in do_set_command (arg=<optimized out>, arg@entry=0x20bee81 "rx ", from_tty=from_tty@entry=1, c=c@entry=0x20b1540)
    at ../../binutils-gdb/gdb/cli/cli-setshow.c:455
 #6  0x00000000007665c3 in execute_command (p=<optimized out>, p@entry=0x20bee70 "set architecture rx ", from_tty=1) at ../../binutils-gdb/gdb/top.c:666
 #7  0x00000000006935f4 in command_handler (command=0x20bee70 "set architecture rx ") at ../../binutils-gdb/gdb/event-top.c:577
 #8  0x00000000006938d8 in command_line_handler (rl=<optimized out>) at ../../binutils-gdb/gdb/event-top.c:767
 #9  0x0000000000692c2c in gdb_rl_callback_handler (rl=0x20be7f0 "") at ../../binutils-gdb/gdb/event-top.c:200

The cause is that we want to access some builtin types in gdbarch init, but
it is not initialized yet.  I fix it by creating the type when it is to be
used.  We've already done this in sparc, sparc64 and m68k.

gdb:

2016-12-09  Yao Qi  <yao.qi@linaro.org>

	PR tdep/20954
	* rx-tdep.c (rx_psw_type): New function.
	(rx_fpsw_type): New function.
	(rx_register_type): Call rx_psw_type and rx_fpsw_type.
	(rx_gdbarch_init): Move code to rx_psw_type and
	rx_fpsw_type.

gdb/testsuite:

2016-12-09  Yao Qi  <yao.qi@linaro.org>

	* gdb.base/all-architectures.exp.in: Remove kfail for "rx".
tromey pushed a commit that referenced this issue Mar 7, 2017
Nowadays, GDB propagates C++ exceptions across readline using
setjmp/longjmp 8952576 ("Propagate GDB/C++ exceptions across
readline using sj/lj-based TRY/CATCH") because DWARF-based unwinding
can't cross C functions compiled without -fexceptions (see details
from the commit above).

Unfortunately, toolchains that use SjLj-based C++ exceptions got
broken with that fix, because _Unwind_SjLj_Unregister, which is put at
the exit of a function, is not executed due to the longjmp added by
that commit.

 (gdb) [New Thread 2936.0xb80]
 kill

 Thread 1 received signal SIGSEGV, Segmentation fault.
 0x03ff662b in ?? ()
 top?bt 15
 #0  0x03ff662b in ?? ()
 #1  0x00526b92 in stdin_event_handler (error=0, client_data=0x172ed8)
    at ../../binutils-gdb/gdb/event-top.c:555
 #2  0x00525a94 in handle_file_event (ready_mask=<optimized out>,
    file_ptr=0x3ff5cb8) at ../../binutils-gdb/gdb/event-loop.c:733
 #3  gdb_wait_for_event (block=block@entry=1)
    at ../../binutils-gdb/gdb/event-loop.c:884
 #4  0x00525bfb in gdb_do_one_event ()
    at ../../binutils-gdb/gdb/event-loop.c:347
 #5  0x00525ce5 in start_event_loop ()
    at ../../binutils-gdb/gdb/event-loop.c:371
 #6  0x0051fada in captured_command_loop (data=0x0)
    at ../../binutils-gdb/gdb/main.c:324
 #7  0x0051cf5d in catch_errors (
    func=func@entry=0x51fab0 <captured_command_loop(void*)>,
    func_args=func_args@entry=0x0,
    errstring=errstring@entry=0x7922bf <VEC_interp_factory_p_quick_push(VEC_inte rp_factory_p*, interp_factory*, char const*, unsigned int)::__PRETTY_FUNCTION__+351> "", mask=mask@entry=RETURN_MASK_ALL)
    at ../../binutils-gdb/gdb/exceptions.c:236
 #8  0x00520f0c in captured_main (data=0x328feb4)
    at ../../binutils-gdb/gdb/main.c:1149
 #9  gdb_main (args=args@entry=0x328feb4) at ../../binutils-gdb/gdb/main.c:1159
 #10 0x0071e400 in main (argc=1, argv=0x171220)
    at ../../binutils-gdb/gdb/gdb.c:32

Fix this by making the functions involved in setjmp/longjmp as
noexcept, so that the compiler knows it doesn't need to emit the
_Unwind_SjLj_Register / _Unwind_SjLj_Unregister calls for C++
exceptions.

Tested on x86_64 Fedora 23 with:
 - GCC 5.3.1 w/ DWARF-based exceptions.
 - GCC 7 built with --enable-sjlj-exceptions.

gdb/ChangeLog:
2016-12-20  Pedro Alves  <palves@redhat.com>
	    Yao Qi  <yao.qi@linaro.org>

	PR gdb/20977
	* event-top.c (gdb_rl_callback_read_char_wrapper_noexcept): New
	noexcept function, factored out from ...
	(gdb_rl_callback_read_char_wrapper): ... this.
	(gdb_rl_callback_handler): Mark noexcept.
tromey pushed a commit that referenced this issue Mar 7, 2017
New in v2:

 - Define PyMem_RawMalloc as PyMem_Malloc for Python < 3.4 and use
   PyMem_RawMalloc in the code.

Since Python 3.4, the callback installed in PyOS_ReadlineFunctionPointer
should return a value allocated with PyMem_RawMalloc instead of
PyMem_Malloc.  The reason is that PyMem_Malloc must be called with the
Python Global Interpreter Lock (GIL) held, which is not the case in the
context where this function is called.  PyMem_RawMalloc was introduced
for cases like this.

In Python 3.6, it looks like they added an assert to verify that
PyMem_Malloc was not called without the GIL.  The consequence is that
typing anything in the python-interactive mode of gdb crashes the
process.  The same behavior was observed with the official package on
Arch Linux as well as with a manual Python build on Ubuntu 14.04.

This is what is shown with a debug build of Python 3.6 (the error with a
non-debug build is far less clear):

  (gdb) pi
  >>> print(1)
  Fatal Python error: Python memory allocator called without holding the GIL

  Current thread 0x00007f1459af8780 (most recent call first):
  [1]    21326 abort      ./gdb

and the backtrace:

  #0  0x00007ffff618bc37 in raise () from /lib/x86_64-linux-gnu/libc.so.6
  #1  0x00007ffff618f028 in abort () from /lib/x86_64-linux-gnu/libc.so.6
  #2  0x00007ffff6b104d6 in Py_FatalError (msg=msg@entry=0x7ffff6ba15b8 "Python memory allocator called without holding the GIL") at Python/pylifecycle.c:1457
  #3  0x00007ffff6a37a68 in _PyMem_DebugCheckGIL () at Objects/obmalloc.c:1972
  #4  0x00007ffff6a3804e in _PyMem_DebugFree (ctx=0x7ffff6e65290 <_PyMem_Debug+48>, ptr=0x24f8830) at Objects/obmalloc.c:1994
  #5  0x00007ffff6a38e1d in PyMem_Free (ptr=<optimized out>) at Objects/obmalloc.c:442
  #6  0x00007ffff6b866c6 in _PyFaulthandler_Fini () at ./Modules/faulthandler.c:1369
  #7  0x00007ffff6b104bd in Py_FatalError (msg=msg@entry=0x7ffff6ba15b8 "Python memory allocator called without holding the GIL") at Python/pylifecycle.c:1431
  #8  0x00007ffff6a37a68 in _PyMem_DebugCheckGIL () at Objects/obmalloc.c:1972
  #9  0x00007ffff6a37aa3 in _PyMem_DebugMalloc (ctx=0x7ffff6e65290 <_PyMem_Debug+48>, nbytes=5) at Objects/obmalloc.c:1980
  #10 0x00007ffff6a38d91 in PyMem_Malloc (size=<optimized out>) at Objects/obmalloc.c:418
  #11 0x000000000064dbe2 in gdbpy_readline_wrapper (sys_stdin=0x7ffff6514640 <_IO_2_1_stdin_>, sys_stdout=0x7ffff6514400 <_IO_2_1_stdout_>, prompt=0x7ffff4d4f7d0 ">>> ")
    at /home/emaisin/src/binutils-gdb/gdb/python/py-gdb-readline.c:75

The documentation is very clear about it [1] and it was also mentioned
in the "What's New In Python 3.4" page [2].

[1] https://docs.python.org/3/c-api/veryhigh.html#c.PyOS_ReadlineFunctionPointer
[2] https://docs.python.org/3/whatsnew/3.4.html#changes-in-the-c-api

gdb/ChangeLog:

	* python/python-internal.h (PyMem_RawMalloc): Define for
	Python < 3.4.
	* python/py-gdb-readline.c (gdbpy_readline_wrapper): Use
	PyMem_RawMalloc instead of PyMem_Malloc.
tromey pushed a commit that referenced this issue Mar 7, 2017
When the gdbpy_ref objects get destroyed, they call Py_DECREF to
decrement the reference counter of the python object they hold a
reference to.  Any time we call into the Python API, we should be
holding the GIL.  The gdbpy_enter object does that for us in an
RAII-fashion.

However, if gdbpy_enter is declared after a gdbpy_ref object in a
function, gdbpy_enter's destructor will be called (and the GIL will be
released) before gdbpy_ref's destructor is called.  Therefore, we will
end up calling Py_DECREF without holding the GIL.

This became obvious with Python 3.6, where memory management functions
have asserts to make sure that the GIL is held.  This was exposed by
tests py-as-string.exp, py-function.exp and py-xmethods.  For example:

  (gdb) p $_as_string(enum_valid)
  Fatal Python error: Python memory allocator called without holding the GIL

  Current thread 0x00007f7f7b21c780 (most recent call first):
  [1]    18678 abort (core dumped)  ./gdb -nx testsuite/outputs/gdb.python/py-as-string/py-as-string

  #0  0x00007ffff618bc37 in raise () from /lib/x86_64-linux-gnu/libc.so.6
  #1  0x00007ffff618f028 in abort () from /lib/x86_64-linux-gnu/libc.so.6
  #2  0x00007ffff6b104d6 in Py_FatalError (msg=msg@entry=0x7ffff6ba15b8 "Python memory allocator called without holding the GIL") at Python/pylifecycle.c:1457
  #3  0x00007ffff6a37a68 in _PyMem_DebugCheckGIL () at Objects/obmalloc.c:1972
  #4  0x00007ffff6a3804e in _PyMem_DebugFree (ctx=0x7ffff6e65290 <_PyMem_Debug+48>, ptr=0x24f8830) at Objects/obmalloc.c:1994
  #5  0x00007ffff6a38e1d in PyMem_Free (ptr=<optimized out>) at Objects/obmalloc.c:442
  #6  0x00007ffff6b866c6 in _PyFaulthandler_Fini () at ./Modules/faulthandler.c:1369
  #7  0x00007ffff6b104bd in Py_FatalError (msg=msg@entry=0x7ffff6ba15b8 "Python memory allocator called without holding the GIL") at Python/pylifecycle.c:1431
  #8  0x00007ffff6a37a68 in _PyMem_DebugCheckGIL () at Objects/obmalloc.c:1972
  #9  0x00007ffff6a3804e in _PyMem_DebugFree (ctx=0x7ffff6e652c0 <_PyMem_Debug+96>, ptr=0x7ffff46b6040) at Objects/obmalloc.c:1994
  #10 0x00007ffff6a38f55 in PyObject_Free (ptr=<optimized out>) at Objects/obmalloc.c:503
  #11 0x00007ffff6a5f27e in unicode_dealloc (unicode=unicode@entry=0x7ffff46b6040) at Objects/unicodeobject.c:1794
  #12 0x00007ffff6a352a9 in _Py_Dealloc (op=0x7ffff46b6040) at Objects/object.c:1786
  #13 0x000000000063f28b in gdb_Py_DECREF (op=0x7ffff46b6040) at /home/emaisin/src/binutils-gdb/gdb/python/python-internal.h:192
  #14 0x000000000063fa33 in gdbpy_ref_policy::decref (ptr=0x7ffff46b6040) at /home/emaisin/src/binutils-gdb/gdb/python/py-ref.h:35
  #15 0x000000000063fa77 in gdb::ref_ptr<_object, gdbpy_ref_policy>::~ref_ptr (this=0x7fffffffcdf0, __in_chrg=<optimized out>) at /home/emaisin/src/binutils-gdb/gdb/common/gdb_ref_ptr.h:91
  #16 0x000000000064d8b8 in fnpy_call (gdbarch=0x2b50010, language=0x115d2c0 <c_language_defn>, cookie=0x7ffff46b7468, argc=1, argv=0x7fffffffcf48)
    at /home/emaisin/src/binutils-gdb/gdb/python/py-function.c:145

The fix is to place the gdbpy_enter first in the function.  I also
cleaned up the comments a bit and removed the unnecessary initialization
of the value variable.

gdb/ChangeLog:

	* python/py-function.c (fnpy_call): Reorder declarations to have
	the gdbpy_enter object declared first.
	* python/py-xmethods.c (gdbpy_get_xmethod_arg_types): Likewise.
tromey pushed a commit that referenced this issue Mar 7, 2017
This is a follow-up to

  https://sourceware.org/ml/gdb-patches/2017-02/msg00261.html

This patch restricts queries to the main UI, which allows to avoid two
different problems.

The first one is that GDB is issuing queries on secondary MI channels
for which a TTY is allocated.  The second one is that GDB is not able to
handle queries on two (CLI) UIs simultaneously.  Restricting queries to
the main UI allows to bypass these two problems.

More details on how/why these two problems happen:

1. Queries on secondary MI UI

  The current criterion to decide if we should query the user is whether
  the input stream is a TTY.  The original way to start GDB in MI mode
  from a front-end was to create a subprocess with pipes to its
  stdin/stdout.  In this case, the input was considered non-interactive
  and queries were auto-answered.  Now that front-ends can create the MI
  channel as a separate UI connected to a dedicated TTY, GDB now
  considers this input stream as interactive and sends queries to it.
  By restricting queries to the main UI, we make sure we never query on
  the secondary MI UI.

2. Simultaneous queries

  As Pedro stated it, when you have two queries on two different CLI UIs
  at the same time, you end up with the following pseudo stack:

  #0 gdb_readline_wrapper
  #1 defaulted_query                 // for UI #2
  #2 handle_command
  #3 execute_command ("handle SIGTRAP" ....
  #4 stdin_event_handler             // input on UI #2
  #5 gdb_do_one_event
  #7 gdb_readline_wrapper
  #8 defaulted_query                 // for UI #1
  #9 handle_command
  #10 execute_command ("handle SIGINT" ....
  #11 stdin_event_handler            // input on UI #1
  #12 gdb_do_one_event
  #13 gdb_readline_wrapper

  trying to answer the query on UI #1 will therefore answer for UI #2.

  By restricting the queries to the main UI, we ensure that there will
  never be more than one pending query, since you can't have two queries
  on a UI at the same time.

I added a snippet to gdb.base/new-ui.exp to verify that we get a query
on the main UI, but that we don't on the secondary one (or, more
precisely, that it gets auto-answered).

gdb/ChangeLog:

	* utils.c (defaulted_query): Don't query on secondary UIs.

gdb/testsuite/ChangeLog:

	* gdb.base/new-ui.exp (do_test): Test queries behavior on main
	and extra UIs.
tromey pushed a commit that referenced this issue Mar 7, 2017
Commit d7e7473 ("Eliminate make_cleanup_ui_file_delete / make
ui_file a class hierarchy") introduced a problem when using "layout
regs", that leads gdb to crash when issuing:

./gdb ./a.out -ex 'layout regs' -ex start

From the backtrace, it's caused by this 'delete' on tui_restore_gdbout():

 (gdb) bt
 #0  0x00007ffff6b962b2 in free () from /lib64/libc.so.6
 #1  0x000000000059fa47 in tui_restore_gdbout (ui=0x22997b0) at ../../gdb/tui/tui-regs.c:714
 #2  0x0000000000619996 in do_my_cleanups (pmy_chain=pmy_chain@entry=0x1e08320 <cleanup_chain>, old_chain=old_chain@entry=0x235b4b0) at ../../gdb/common/cleanups.c:154
 #3  0x0000000000619b1d in do_cleanups (old_chain=old_chain@entry=0x235b4b0) at ../../gdb/common/cleanups.c:176
 #4  0x000000000059fb0d in tui_register_format (frame=frame@entry=0x22564e0, regnum=regnum@entry=0) at ../../gdb/tui/tui-regs.c:747
 #5  0x000000000059ffeb in tui_get_register (data=0x2434d18, changedp=0x0, regnum=0, frame=0x22564e0) at ../../gdb/tui/tui-regs.c:768
 #6  tui_show_register_group (refresh_values_only=<optimized out>, frame=0x22564e0, group=0x1e09250 <general_group>) at ../../gdb/tui/tui-regs.c:287
 #7  tui_show_registers (group=0x1e09250 <general_group>) at ../../gdb/tui/tui-regs.c:156
 #8  0x00000000005a07cf in tui_check_register_values (frame=frame@entry=0x22564e0) at ../../gdb/tui/tui-regs.c:496
 #9  0x00000000005a3e65 in tui_check_data_values (frame=frame@entry=0x22564e0) at ../../gdb/tui/tui-windata.c:232
 #10 0x000000000059cf65 in tui_refresh_frame_and_register_information (registers_too_p=1) at ../../gdb/tui/tui-hooks.c:156
 #11 0x00000000006d5c05 in generic_observer_notify (args=0x7fffffffdbe0, subject=<optimized out>) at ../../gdb/observer.c:167
 #12 observer_notify_normal_stop (bs=<optimized out>, print_frame=print_frame@entry=1) at ./observer.inc:61
 #13 0x00000000006a6409 in normal_stop () at ../../gdb/infrun.c:8364
 #14 0x00000000006af8f5 in fetch_inferior_event (client_data=<optimized out>) at ../../gdb/infrun.c:3990
 #15 0x000000000066f0fd in gdb_wait_for_event (block=block@entry=0) at ../../gdb/event-loop.c:859
 #16 0x000000000066f237 in gdb_do_one_event () at ../../gdb/event-loop.c:322
 #17 0x000000000066f386 in gdb_do_one_event () at ../../gdb/event-loop.c:353
 #18 0x00000000007411bc in wait_sync_command_done () at ../../gdb/top.c:570
 #19 0x0000000000741426 in maybe_wait_sync_command_done (was_sync=0) at ../../gdb/top.c:587
 #20 execute_command (p=<optimized out>, p@entry=0x7fffffffe43a "start", from_tty=from_tty@entry=1) at ../../gdb/top.c:676
 #21 0x00000000006c2048 in catch_command_errors (command=0x741200 <execute_command(char*, int)>, arg=0x7fffffffe43a "start", from_tty=1) at ../../gdb/main.c:376
 #22 0x00000000006c2b60 in captured_main_1 (context=0x7fffffffde70) at ../../gdb/main.c:1119
 #23 captured_main (data=0x7fffffffde70) at ../../gdb/main.c:1140
 #24 gdb_main (args=args@entry=0x7fffffffdf90) at ../../gdb/main.c:1158
 #25 0x0000000000408cf5 in main (argc=<optimized out>, argv=<optimized out>) at ../../gdb/gdb.c:32
 (gdb) f 1
 #1  0x000000000059fa47 in tui_restore_gdbout (ui=0x22997b0) at ../../gdb/tui/tui-regs.c:714
 714	  delete gdb_stdout;

The problem is simply that the commit mentioned above made the ui_file
that gdb_stdout is temporarily set to be a stack-allocated
string_file, while before it used to be a heap-allocated ui_file.  The
fix is simply to remove the now-incorrect delete.

New test included, which exercises enabling all TUI layouts, with and
without execution.  (This particular crash only triggers with
execution.)

gdb/ChangeLog:
2017-03-07  Pedro Alves  <palves@redhat.com>

	* tui/tui-regs.c (tui_restore_gdbout): Don't delete gdb_stdout.

gdb/testsuite/ChangeLog:
2017-03-07  Pedro Alves  <palves@redhat.com>

	* gdb.base/tui-layout.c: New file.
	* gdb.base/tui-layout.exp: New file.
tromey pushed a commit that referenced this issue Jul 11, 2017
Starting a process on macOS/Darwin currently leads to this error:

/Users/simark/src/binutils-gdb/gdb/darwin-nat.c:383: internal-error: void darwin_check_new_threads(struct inferior *): Assertion `tp' failed.

with the corresponding partial backtrace (sorry, taken with lldb,
because well, gdb is broken :)):

    frame #9: 0x000000010004605a gdb`darwin_check_new_threads(inf=0x0000000100edf670) at darwin-nat.c:383
    frame #10: 0x0000000100045848 gdb`darwin_init_thread_list(inf=0x0000000100edf670) at darwin-nat.c:1710
    frame #11: 0x00000001000452f8 gdb`darwin_ptrace_him(pid=8375) at darwin-nat.c:1792
    frame #12: 0x0000000100041d95 gdb`fork_inferior(...) at fork-inferior.c:440
    frame #13: 0x0000000100043f82 gdb`darwin_create_inferior(...) at darwin-nat.c:1841
    frame #14: 0x000000010034ac32 gdb`run_command_1(args=0x0000000000000000, from_tty=1, tbreak_at_main=1) at infcmd.c:611

The issue was introduced by commit

  "Share fork_inferior et al with gdbserver"

because it changed the place where the dummy thread (pid, 0, 0) is added,
relative to the call to the init_trace_fun callback.  In this callback, darwin
checks for new threads in the program (there should be exactly one) in order to
update this dummy thread with the right tid.  Previously, things happened in
this order:

 - fork_inferior calls fork()
 - fork_inferior adds dummy thread
 - fork_inferior calls init_trace_fun callback, which updates the dummy
   thread info

Following the commit mentioned above, the new thread is added in the
darwin-nat code, after having called fork_inferior (in
darwin_create_inferior).  So gdb tries to do things in this order:

 - fork_inferior calls fork()
 - fork_inferior calls init_trace_fun callback, which tries to update
   the dummy thread info
 - darwin_create_inferior adds the dummy thread

The error happens while trying to update the dummy thread that has not
been added yet.

I don't think this dummy thread is necessary for darwin.  Previously, it
was fork_inferior that was adding this thread, for all targets, so
darwin had to deal with it.  Now that it's done by targets themselves,
we can just skip that on darwin.  darwin_check_new_threads called
indirectly by init_trace_fun/darwin_ptrace_him will simply notice the
new thread and add it with the right information.

My level of testing was: try to start a process and try to attach to a
process, and it seems to work somewhat like it did before.  I tried to
run the testsuite, but it leaves a huge amount of zombie processes that
launchd doesn't seem to reap, leading to exhaustion of system resources
(number of processes).

gdb/ChangeLog:

	* darwin-nat.c (darwin_check_new_threads): Don't handle dummy
	thread.
	(darwin_init_thread_list): Don't update dummy thread.
	(darwin_create_inferior, darwin_attach): Don't add a dummy thread.
tromey pushed a commit that referenced this issue Jul 29, 2017
Ref: https://sourceware.org/ml/gdb-patches/2017-07/msg00162.html

Debugging x86-64 GNU/Linux programs currently crashes GDB in
tdesc_use_registers during gdbarch initialization:

  Program received signal SIGSEGV, Segmentation fault.
  0x0000000001093eaf in htab_remove_elt_with_hash (htab=0x2ef9fa0, element=0x26af960, hash=557151073) at src/libiberty/hashtab.c:728
  728       if (*slot == HTAB_EMPTY_ENTRY)
  (top-gdb) p slot
  $1 = (void **) 0x0
  (top-gdb) bt
  #0  0x0000000001093eaf in htab_remove_elt_with_hash (htab=0x2ef9fa0, element=0x26af960, hash=557151073) at src/libiberty/hashtab.c:728
  #1  0x0000000001093e79 in htab_remove_elt (htab=0x2ef9fa0, element=0x26af960) at src/libiberty/hashtab.c:714
  #2  0x00000000009121b0 in tdesc_use_registers (gdbarch=0x3001240, target_desc=0x2659cb0, early_data=0x2881cb0)
      at src/gdb/target-descriptions.c:1328
  #3  0x000000000047c93e in i386_gdbarch_init (info=..., arches=0x0) at src/gdb/i386-tdep.c:8634
  #4  0x0000000000818d5f in gdbarch_find_by_info (info=...) at src/gdb/gdbarch.c:5394
  #5  0x00000000007198a8 in set_gdbarch_from_file (abfd=0x2f48250) at src/gdb/arch-utils.c:618
  #6  0x00000000007f21cb in exec_file_attach (filename=0x7fffffffddb0 "/home/pedro/gdb/tests/threads", from_tty=1) at src/gdb/exec.c:380
  #7  0x0000000000865c18 in catch_command_errors_const (command=0x7f1d83 <exec_file_attach(char const*, int)>, arg=0x7fffffffddb0 "/home/pedro/gdb/tests/threads",
      from_tty=1) at src/gdb/main.c:403
  #8  0x00000000008669cf in captured_main_1 (context=0x7fffffffd860) at src/gdb/main.c:1035
  #9  0x0000000000866de2 in captured_main (data=0x7fffffffd860) at src/gdb/main.c:1142
  #10 0x0000000000866e24 in gdb_main (args=0x7fffffffd860) at src/gdb/main.c:1160
  #11 0x000000000041312d in main (argc=3, argv=0x7fffffffd968) at src/gdb/gdb.c:32

The direct cause of the crash is that we tried to remove an element
from the hash which supposedly exists, but does not.  (htab_remove_elt
shouldn't really crash in this case, but that's secondary.)

The real problem is that early_data passed to tdesc_use_registers
includes regs from a target description that is not the target_desc,
which violates its assumptions.  The registers in question are the
fs_base/gs_base registers, added by amd64_init_abi:

      tdesc_numbered_register (feature, tdesc_data_segments,
		       AMD64_FSBASE_REGNUM, "fs_base");
      tdesc_numbered_register (feature, tdesc_data_segments,
		       AMD64_GSBASE_REGNUM, "gs_base");

and that happens because amd64_linux_init_abi uses amd64_init_abi as
helper, but they don't coordinate on which fallback tdesc to use.

amd64_init_abi does:

  if (! tdesc_has_registers (tdesc))
    tdesc = tdesc_amd64;

and then adds the fs_base/gs_base registers of the "tdesc_amd64" tdesc
to the tdesc_arch_data.

After amd64_init_abi returns, amd64_linux_init_abi does:

  if (! tdesc_has_registers (tdesc))
    tdesc = tdesc_amd64_linux;
  tdep->tdesc = tdesc;

and we end up tdesc_amd64_linux installed in tdep->tdesc.

The fix is to make sure that amd64_linux_init_abi and amd64_init_abi
agree on default tdesc, by adding a "default tdesc" parameter to
amd64_init_abi, instead of having amd64_init_abi hardcode a default.
With this, amd64_init_abi creates the fs_base/gs_base registers using
the tdesc_amd64_linux tdesc.

Tested on x86-64 GNU/Linux, -m64.  I don't have an x32 setup handy.

Thanks to John Baldwin, Yao Qi and Simon Marchi for the investigation.

gdb/ChangeLog:
2017-07-13  Pedro Alves  <palves@redhat.com>

	* amd64-darwin-tdep.c (x86_darwin_init_abi_64): Pass tdesc_amd64
	as default tdesc.
	* amd64-dicos-tdep.c (amd64_dicos_init_abi):
	* amd64-fbsd-tdep.c (amd64fbsd_init_abi):
	* amd64-linux-tdep.c (amd64_linux_init_abi): Pass
	tdesc_amd64_linux as default tdesc.  Get final tdesc from the
	tdep.
	(amd64_x32_linux_init_abi): Pass tdesc_x32_linux as default tdesc.
	Get final tdesc from the tdep.
	* amd64-nbsd-tdep.c (amd64nbsd_init_abi): Pass tdesc_amd64 as
	default tdesc.
	* amd64-obsd-tdep.c (amd64obsd_init_abi): Likewise.
	* amd64-sol2-tdep.c (amd64_sol2_init_abi): Likewise.
	* amd64-tdep.c (amd64_init_abi): Add 'default_tdesc' parameter.
	Use it as default tdesc.
	(amd64_x32_init_abi): Add 'default_tdesc' parameter, and pass it
	down to amd_init_abi.  No longer handle fallback tdesc here.
	* amd64-tdep.h (tdesc_x32): Declare.
	(amd64_init_abi, amd64_x32_init_abi): Add 'default_tdesc'
	parameter.
	* amd64-windows-tdep.c (amd64_windows_init_abi): Pass tdesc_amd64
	as default tdesc.
tromey pushed a commit that referenced this issue Sep 13, 2017
PR 21555 is caused by the exception during the prologue analysis when re-set
a breakpoint.

(gdb) bt
 #0  memory_error_message (err=TARGET_XFER_E_IO, gdbarch=0x153db50, memaddr=93824992233232) at ../../binutils-gdb/gdb/corefile.c:192
 #1  0x00000000005718ed in memory_error (err=TARGET_XFER_E_IO, memaddr=memaddr@entry=93824992233232) at ../../binutils-gdb/gdb/corefile.c:220
 #2  0x00000000005719d6 in read_memory_object (object=object@entry=TARGET_OBJECT_CODE_MEMORY, memaddr=93824992233232, memaddr@entry=1, myaddr=myaddr@entry=0x7fffffffd0a0 "P\333S\001", len=len@entry=1) at ../../binutils-gdb/gdb/corefile.c:259
 #3  0x0000000000571c6e in read_code (len=1, myaddr=0x7fffffffd0a0 "P\333S\001", memaddr=<optimized out>) at ../../binutils-gdb/gdb/corefile.c:287
 #4  read_code_unsigned_integer (memaddr=memaddr@entry=93824992233232, len=len@entry=1, byte_order=byte_order@entry=BFD_ENDIAN_LITTLE)                          at ../../binutils-gdb/gdb/corefile.c:362
 #5  0x000000000041d4a0 in amd64_analyze_prologue (gdbarch=gdbarch@entry=0x153db50, pc=pc@entry=93824992233232, current_pc=current_pc@entry=18446744073709551615, cache=cache@entry=0x7fffffffd1e0) at ../../binutils-gdb/gdb/amd64-tdep.c:2310
 #6  0x000000000041e404 in amd64_skip_prologue (gdbarch=0x153db50, start_pc=93824992233232) at ../../binutils-gdb/gdb/amd64-tdep.c:2459
 #7  0x000000000067bfb0 in skip_prologue_sal (sal=sal@entry=0x7fffffffd4e0) at ../../binutils-gdb/gdb/symtab.c:3628
 #8  0x000000000067c4d8 in find_function_start_sal (sym=sym@entry=0x1549960, funfirstline=1) at ../../binutils-gdb/gdb/symtab.c:3501
 #9  0x000000000060999d in symbol_to_sal (result=result@entry=0x7fffffffd5f0, funfirstline=<optimized out>, sym=sym@entry=0x1549960) at ../../binutils-gdb/gdb/linespec.c:3860
....
 #16 0x000000000054b733 in location_to_sals (b=b@entry=0x15792d0, location=0x157c230, search_pspace=search_pspace@entry=0x1148120, found=found@entry=0x7fffffffdc64) at ../../binutils-gdb/gdb/breakpoint.c:14211
 #17 0x000000000054c1f5 in breakpoint_re_set_default (b=0x15792d0) at ../../binutils-gdb/gdb/breakpoint.c:14301
 #18 0x00000000005412a9 in breakpoint_re_set_one (bint=bint@entry=0x15792d0) at ../../binutils-gdb/gdb/breakpoint.c:14412

This problem can be fixed by

 - either each prologue analyzer doesn't throw exception,
 - or catch the exception thrown from gdbarch_skip_prologue,

I choose the latter because the former needs to fix *every* prologue
analyzer to not throw exception.

This error can be reproduced by changing reread.exp.  The test reread.exp
has already test that breakpoint can be reset correctly after the
executable is re-read.  This patch extends this test by compiling test c
file with and without -fPIE.

(gdb) run ^M
The program being debugged has been started already.^M
Start it from the beginning? (y or n) y^M
x86_64/gdb/testsuite/outputs/gdb.base/reread/reread' has changed; re-reading symbols.
Error in re-setting breakpoint 1: Cannot access memory at address 0x555555554790^M
Error in re-setting breakpoint 2: Cannot access memory at address 0x555555554790^M
Starting program: /scratch/yao/gdb/build-git/x86_64/gdb/testsuite/outputs/gdb.base/reread/reread ^M
This is foo^M
[Inferior 1 (process 27720) exited normally]^M
(gdb) FAIL: gdb.base/reread.exp: opts= "-fPIE" "ldflags=-pie" : run to foo() second time (the program exited)

This patch doesn't re-indent the code, to keep the patch simple.

gdb:

2017-07-25  Yao Qi  <yao.qi@linaro.org>

	PR gdb/21555
	* arch-utils.c (gdbarch_skip_prologue_noexcept): New function.
	* arch-utils.h (gdbarch_skip_prologue_noexcept): Declare.
	* infrun.c: Include arch-utils.h
	(handle_step_into_function): Call gdbarch_skip_prologue_noexcept.
	(handle_step_into_function_backward): Likewise.
	* symtab.c (skip_prologue_sal): Likewise.

gdb/testsuite:

2017-07-25  Yao Qi  <yao.qi@linaro.org>

	PR gdb/21555
	* gdb.base/reread.exp: Wrap the whole test with two kinds of
	compilation flags, with -fPIE and without -fPIE.
tromey pushed a commit that referenced this issue Dec 8, 2017
If you have a breakpoint command that re-resumes the target, like:

  break foo
  commands
  > c
  > end

and then let the inferior run, hitting the breakpoint, and then press
Ctrl-C at just the right time, between GDB processing the stop at
"foo", and re-resuming the target, you'll hit the QUIT call in
infrun.c:resume.

With this hack, we can reproduce the bad case consistently:

  --- a/gdb/inf-loop.c
  +++ b/gdb/inf-loop.c
  @@ -31,6 +31,8 @@
   #include "top.h"
   #include "observer.h"

  +bool continue_hack;
  +
   /* General function to handle events in the inferior.  */

   void
  @@ -64,6 +66,8 @@ inferior_event_handler (enum inferior_event_type event_type,
	  {
	    check_frame_language_change ();

  +         continue_hack = true;
  +
	    /* Don't propagate breakpoint commands errors.  Either we're
	       stopping or some command resumes the inferior.  The user will
	       be informed.  */
  diff --git a/gdb/infrun.c b/gdb/infrun.c
  index d425664..c74b14c 100644
  --- a/gdb/infrun.c
  +++ b/gdb/infrun.c
  @@ -2403,6 +2403,10 @@ resume (enum gdb_signal sig)
     gdb_assert (!tp->stop_requested);
     gdb_assert (!thread_is_in_step_over_chain (tp));

  +  extern bool continue_hack;
  +
  +  if (continue_hack)
  +    set_quit_flag ();
     QUIT;

The GDB backtrace looks like this:

  (top-gdb) bt
  ...
  #3  0x0000000000612e8b in throw_quit(char const*, ...) (fmt=0xaf84a1 "Quit") at src/gdb/common/common-exceptions.c:408
  #4  0x00000000007fc104 in quit() () at src/gdb/utils.c:748
  #5  0x00000000006a79d2 in default_quit_handler() () at src/gdb/event-top.c:954
  #6  0x00000000007fc134 in maybe_quit() () at src/gdb/utils.c:762
  #7  0x00000000006f66a3 in resume(gdb_signal) (sig=GDB_SIGNAL_0) at src/gdb/infrun.c:2406
  #8  0x0000000000700c3d in keep_going_pass_signal(execution_control_state*) (ecs=0x7ffcf3744e60) at src/gdb/infrun.c:7793
  #9  0x00000000006f5fcd in start_step_over() () at src/gdb/infrun.c:2145
  #10 0x00000000006f7b1f in proceed(unsigned long, gdb_signal) (addr=18446744073709551615, siggnal=GDB_SIGNAL_DEFAULT)
      at src/gdb/infrun.c:3135
  #11 0x00000000006ebdd4 in continue_1(int) (all_threads=0) at src/gdb/infcmd.c:842
  #12 0x00000000006ec097 in continue_command(char*, int) (args=0x0, from_tty=0) at src/gdb/infcmd.c:938
  #13 0x00000000004b5140 in do_cfunc(cmd_list_element*, char*, int) (c=0x2d18570, args=0x0, from_tty=0)
      at src/gdb/cli/cli-decode.c:106
  #14 0x00000000004b8219 in cmd_func(cmd_list_element*, char*, int) (cmd=0x2d18570, args=0x0, from_tty=0)
      at src/gdb/cli/cli-decode.c:1952
  #15 0x00000000007f1532 in execute_command(char*, int) (p=0x7ffcf37452b1 "", from_tty=0) at src/gdb/top.c:608
  #16 0x00000000004bd127 in execute_control_command(command_line*) (cmd=0x3a88ef0) at src/gdb/cli/cli-script.c:485
  #17 0x00000000005cae0c in bpstat_do_actions_1(bpstat*) (bsp=0x37edcf0) at src/gdb/breakpoint.c:4513
  #18 0x00000000005caf67 in bpstat_do_actions() () at src/gdb/breakpoint.c:4563
  #19 0x00000000006e8798 in inferior_event_handler(inferior_event_type, void*) (event_type=INF_EXEC_COMPLETE, client_data=0x0)
      at src/gdb/inf-loop.c:72
  #20 0x00000000006f9447 in fetch_inferior_event(void*) (client_data=0x0) at src/gdb/infrun.c:3970
  #21 0x00000000006e870e in inferior_event_handler(inferior_event_type, void*) (event_type=INF_REG_EVENT, client_data=0x0)
      at src/gdb/inf-loop.c:43
  #22 0x0000000000494d58 in remote_async_serial_handler(serial*, void*) (scb=0x3585ca0, context=0x2cd1b80)
      at src/gdb/remote.c:13820
  #23 0x000000000044d682 in run_async_handler_and_reschedule(serial*) (scb=0x3585ca0) at src/gdb/ser-base.c:137
  #24 0x000000000044d767 in fd_event(int, void*) (error=0, context=0x3585ca0) at src/gdb/ser-base.c:188
  #25 0x00000000006a5686 in handle_file_event(file_handler*, int) (file_ptr=0x45997d0, ready_mask=1)
      at src/gdb/event-loop.c:733
  #26 0x00000000006a5c29 in gdb_wait_for_event(int) (block=1) at src/gdb/event-loop.c:859
  #27 0x00000000006a4aa6 in gdb_do_one_event() () at src/gdb/event-loop.c:347
  #28 0x00000000006a4ade in start_event_loop() () at src/gdb/event-loop.c:371

and when that happens, you end up with GDB's run control in quite a
messed up state.  Something like this:

  thread_function1 (arg=0x1) at threads.c:107
  107             usleep (SLEEP);  /* Loop increment.  */
  Quit
  (gdb) c
  Continuing.
  ** nothing happens, time passes..., press ctrl-c again **
  ^CQuit
  (gdb) info threads
    Id   Target Id         Frame
    1    Thread 1462.1462 "threads" (running)
  * 2    Thread 1462.1466 "threads" (running)
    3    Thread 1462.1465 "function0" (running)
  (gdb) c
  Cannot execute this command while the selected thread is running.
  (gdb)

The first "Quit" above is thrown from within "resume", and cancels run
control while GDB is in the middle of stepping over a breakpoint.
with step_over_info_valid_p() true.  The next "c" didn't actually
resume anything, because GDB throught that the step-over was still in
progress.  It wasn't, because the thread that was supposed to be
stepping over the breakpoint wasn't actually resumed.

So at this point, we press Ctrl-C again, and this time, the default
quit handler is called directly from the event loop
(event-top.c:default_quit_handler -> quit()), because gdb was left
owning the terminal (because the previous resume was cancelled before
we reach target_resume -> target_terminal::inferior()).

Note that the exception called from within resume ends up calling
normal_stop via resume_cleanups.  That's very borked though, because
normal_stop is going to re-handle whatever was the last reported
event, possibly even re-running a hook stop...  I think that the only
sane way to safely cancel the run control state machinery is to push
an event via handle_inferior_event like all other events.

The fix here does two things, and either alone would fix the problem
at hand:

#1 - passes the terminal to the inferior earlier, so that any QUIT
     call from the point we declare the target as running goes to the
     inferior directly, protecting run control from unsafe QUIT calls.

#2 - gets rid of this QUIT call in resume and of its related unsafe
     resume_cleanups.

Aboout #2, the comment describing resume says:

  /* Resume the inferior, but allow a QUIT.  This is useful if the user
     wants to interrupt some lengthy single-stepping operation
     (for child processes, the SIGINT goes to the inferior, and so
     we get a SIGINT random_signal, but for remote debugging and perhaps
     other targets, that's not true).

but that's a really old comment that predates a lot of fixes to Ctrl-C
handling throughout both GDB core and the remote target, that made
sure that a Ctrl-C isn't ever lost.  In any case, if some target
depended on this, a much better fix would be to make the target return
a SIGINT stop out of target_wait the next time that is called.

This was exposed by the new gdb.base/bp-cmds-continue-ctrl-c.exp
testcase added later in the series.

gdb/ChangeLog:
2017-11-16  Pedro Alves  <palves@redhat.com>

	* infrun.c (resume_cleanups): Delete.
	(resume): No longer install a resume_cleanups cleanup nor call
	QUIT.
	(proceed): Pass the terminal to the inferior.
	(keep_going_pass_signal): No longer install a resume_cleanups
	cleanup.
tromey pushed a commit that referenced this issue Jan 17, 2018
At <https://sourceware.org/ml/gdb-patches/2017-12/msg00285.html>,
Maciej reported that commit:

  commit 5cd63fd
  Date: Wed Oct 4 18:21:10 2017 +0100
  Subject: Fix "Remote 'g' packet reply is too long" problems with multiple inferiors

made GDB stop working with older stubs.  Any attempt to continue
execution after the initial connection fails with:

  [...]
  Process .../gdb/testsuite/outputs/gdb.base/advance/advance created; pid = 2670
  Listening on port 2346
  target remote [...]:2346
  Remote debugging using [...]:2346
  Reading symbols from .../lib64/ld.so.1...done.
  [Switching to Thread <main>]
  (gdb) continue
  Cannot execute this command without a live selected thread.
  (gdb)

The problem is:

  (gdb) c
  Cannot execute this command without a live selected thread.
  (gdb) info threads
    Id   Target Id         Frame
    1    Thread 14917      0x00007f341cd98ed0 in _start () from /lib64/ld-linux-x86-64.so.2

  The current thread <Thread ID 2> has terminated.  See `help thread'.
		      ^^^^^^^^^^^
  (gdb)

Note, thread _2_.  There's really only one thread in the inferior
(it's still at the entry point), but still GDB added a bogus second
thread.

The reason GDB started adding a second thread after 5cd63fd is
this hunk:

+                 if (event->ptid == null_ptid)
+                   {
+                     const char *thr = strstr (p1 + 1, ";thread:");
+                     if (thr != NULL)
+                       event->ptid = read_ptid (thr + strlen (";thread:"),
+                                                NULL);
+                     else
+                       event->ptid = magic_null_ptid;
+                   }

Note the else branch that falls back to magic_null_ptid.  We reach
that when we process the initial stop reply sent back in response to
the the "?" (status) packet early in the connection setup:

 Sending packet: $?#3f...Ack
 Packet received: T0506:0000000000000000;07:40a510f4fd7f0000;10:d0fe1201577f0000;

And note that that response does not include a ";thread:XXX" part.

This stop reply is processed after listing threads with qfThreadInfo /
qsThreadInfo :

 Sending packet: $qfThreadInfo#bb...Ack
 Packet received: m3915
 Sending packet: $qsThreadInfo#c8...Ack
 Packet received: l

meaning, when we process that stop reply, we treat the event as coming
from a thread with ptid == magic_null_ptid, which is not yet in the
thread list, so we add it then:

  (top-gdb) p ptid
  $1 = {m_pid = 42000, m_lwp = -1, m_tid = 1}
  (top-gdb) bt
  #0  0x0000000000840a8c in add_thread_silent(ptid_t) (ptid=...) at src/gdb/thread.c:269
  #1  0x00000000007ad61d in remote_add_thread(ptid_t, int, int) (ptid=..., running=0, executing=0)
      at src/gdb/remote.c:1838
  #2  0x00000000007ad8de in remote_notice_new_inferior(ptid_t, int) (currthread=..., executing=0)
      at src/gdb/remote.c:1921
  #3  0x00000000007b758b in process_stop_reply(stop_reply*, target_waitstatus*) (stop_reply=0x1158860, status=0x7fffffffcc00)
      at src/gdb/remote.c:7217
  #4  0x00000000007b7a38 in remote_wait_as(ptid_t, target_waitstatus*, int) (ptid=..., status=0x7fffffffcc00, options=0)
      at src/gdb/remote.c:7380
  #5  0x00000000007b7cd1 in remote_wait(target_ops*, ptid_t, target_waitstatus*, int) (ops=0x102fac0 <remote_ops>, ptid=..., status=0x7fffffffcc00, options=0) at src/gdb/remote.c:7446
  #6  0x000000000081587b in delegate_wait(target_ops*, ptid_t, target_waitstatus*, int) (self=0x102fac0 <remote_ops>, arg1=..., arg2=0x7fffffffcc00, arg3=0) at src/gdb/target-delegates.c:138
  #7  0x0000000000827d77 in target_wait(ptid_t, target_waitstatus*, int) (ptid=..., status=0x7fffffffcc00, options=0)
      at src/gdb/target.c:2179
  #8  0x0000000000715fda in do_target_wait(ptid_t, target_waitstatus*, int) (ptid=..., status=0x7fffffffcc00, options=0)
      at src/gdb/infrun.c:3589
  #9  0x0000000000716351 in wait_for_inferior() () at src/gdb/infrun.c:3707
  #10 0x0000000000715435 in start_remote(int) (from_tty=1) at src/gdb/infrun.c:3212

things go downhill from this.

We don't see the problem with current master gdbserver, because that
version always sends the ";thread:" part in the initial stop reply:

 Sending packet: $?#3f...Packet received: T0506:0000000000000000;07:a0d4ffffff7f0000;10:d05eddf7ff7f0000;thread:p3cea.3cea;core:3;

Years ago I had added a "--disable-packet=" command line option to
gdbserver which comes in handy for testing this, since the existing
"--disable-packet=Tthread" precisely makes gdbserver not send that
";thread:" part in stop replies.  The testcase added by this commit
emulates old gdbserver making use of that.

I've compared a testrun at 5cd63fd^ (before regression) with
'current master+patch', against old gdbserver at f8b73d1^.  I
hacked out --once, and "monitor exit" to be able to test.  The results
are a bit too unstable to tell accurately, but it looked like there
were no regressions.  Maciej confirmed this worked for him as well.

No regressions on master (against master gdbserver).

gdb/ChangeLog:
2018-01-11  Pedro Alves  <palves@redhat.com>

	PR remote/22597
	* remote.c (remote_parse_stop_reply): Default to the last-set
	general thread instead of to 'magic_null_ptid'.

gdb/testsuite/ChangeLog:
2018-01-11  Pedro Alves  <palves@redhat.com>

	PR remote/22597
	* gdb.server/stop-reply-no-thread.c: New file.
	* gdb.server/stop-reply-no-thread.exp: New file.
tromey pushed a commit that referenced this issue Feb 20, 2018
When running the test gdb.dwarf2/dw2-bad-parameter-type.exp under
valgrind, I see the following issue reported (on x86-64 Fedora):

  (gdb) ptype f
  ==5203== Invalid read of size 1
  ==5203==    at 0x6931FE: process_die_scope::~process_die_scope() (dwarf2read.c:10642)
  ==5203==    by 0x66818F: process_die(die_info*, dwarf2_cu*) (dwarf2read.c:10664)
  ==5203==    by 0x66A01F: read_file_scope(die_info*, dwarf2_cu*) (dwarf2read.c:11650)
  ==5203==    by 0x667F2D: process_die(die_info*, dwarf2_cu*) (dwarf2read.c:10672)
  ==5203==    by 0x6677B6: process_full_comp_unit(dwarf2_per_cu_data*, language) (dwarf2read.c:10445)
  ==5203==    by 0x66657A: process_queue(dwarf2_per_objfile*) (dwarf2read.c:9945)
  ==5203==    by 0x6559B4: dw2_do_instantiate_symtab(dwarf2_per_cu_data*) (dwarf2read.c:3163)
  ==5203==    by 0x66683D: psymtab_to_symtab_1(partial_symtab*) (dwarf2read.c:10034)
  ==5203==    by 0x66622A: dwarf2_read_symtab(partial_symtab*, objfile*) (dwarf2read.c:9811)
  ==5203==    by 0x787984: psymtab_to_symtab(objfile*, partial_symtab*) (psymtab.c:792)
  ==5203==    by 0x786E3E: psym_lookup_symbol(objfile*, int, char const*, domain_enum_tag) (psymtab.c:522)
  ==5203==    by 0x804BD0: lookup_symbol_via_quick_fns(objfile*, int, char const*, domain_enum_tag) (symtab.c:2383)
  ==5203==  Address 0x147ed063 is 291 bytes inside a block of size 4,064 free'd
  ==5203==    at 0x4C2CD5A: free (vg_replace_malloc.c:530)
  ==5203==    by 0x444415: void xfree<void>(void*) (common-utils.h:60)
  ==5203==    by 0x9DA8C2: call_freefun (obstack.c:103)
  ==5203==    by 0x9DAD35: _obstack_free (obstack.c:280)
  ==5203==    by 0x44464C: auto_obstack::~auto_obstack() (gdb_obstack.h:73)
  ==5203==    by 0x68AFB0: dwarf2_cu::~dwarf2_cu() (dwarf2read.c:25080)
  ==5203==    by 0x68B204: free_one_cached_comp_unit(dwarf2_per_cu_data*) (dwarf2read.c:25174)
  ==5203==    by 0x66668C: dwarf2_release_queue(void*) (dwarf2read.c:9982)
  ==5203==    by 0x563A4C: do_my_cleanups(cleanup**, cleanup*) (cleanups.c:154)
  ==5203==    by 0x563AA7: do_cleanups(cleanup*) (cleanups.c:176)
  ==5203==    by 0x5646CE: throw_exception_cxx(gdb_exception) (common-exceptions.c:289)
  ==5203==    by 0x5647B7: throw_exception(gdb_exception) (common-exceptions.c:317)
  ==5203==  Block was alloc'd at
  ==5203==    at 0x4C2BBAD: malloc (vg_replace_malloc.c:299)
  ==5203==    by 0x564BE8: xmalloc (common-utils.c:44)
  ==5203==    by 0x9DA872: call_chunkfun (obstack.c:94)
  ==5203==    by 0x9DA935: _obstack_begin_worker (obstack.c:141)
  ==5203==    by 0x9DAA3C: _obstack_begin (obstack.c:164)
  ==5203==    by 0x4445E0: auto_obstack::auto_obstack() (gdb_obstack.h:70)
  ==5203==    by 0x68AE07: dwarf2_cu::dwarf2_cu(dwarf2_per_cu_data*) (dwarf2read.c:25073)
  ==5203==    by 0x661A8A: init_cutu_and_read_dies(dwarf2_per_cu_data*, abbrev_table*, int, int, void (*)(die_reader_specs const*, unsigned char const*, die_info*, int, void*), void*) (dwarf2read.c:7869)
  ==5203==    by 0x666A29: load_full_comp_unit(dwarf2_per_cu_data*, language) (dwarf2read.c:10108)
  ==5203==    by 0x655847: load_cu(dwarf2_per_cu_data*) (dwarf2read.c:3120)
  ==5203==    by 0x655928: dw2_do_instantiate_symtab(dwarf2_per_cu_data*) (dwarf2read.c:3148)
  ==5203==    by 0x66683D: psymtab_to_symtab_1(partial_symtab*) (dwarf2read.c:10034)

There's actually a series of three issues reported, but it turns out
they're all related, so we can consider on the first one.

The invalid read is triggered from a destructor which is being invoked
as part of a stack unwind after throwing an error.  At the time the
error is thrown, the stack looks like this:

    #0  0x00000000009f4ecd in __cxa_throw ()
    #1  0x0000000000564761 in throw_exception_cxx (exception=...) at ../../src/gdb/common/common-exceptions.c:303
    #2  0x00000000005647b8 in throw_exception (exception=...) at ../../src/gdb/common/common-exceptions.c:317
    #3  0x00000000005648ff in throw_it(return_reason, errors, const char *, typedef __va_list_tag __va_list_tag *) (reason=RETURN_ERROR,
        error=GENERIC_ERROR, fmt=0xb33020 "Dwarf Error: Cannot find DIE at 0x%x referenced from DIE at 0x%x [in module %s]",
        ap=0x7fff387f2d68) at ../../src/gdb/common/common-exceptions.c:373
    #4  0x0000000000564929 in throw_verror (error=GENERIC_ERROR,
        fmt=0xb33020 "Dwarf Error: Cannot find DIE at 0x%x referenced from DIE at 0x%x [in module %s]", ap=0x7fff387f2d68)
        at ../../src/gdb/common/common-exceptions.c:379
    #5  0x0000000000867be4 in verror (string=0xb33020 "Dwarf Error: Cannot find DIE at 0x%x referenced from DIE at 0x%x [in module %s]",
        args=0x7fff387f2d68) at ../../src/gdb/utils.c:251
    #6  0x000000000056879d in error (fmt=0xb33020 "Dwarf Error: Cannot find DIE at 0x%x referenced from DIE at 0x%x [in module %s]")
        at ../../src/gdb/common/errors.c:43
    #7  0x0000000000686875 in follow_die_ref (src_die=0x30bc8a0, attr=0x30bc8c8, ref_cu=0x7fff387f2ed0) at ../../src/gdb/dwarf2read.c:22969
    #8  0x00000000006844cd in lookup_die_type (die=0x30bc8a0, attr=0x30bc8c8, cu=0x30bc5d0) at ../../src/gdb/dwarf2read.c:21976
    #9  0x0000000000683f27 in die_type (die=0x30bc8a0, cu=0x30bc5d0) at ../../src/gdb/dwarf2read.c:21832
    #10 0x0000000000679b39 in read_subroutine_type (die=0x30bc830, cu=0x30bc5d0) at ../../src/gdb/dwarf2read.c:17343
    #11 0x00000000006845fb in read_type_die_1 (die=0x30bc830, cu=0x30bc5d0) at ../../src/gdb/dwarf2read.c:22035
    #12 0x0000000000684576 in read_type_die (die=0x30bc830, cu=0x30bc5d0) at ../../src/gdb/dwarf2read.c:22010
    #13 0x000000000067003f in read_func_scope (die=0x30bc830, cu=0x30bc5d0) at ../../src/gdb/dwarf2read.c:13822
    #14 0x0000000000667f5e in process_die (die=0x30bc830, cu=0x30bc5d0) at ../../src/gdb/dwarf2read.c:10679
    #15 0x000000000066a020 in read_file_scope (die=0x30bc720, cu=0x30bc5d0) at ../../src/gdb/dwarf2read.c:11650
    #16 0x0000000000667f2e in process_die (die=0x30bc720, cu=0x30bc5d0) at ../../src/gdb/dwarf2read.c:10672
    #17 0x00000000006677b7 in process_full_comp_unit (per_cu=0x3089b80, pretend_language=language_minimal)
        at ../../src/gdb/dwarf2read.c:10445
    #18 0x000000000066657b in process_queue (dwarf2_per_objfile=0x30897d0) at ../../src/gdb/dwarf2read.c:9945
    #19 0x00000000006559b5 in dw2_do_instantiate_symtab (per_cu=0x3089b80) at ../../src/gdb/dwarf2read.c:3163
    #20 0x000000000066683e in psymtab_to_symtab_1 (pst=0x3089bd0) at ../../src/gdb/dwarf2read.c:10034
    #21 0x000000000066622b in dwarf2_read_symtab (self=0x3089bd0, objfile=0x3073f40) at ../../src/gdb/dwarf2read.c:9811
    #22 0x0000000000787985 in psymtab_to_symtab (objfile=0x3073f40, pst=0x3089bd0) at ../../src/gdb/psymtab.c:792
    #23 0x0000000000786e3f in psym_lookup_symbol (objfile=0x3073f40, block_index=1, name=0x30b2e30 "f", domain=VAR_DOMAIN)
        at ../../src/gdb/psymtab.c:522
    #24 0x0000000000804bd1 in lookup_symbol_via_quick_fns (objfile=0x3073f40, block_index=1, name=0x30b2e30 "f", domain=VAR_DOMAIN)
        at ../../src/gdb/symtab.c:2383
    #25 0x0000000000804fe4 in lookup_symbol_in_objfile (objfile=0x3073f40, block_index=1, name=0x30b2e30 "f", domain=VAR_DOMAIN)
        at ../../src/gdb/symtab.c:2558
    #26 0x0000000000805125 in lookup_static_symbol (name=0x30b2e30 "f", domain=VAR_DOMAIN) at ../../src/gdb/symtab.c:2595
    #27 0x0000000000804357 in lookup_symbol_aux (name=0x30b2e30 "f", match_type=symbol_name_match_type::FULL, block=0x0,
        domain=VAR_DOMAIN, language=language_c, is_a_field_of_this=0x0) at ../../src/gdb/symtab.c:2105
    #28 0x0000000000803ad9 in lookup_symbol_in_language (name=0x30b2e30 "f", block=0x0, domain=VAR_DOMAIN, lang=language_c,
        is_a_field_of_this=0x0) at ../../src/gdb/symtab.c:1887
    #29 0x0000000000803b53 in lookup_symbol (name=0x30b2e30 "f", block=0x0, domain=VAR_DOMAIN, is_a_field_of_this=0x0)
        at ../../src/gdb/symtab.c:1899
    #30 0x000000000053b246 in classify_name (par_state=0x7fff387f6090, block=0x0, is_quoted_name=false, is_after_structop=false)
        at ../../src/gdb/c-exp.y:2879
    #31 0x000000000053b7e9 in c_yylex () at ../../src/gdb/c-exp.y:3083
    #32 0x000000000053414a in c_yyparse () at c-exp.c:1903
    #33 0x000000000053c2e7 in c_parse (par_state=0x7fff387f6090) at ../../src/gdb/c-exp.y:3255
    #34 0x0000000000774a02 in parse_exp_in_context_1 (stringptr=0x7fff387f61c0, pc=0, block=0x0, comma=0, void_context_p=0, out_subexp=0x0)
        at ../../src/gdb/parse.c:1213
    #35 0x000000000077476a in parse_exp_in_context (stringptr=0x7fff387f61c0, pc=0, block=0x0, comma=0, void_context_p=0, out_subexp=0x0)
        at ../../src/gdb/parse.c:1115
    #36 0x0000000000774714 in parse_exp_1 (stringptr=0x7fff387f61c0, pc=0, block=0x0, comma=0) at ../../src/gdb/parse.c:1106
    #37 0x0000000000774c53 in parse_expression (string=0x27ff996 "f") at ../../src/gdb/parse.c:1253
    #38 0x0000000000861dc4 in whatis_exp (exp=0x27ff996 "f", show=1) at ../../src/gdb/typeprint.c:472
    #39 0x00000000008620d8 in ptype_command (type_name=0x27ff996 "f", from_tty=1) at ../../src/gdb/typeprint.c:561
    #40 0x000000000047430b in do_const_cfunc (c=0x3012010, args=0x27ff996 "f", from_tty=1) at ../../src/gdb/cli/cli-decode.c:106
    #41 0x000000000047715e in cmd_func (cmd=0x3012010, args=0x27ff996 "f", from_tty=1) at ../../src/gdb/cli/cli-decode.c:1886
    #42 0x00000000008431bb in execute_command (p=0x27ff996 "f", from_tty=1) at ../../src/gdb/top.c:630
    #43 0x00000000006bf946 in command_handler (command=0x27ff990 "ptype f") at ../../src/gdb/event-top.c:583
    #44 0x00000000006bfd12 in command_line_handler (rl=0x30bb3a0 "\240\305\v\003") at ../../src/gdb/event-top.c:774

The problem is that in `process_die` (frames 14 and 16) we create a
`process_die_scope` object, that takes a copy of the `struct
dwarf2_cu *` passed into the frame.  The destructor of the
`process_die_scope` dereferences the stored pointer.  This wouldn't be
an issue, except...

... in dw2_do_instantiate_symtab (frame 19) a clean up was registered that
clears the dwarf2_queue in case of an error.  Part of this clean up
involves deleting the `struct dwarf2_cu`s referenced from the queue..

The problem then, is that cleanups are processed at the site of the
throw, while, class destructors are invoked as we unwind their frame.
The result is that we process the frame 19 cleanup (and delete the
struct dwarf2_cu) before we process the destructors in frames 14 and 16.
When we do get back to frames 14 and 16 the objects being references
have already been deleted.

The solution is to remove the cleanup from dw2_do_instantiate_symtab, and
instead use a destructor to release the dwarf2_queue instead.  With this
patch in place, the valgrind errors are now resolved.

gdb/ChangeLog:

	* dwarf2read.c (dwarf2_release_queue): Delete function, move body
	into...
	(class dwarf2_queue_guard): ...the destructor of this new class.
	(dw2_do_instantiate_symtab): Create instance of the new class
	dwarf2_queue_guard, remove cleanup.
tromey pushed a commit that referenced this issue Apr 20, 2018
A future patch will propose making the remote target's target_ops be
heap-allocated (to make it possible to have multiple instances of
remote targets, for multiple simultaneous connections), and will
delete/destroy the remote target at target_close time.

That change trips on a latent problem, though.  File I/O handles
remain open even after the target is gone, with a dangling pointer to
a target that no longer exists.  This results in GDB crashing when it
calls the target_ops backend associated with the file handle:

 (gdb) Disconnect
 Ending remote debugging.
 * GDB crashes deferencing a dangling pointer

Backtrace:

   #0  0x00007f79338570a0 in main_arena () at /lib64/libc.so.6
   #1  0x0000000000858bfe in target_fileio_close(int, int*) (fd=1, target_errno=0x7ffe0499a4c8)
       at src/gdb/target.c:2980
   #2  0x00000000007088bd in gdb_bfd_iovec_fileio_close(bfd*, void*) (abfd=0x1a631b0, stream=0x223c9d0)
       at src/gdb/gdb_bfd.c:353
   #3  0x0000000000930906 in opncls_bclose (abfd=0x1a631b0) at src/bfd/opncls.c:528
   #4  0x0000000000930cf9 in bfd_close_all_done (abfd=0x1a631b0) at src/bfd/opncls.c:768
   #5  0x0000000000930cb3 in bfd_close (abfd=0x1a631b0) at src/bfd/opncls.c:735
   #6  0x0000000000708dc5 in gdb_bfd_close_or_warn(bfd*) (abfd=0x1a631b0) at src/gdb/gdb_bfd.c:511
   #7  0x00000000007091a2 in gdb_bfd_unref(bfd*) (abfd=0x1a631b0) at src/gdb/gdb_bfd.c:615
   #8  0x000000000079ed8e in objfile::~objfile() (this=0x2154730, __in_chrg=<optimized out>)
       at src/gdb/objfiles.c:682
   #9  0x000000000079fd1a in objfile_purge_solibs() () at src/gdb/objfiles.c:1065
   #10 0x00000000008162ca in no_shared_libraries(char const*, int) (ignored=0x0, from_tty=1)
       at src/gdb/solib.c:1251
   #11 0x000000000073b89b in disconnect_command(char const*, int) (args=0x0, from_tty=1)
       at src/gdb/infcmd.c:3035

This goes unnoticed in current master, because the current remote
target's target_ops is never destroyed nowadays, so we end up calling:

  remote_hostio_close -> remote_hostio_send_command

which gracefully fails with FILEIO_ENOSYS if remote_desc is NULL
(because the target is closed).

Fix this by invalidating a target's file I/O handles when the target
is closed.

With this change, remote_hostio_send_command no longer needs to handle the
case of being called with a closed remote target, originally added here:
<https://sourceware.org/ml/gdb-patches/2008-08/msg00359.html>.

gdb/ChangeLog:
2018-04-11  Pedro Alves  <palves@redhat.com>

	* target.c (fileio_fh_t::t): Add comment.
	(target_fileio_pwrite, target_fileio_pread, target_fileio_fstat)
	(target_fileio_close): Handle a NULL target.
	(invalidate_fileio_fh): New.
	(target_close): Call it.
	* remote.c (remote_hostio_send_command): No longer check whether
	remote_desc is open.
tromey pushed a commit that referenced this issue May 22, 2018
At <https://sourceware.org/ml/gdb-patches/2017-12/msg00285.html>,
Maciej reported that commit:

  commit 5cd63fd
  Date: Wed Oct 4 18:21:10 2017 +0100
  Subject: Fix "Remote 'g' packet reply is too long" problems with multiple inferiors

made GDB stop working with older stubs.  Any attempt to continue
execution after the initial connection fails with:

  [...]
  Process .../gdb/testsuite/outputs/gdb.base/advance/advance created; pid = 2670
  Listening on port 2346
  target remote [...]:2346
  Remote debugging using [...]:2346
  Reading symbols from .../lib64/ld.so.1...done.
  [Switching to Thread <main>]
  (gdb) continue
  Cannot execute this command without a live selected thread.
  (gdb)

The problem is:

  (gdb) c
  Cannot execute this command without a live selected thread.
  (gdb) info threads
    Id   Target Id         Frame
    1    Thread 14917      0x00007f341cd98ed0 in _start () from /lib64/ld-linux-x86-64.so.2

  The current thread <Thread ID 2> has terminated.  See `help thread'.
		      ^^^^^^^^^^^
  (gdb)

Note, thread _2_.  There's really only one thread in the inferior
(it's still at the entry point), but still GDB added a bogus second
thread.

The reason GDB started adding a second thread after 5cd63fd is
this hunk:

+                 if (event->ptid == null_ptid)
+                   {
+                     const char *thr = strstr (p1 + 1, ";thread:");
+                     if (thr != NULL)
+                       event->ptid = read_ptid (thr + strlen (";thread:"),
+                                                NULL);
+                     else
+                       event->ptid = magic_null_ptid;
+                   }

Note the else branch that falls back to magic_null_ptid.  We reach
that when we process the initial stop reply sent back in response to
the the "?" (status) packet early in the connection setup:

 Sending packet: $?#3f...Ack
 Packet received: T0506:0000000000000000;07:40a510f4fd7f0000;10:d0fe1201577f0000;

And note that that response does not include a ";thread:XXX" part.

This stop reply is processed after listing threads with qfThreadInfo /
qsThreadInfo :

 Sending packet: $qfThreadInfo#bb...Ack
 Packet received: m3915
 Sending packet: $qsThreadInfo#c8...Ack
 Packet received: l

meaning, when we process that stop reply, we treat the event as coming
from a thread with ptid == magic_null_ptid, which is not yet in the
thread list, so we add it then:

  (top-gdb) p ptid
  $1 = {m_pid = 42000, m_lwp = -1, m_tid = 1}
  (top-gdb) bt
  #0  0x0000000000840a8c in add_thread_silent(ptid_t) (ptid=...) at src/gdb/thread.c:269
  #1  0x00000000007ad61d in remote_add_thread(ptid_t, int, int) (ptid=..., running=0, executing=0)
      at src/gdb/remote.c:1838
  #2  0x00000000007ad8de in remote_notice_new_inferior(ptid_t, int) (currthread=..., executing=0)
      at src/gdb/remote.c:1921
  #3  0x00000000007b758b in process_stop_reply(stop_reply*, target_waitstatus*) (stop_reply=0x1158860, status=0x7fffffffcc00)
      at src/gdb/remote.c:7217
  #4  0x00000000007b7a38 in remote_wait_as(ptid_t, target_waitstatus*, int) (ptid=..., status=0x7fffffffcc00, options=0)
      at src/gdb/remote.c:7380
  #5  0x00000000007b7cd1 in remote_wait(target_ops*, ptid_t, target_waitstatus*, int) (ops=0x102fac0 <remote_ops>, ptid=..., status=0x7fffffffcc00, options=0) at src/gdb/remote.c:7446
  #6  0x000000000081587b in delegate_wait(target_ops*, ptid_t, target_waitstatus*, int) (self=0x102fac0 <remote_ops>, arg1=..., arg2=0x7fffffffcc00, arg3=0) at src/gdb/target-delegates.c:138
  #7  0x0000000000827d77 in target_wait(ptid_t, target_waitstatus*, int) (ptid=..., status=0x7fffffffcc00, options=0)
      at src/gdb/target.c:2179
  #8  0x0000000000715fda in do_target_wait(ptid_t, target_waitstatus*, int) (ptid=..., status=0x7fffffffcc00, options=0)
      at src/gdb/infrun.c:3589
  #9  0x0000000000716351 in wait_for_inferior() () at src/gdb/infrun.c:3707
  #10 0x0000000000715435 in start_remote(int) (from_tty=1) at src/gdb/infrun.c:3212

things go downhill from this.

We don't see the problem with current master gdbserver, because that
version always sends the ";thread:" part in the initial stop reply:

 Sending packet: $?#3f...Packet received: T0506:0000000000000000;07:a0d4ffffff7f0000;10:d05eddf7ff7f0000;thread:p3cea.3cea;core:3;

Years ago I had added a "--disable-packet=" command line option to
gdbserver which comes in handy for testing this, since the existing
"--disable-packet=Tthread" precisely makes gdbserver not send that
";thread:" part in stop replies.  The testcase added by this commit
emulates old gdbserver making use of that.

I've compared a testrun at 5cd63fd^ (before regression) with
'current master+patch', against old gdbserver at f8b73d1^.  I
hacked out --once, and "monitor exit" to be able to test.  The results
are a bit too unstable to tell accurately, but it looked like there
were no regressions.  Maciej confirmed this worked for him as well.

No regressions on master (against master gdbserver).

gdb/ChangeLog:
2018-01-11  Pedro Alves  <palves@redhat.com>

	PR remote/22597
	* remote.c (remote_parse_stop_reply): Default to the last-set
	general thread instead of to 'magic_null_ptid'.

gdb/testsuite/ChangeLog:
2018-01-11  Pedro Alves  <palves@redhat.com>

	PR remote/22597
	* gdb.server/stop-reply-no-thread.c: New file.
	* gdb.server/stop-reply-no-thread.exp: New file.
tromey pushed a commit that referenced this issue Aug 27, 2018
…b/23379)

This commit fixes a 8.1->8.2 regression exposed by
gdb.python/py-evthreads.exp when testing with
--target_board=native-gdbserver.

gdb.log shows:

  src/gdb/thread.c:93: internal-error: thread_info* inferior_thread(): Assertion `tp' failed.
  A problem internal to GDB has been detected,
  further debugging may prove unreliable.
  Quit this debugging session? (y or n) FAIL: gdb.python/py-evthreads.exp: run to breakpoint 1 (GDB internal error)

A backtrace shows (frames #2 and #10 highlighted) that the assertion
fails when GDB is setting up the connection to the remote target, in
non-stop mode:

  #0  0x0000000000622ff0 in internal_error(char const*, int, char const*, ...) (file=0xc1ad98 "src/gdb/thread.c", line=93, fmt=0xc1ad20 "%s: Assertion `%s' failed.") at src/gdb/common/errors.c:54
  #1  0x000000000089567e in inferior_thread() () at src/gdb/thread.c:93
= #2  0x00000000004da91d in get_event_thread() () at src/gdb/python/py-threadevent.c:38
  #3  0x00000000004da9b7 in create_thread_event_object(_typeobject*, _object*) (py_type=0x11574c0 <continue_event_object_type>, thread=0x0)
      at src/gdb/python/py-threadevent.c:60
  #4  0x00000000004bf6fe in create_continue_event_object() () at src/gdb/python/py-continueevent.c:27
  #5  0x00000000004bf738 in emit_continue_event(ptid_t) (ptid=...) at src/gdb/python/py-continueevent.c:40
  #6  0x00000000004c7d47 in python_on_resume(ptid_t) (ptid=...) at src/gdb/python/py-inferior.c:108
  #7  0x0000000000485bfb in std::_Function_handler<void (ptid_t), void (*)(ptid_t)>::_M_invoke(std::_Any_data const&, ptid_t&&) (__functor=..., __args#0=...) at /usr/include/c++/7/bits/std_function.h:316
  #8  0x000000000089b416 in std::function<void (ptid_t)>::operator()(ptid_t) const (this=0x12aa600, __args#0=...)
      at /usr/include/c++/7/bits/std_function.h:706
  #9  0x000000000089aa0e in gdb::observers::observable<ptid_t>::notify(ptid_t) const (this=0x118a7a0 <gdb::observers::target_resumed>, args#0=...)
      at src/gdb/common/observable.h:106
= #10 0x0000000000896fbe in set_running(ptid_t, int) (ptid=..., running=1) at src/gdb/thread.c:880
  #11 0x00000000007f750f in remote_target::remote_add_thread(ptid_t, bool, bool) (this=0x12c5440, ptid=..., running=true, executing=true) at src/gdb/remote.c:2434
  #12 0x00000000007f779d in remote_target::remote_notice_new_inferior(ptid_t, int) (this=0x12c5440, currthread=..., executing=1)
      at src/gdb/remote.c:2515
  #13 0x00000000007f9c44 in remote_target::update_thread_list() (this=0x12c5440) at src/gdb/remote.c:3831
  #14 0x00000000007fb922 in remote_target::start_remote(int, int) (this=0x12c5440, from_tty=0, extended_p=0)
      at src/gdb/remote.c:4655
  #15 0x00000000007fd102 in remote_target::open_1(char const*, int, int) (name=0x1a4f45e "localhost:2346", from_tty=0, extended_p=0)
      at src/gdb/remote.c:5638
  #16 0x00000000007fbec1 in remote_target::open(char const*, int) (name=0x1a4f45e "localhost:2346", from_tty=0)
      at src/gdb/remote.c:4862

So on frame #10, we're marking a newly-discovered thread as running,
and that causes the Python API to emit a gdb.ContinueEvent.
gdb.ContinueEvent is a gdb.ThreadEvent, and as such includes the event
thread as the "inferior_thread" attribute.  The problem is that when
we get to frame #3/#4, we lost all references to the thread that is
being marked as running.  create_continue_event_object assumes that it
is the current thread, which is not true in this case.

Fix this by passing down the right thread in
create_continue_event_object.  Also remove
create_thread_event_object's default argument and have the only other
caller left pass down the right thread explicitly too.

gdb/ChangeLog:
2018-08-24  Pedro Alves  <palves@redhat.com>
	    Simon Marchi  <simon.marchi@ericsson.com>

	PR gdb/23379
	* python/py-continueevent.c: Include "gdbthread.h".
	(create_continue_event_object): Add intro comment.  Add 'ptid'
	parameter.  Use it to find thread to pass to
	create_thread_event_object.
	(emit_continue_event): Pass PTID down to
	create_continue_event_object.
	* python/py-event.h (py_get_event_thread): Declare.
	(create_thread_event_object): Remove default from 'thread'
	parameter.
	* python/py-stopevent.c (create_stop_event_object): Use
	py_get_event_thread.
	* python/py-threadevent.c (get_event_thread): Rename to ...
	(py_get_event_thread): ... this, make extern, add 'ptid' parameter
	and use it to find the thread.
	(create_thread_event_object): Assert that THREAD isn't null.
	Don't find the event thread here.
tromey pushed a commit that referenced this issue Nov 24, 2018
A following commit to make each inferior have its own thread list
exposes a problem with bf93d7b ("Add thread after updating gdbarch
when exec'ing"), which is that we can't defer adding the thread
because that breaks try_open_exec_file which deep inside ends up
calling inferior_thread():

 #5  0x0000000000637c78 in internal_error(char const*, int, char const*, ...) (file=0xc151f8 "src/gdb/thread.c", line=165, fmt=0xc15180 "%s: Assertion `%s' failed.") at src/gdb/common/errors.c:55
 #6  0x00000000008a3d80 in inferior_thread() () at src/gdb/thread.c:165
 #7  0x0000000000456f91 in try_thread_db_load_1(thread_db_info*) (info=0x277eb00) at src/gdb/linux-thread-db.c:830
 #8  0x0000000000457554 in try_thread_db_load(char const*, int) (library=0xb01a4f "libthread_db.so.1", check_auto_load_safe=0)
     at src/gdb/linux-thread-db.c:1002
 #9  0x0000000000457861 in try_thread_db_load_from_sdir() () at src/gdb/linux-thread-db.c:1079
 #10 0x0000000000457b72 in thread_db_load_search() () at src/gdb/linux-thread-db.c:1134
 #11 0x0000000000457d29 in thread_db_load() () at src/gdb/linux-thread-db.c:1192
 #12 0x0000000000457e51 in check_for_thread_db() () at src/gdb/linux-thread-db.c:1244
 #13 0x0000000000457ed2 in thread_db_new_objfile(objfile*) (objfile=0x270ff60) at src/gdb/linux-thread-db.c:1273
 #14 0x000000000045a92e in std::_Function_handler<void (objfile*), void (*)(objfile*)>::_M_invoke(std::_Any_data const&, objfile*&&) (__functor=..., __args#0=@0x7ffef3efe140: 0x270ff60) at /usr/include/c++/7/bits/std_function.h:316
 #15 0x00000000007bbebf in std::function<void (objfile*)>::operator()(objfile*) const (this=0x24e1d18, __args#0=0x270ff60)
     at /usr/include/c++/7/bits/std_function.h:706
 #16 0x00000000007bba86 in gdb::observers::observable<objfile*>::notify(objfile*) const (this=0x117ce80 <gdb::observers::new_objfile>, args#0=0x270ff60) at src/gdb/common/observable.h:106
 #17 0x0000000000856000 in symbol_file_add_with_addrs(bfd*, char const*, symfile_add_flags, section_addr_info*, objfile_flags, objfile*) (abfd=0x1d7dae0, name=0x254bfc0 "/ho

The problem is latent currently because inferior_thread() at that
point manages to return a thread, even though it's the wrong one (of
the old inferior).

The problem originally fixed by bf93d7b was:

    (...) we should avoid doing register reads
    after a process does an exec and before we've updated that inferior's
    gdbarch.  Otherwise, we may interpret the registers using the wrong
    architecture.

    (...) The call to "add_thread" done just after adding the inferior is
    problematic, because it ends up reading the registers (because the ptid
    is re-used, we end up doing a switch_to_thread to it, which tries to
    update stop_pc). (...)

The register-reading issue is no longer a problem nowadays, ever since
switch_to_thread stopped reading the stop_pc in git commit
f2ffa92 ("gdb: Eliminate the 'stop_pc' global").

So this commit basically reverts bf93d7b.

gdb/ChangeLog:
2018-11-22  Pedro Alves  <palves@redhat.com>

	* infrun.c (follow_exec) <set follow-exec new>: Add thread and
	switch to it before calling into try_open_exec_file.
tromey pushed a commit that referenced this issue Dec 14, 2018
…b/23379)

This commit fixes a 8.1->8.2 regression exposed by
gdb.python/py-evthreads.exp when testing with
--target_board=native-gdbserver.

gdb.log shows:

  src/gdb/thread.c:93: internal-error: thread_info* inferior_thread(): Assertion `tp' failed.
  A problem internal to GDB has been detected,
  further debugging may prove unreliable.
  Quit this debugging session? (y or n) FAIL: gdb.python/py-evthreads.exp: run to breakpoint 1 (GDB internal error)

A backtrace shows (frames #2 and #10 highlighted) that the assertion
fails when GDB is setting up the connection to the remote target, in
non-stop mode:

  #0  0x0000000000622ff0 in internal_error(char const*, int, char const*, ...) (file=0xc1ad98 "src/gdb/thread.c", line=93, fmt=0xc1ad20 "%s: Assertion `%s' failed.") at src/gdb/common/errors.c:54
  #1  0x000000000089567e in inferior_thread() () at src/gdb/thread.c:93
= #2  0x00000000004da91d in get_event_thread() () at src/gdb/python/py-threadevent.c:38
  #3  0x00000000004da9b7 in create_thread_event_object(_typeobject*, _object*) (py_type=0x11574c0 <continue_event_object_type>, thread=0x0)
      at src/gdb/python/py-threadevent.c:60
  #4  0x00000000004bf6fe in create_continue_event_object() () at src/gdb/python/py-continueevent.c:27
  #5  0x00000000004bf738 in emit_continue_event(ptid_t) (ptid=...) at src/gdb/python/py-continueevent.c:40
  #6  0x00000000004c7d47 in python_on_resume(ptid_t) (ptid=...) at src/gdb/python/py-inferior.c:108
  #7  0x0000000000485bfb in std::_Function_handler<void (ptid_t), void (*)(ptid_t)>::_M_invoke(std::_Any_data const&, ptid_t&&) (__functor=..., __args#0=...) at /usr/include/c++/7/bits/std_function.h:316
  #8  0x000000000089b416 in std::function<void (ptid_t)>::operator()(ptid_t) const (this=0x12aa600, __args#0=...)
      at /usr/include/c++/7/bits/std_function.h:706
  #9  0x000000000089aa0e in gdb::observers::observable<ptid_t>::notify(ptid_t) const (this=0x118a7a0 <gdb::observers::target_resumed>, args#0=...)
      at src/gdb/common/observable.h:106
= #10 0x0000000000896fbe in set_running(ptid_t, int) (ptid=..., running=1) at src/gdb/thread.c:880
  #11 0x00000000007f750f in remote_target::remote_add_thread(ptid_t, bool, bool) (this=0x12c5440, ptid=..., running=true, executing=true) at src/gdb/remote.c:2434
  #12 0x00000000007f779d in remote_target::remote_notice_new_inferior(ptid_t, int) (this=0x12c5440, currthread=..., executing=1)
      at src/gdb/remote.c:2515
  #13 0x00000000007f9c44 in remote_target::update_thread_list() (this=0x12c5440) at src/gdb/remote.c:3831
  #14 0x00000000007fb922 in remote_target::start_remote(int, int) (this=0x12c5440, from_tty=0, extended_p=0)
      at src/gdb/remote.c:4655
  #15 0x00000000007fd102 in remote_target::open_1(char const*, int, int) (name=0x1a4f45e "localhost:2346", from_tty=0, extended_p=0)
      at src/gdb/remote.c:5638
  #16 0x00000000007fbec1 in remote_target::open(char const*, int) (name=0x1a4f45e "localhost:2346", from_tty=0)
      at src/gdb/remote.c:4862

So on frame #10, we're marking a newly-discovered thread as running,
and that causes the Python API to emit a gdb.ContinueEvent.
gdb.ContinueEvent is a gdb.ThreadEvent, and as such includes the event
thread as the "inferior_thread" attribute.  The problem is that when
we get to frame #3/#4, we lost all references to the thread that is
being marked as running.  create_continue_event_object assumes that it
is the current thread, which is not true in this case.

Fix this by passing down the right thread in
create_continue_event_object.  Also remove
create_thread_event_object's default argument and have the only other
caller left pass down the right thread explicitly too.

gdb/ChangeLog:
2018-08-24  Pedro Alves  <palves@redhat.com>
	    Simon Marchi  <simon.marchi@ericsson.com>

	PR gdb/23379
	* python/py-continueevent.c: Include "gdbthread.h".
	(create_continue_event_object): Add intro comment.  Add 'ptid'
	parameter.  Use it to find thread to pass to
	create_thread_event_object.
	(emit_continue_event): Pass PTID down to
	create_continue_event_object.
	* python/py-event.h (py_get_event_thread): Declare.
	(create_thread_event_object): Remove default from 'thread'
	parameter.
	* python/py-stopevent.c (create_stop_event_object): Use
	py_get_event_thread.
	* python/py-threadevent.c (get_event_thread): Rename to ...
	(py_get_event_thread): ... this, make extern, add 'ptid' parameter
	and use it to find the thread.
	(create_thread_event_object): Assert that THREAD isn't null.
	Don't find the event thread here.
tromey pushed a commit that referenced this issue Nov 30, 2022
New in this version: add a dedicated test.

When I do this:

    $ ./gdb -nx --data-directory=data-directory -q \
        /bin/sleep \
	-ex "maint set target-non-stop on" \
	-ex "tar ext :1234" \
	-ex "set remote exec-file /bin/sleep" \
	-ex "run 1231 &" \
	-ex add-inferior \
	-ex "inferior 2"
    Reading symbols from /bin/sleep...
    (No debugging symbols found in /bin/sleep)
    Remote debugging using :1234
    Starting program: /bin/sleep 1231
    Reading /lib64/ld-linux-x86-64.so.2 from remote target...
    warning: File transfers from remote targets can be slow. Use "set sysroot" to access files locally instead.
    Reading /lib64/ld-linux-x86-64.so.2 from remote target...
    Reading /usr/lib/debug/.build-id/a6/7a1408f18db3576757eea210d07ba3fc560dff.debug from remote target...
    [New inferior 2]
    Added inferior 2 on connection 1 (extended-remote :1234)
    [Switching to inferior 2 [<null>] (<noexec>)]
    (gdb) Reading /lib/x86_64-linux-gnu/libc.so.6 from remote target...
    attach 3659848
    Attaching to process 3659848
    /home/smarchi/src/binutils-gdb/gdb/thread.c:85: internal-error: inferior_thread: Assertion `current_thread_ != nullptr' failed.

Note the "attach" command just above.  When doing it on the command-line
with a -ex switch, the bug doesn't trigger.

The internal error of GDB is actually caused by GDBserver crashing, and
the error recovery of GDB is not on point.  This patch aims to fix just
the GDBserver crash, not the GDB problem.

GDBserver crashes with a segfault here:

    (gdb) bt
    #0  0x00005555557fb3f4 in find_one_thread (ptid=...) at /home/smarchi/src/binutils-gdb/gdbserver/thread-db.cc:177
    #1  0x00005555557fd5cf in thread_db_thread_handle (ptid=<error reading variable: Cannot access memory at address 0xffffffffffffffa0>, handle=0x7fffffffc400, handle_len=0x7fffffffc3f0)
        at /home/smarchi/src/binutils-gdb/gdbserver/thread-db.cc:461
    #2  0x000055555578a0b6 in linux_process_target::thread_handle (this=0x5555558a64c0 <the_x86_target>, ptid=<error reading variable: Cannot access memory at address 0xffffffffffffffa0>, handle=0x7fffffffc400,
        handle_len=0x7fffffffc3f0) at /home/smarchi/src/binutils-gdb/gdbserver/linux-low.cc:6905
    #3  0x00005555556dfcc6 in handle_qxfer_threads_worker (thread=0x60b000000510, buffer=0x7fffffffc8a0) at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:1645
    #4  0x00005555556e00e6 in operator() (__closure=0x7fffffffc5e0, thread=0x60b000000510) at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:1696
    #5  0x00005555556f54be in for_each_thread<handle_qxfer_threads_proper(buffer*)::<lambda(thread_info*)> >(struct {...}) (func=...) at /home/smarchi/src/binutils-gdb/gdbserver/gdbthread.h:159
    #6  0x00005555556e0242 in handle_qxfer_threads_proper (buffer=0x7fffffffc8a0) at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:1694
    #7  0x00005555556e04ba in handle_qxfer_threads (annex=0x629000000213 "", readbuf=0x621000019100 '\276' <repeats 200 times>..., writebuf=0x0, offset=0, len=4097)
        at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:1732
    #8  0x00005555556e1989 in handle_qxfer (own_buf=0x629000000200 "qXfer:threads", packet_len=26, new_packet_len_p=0x7fffffffd630) at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:2045
    #9  0x00005555556e720a in handle_query (own_buf=0x629000000200 "qXfer:threads", packet_len=26, new_packet_len_p=0x7fffffffd630) at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:2685
    #10 0x00005555556f1a01 in process_serial_event () at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:4176
    #11 0x00005555556f4457 in handle_serial_event (err=0, client_data=0x0) at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:4514
    #12 0x0000555555820f56 in handle_file_event (file_ptr=0x607000000250, ready_mask=1) at /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:573
    #13 0x0000555555821895 in gdb_wait_for_event (block=1) at /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:694
    #14 0x000055555581f533 in gdb_do_one_event (mstimeout=-1) at /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:264
    #15 0x00005555556ec9fb in start_event_loop () at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:3512
    #16 0x00005555556f0769 in captured_main (argc=4, argv=0x7fffffffe0d8) at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:3992
    #17 0x00005555556f0e3f in main (argc=4, argv=0x7fffffffe0d8) at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:4078

The reason is a wrong current process when find_one_thread is called.
The current process is the 2nd one, which was just attached.  It does
not yet have thread_db data (proc->priv->thread_db is nullptr).  As we
iterate on all threads of all process to fulfull the qxfer:threads:read
request, we get to a thread of process 1 for which we haven't read
thread_db information yet (lwp_info::thread_known is false), so we get
into find_one_thread.  find_one_thread uses
`current_process ()->priv->thread_db`, assuming the current process
matches the ptid passed as a parameter, which is wrong.  A segfault
happens when trying to dereference that thread_db pointer.

Fix this by making find_one_thread not assume what the current process /
current thread is.  If it needs to call into libthread_db, which we know
will try to read memory from the current process, then temporarily set
the current process.

In the case where the thread is already know and we return early, we
don't need to switch process.

Add a test to reproduce this specific situation.

Change-Id: I09b00883e8b73b7e5f89d0f47cb4e9c0f3d6caaa
Approved-By: Andrew Burgess <aburgess@redhat.com>
tromey pushed a commit that referenced this issue Dec 14, 2022
This commit changes the target_stack class from using a C style array
of 'target_ops *' to using a C++ std::array<target_ops_ref, ...>.  The
benefit of this change is that some of the reference counting of
target_ops objects is now done automatically.

This commit fixes a crash in gdb.python/py-inferior.exp where GDB
crashes at exit, leaving a core file behind.

The crash occurs in connpy_connection_dealloc, and is actually
triggered by this assert:

gdb_assert (conn_obj->target == nullptr);

Now a little aside...

    ... the assert is never actually printed, instead GDB crashes due
    to calling a pure virtual function.  The backtrace at the point of
    crash looks like this:

      #7  0x00007fef7e2cf747 in std::terminate() () from /lib64/libstdc++.so.6
      #8  0x00007fef7e2d0515 in __cxa_pure_virtual () from /lib64/libstdc++.so.6
      #9  0x0000000000de334d in target_stack::find_beneath (this=0x4934d78, t=0x2bda270 <the_dummy_target>) at ../../s>
      #10 0x0000000000df4380 in inferior::find_target_beneath (this=0x4934b50, t=0x2bda270 <the_dummy_target>) at ../.>
      #11 0x0000000000de2381 in target_ops::beneath (this=0x2bda270 <the_dummy_target>) at ../../src/gdb/target.c:3047
      #12 0x0000000000de68aa in target_ops::supports_terminal_ours (this=0x2bda270 <the_dummy_target>) at ../../src/gd>
      #13 0x0000000000dde6b9 in target_supports_terminal_ours () at ../../src/gdb/target.c:1112
      #14 0x0000000000ee55f1 in internal_vproblem(internal_problem *, const char *, int, const char *, typedef __va_li>

    Notice in frame #12 we called target_ops::supports_terminal_ours,
    however, this is the_dummy_target, which is of type dummy_target,
    and so we should have called dummy_target::supports_terminal_ours.
    I believe the reason we ended up in the wrong implementation of
    supports_terminal_ours (which is a virtual function) is because we
    made the call during GDB's shut-down, and, I suspect, the vtables
    were in a weird state.

    Anyway, the point of this patch is not to fix GDB's ability to
    print an assert during exit, but to address the root cause of the
    assert.  With that aside out of the way, we can return to the main
    story...

Connections are represented in Python with gdb.TargetConnection
objects (or its sub-classes).  The assert in question confirms that
when a gdb.TargetConnection is deallocated, the underlying GDB
connection has itself been removed from GDB.  If this is not true then
we risk creating multiple different gdb.TargetConnection objects for
the same connection, which would be bad.

To ensure that we have one gdb.TargetConnection object for each
connection, the all_connection_objects map exists, this maps the
process_stratum_target object (the connection) to the
gdb.TargetConnection object that represents the connection.

When a connection is removed in GDB the connection_removed observer
fires, which we catch with connpy_connection_removed, this function
then sets conn_obj->target to nullptr, and removes the corresponding
entry from the all_connection_objects map.

The first issue here is that connpy_connection_dealloc is being called
as part of GDB's exit code, which is run after the Python interpreter
has been shut down.  The connpy_connection_dealloc function is used to
deallocate the gdb.TargetConnection Python object.  Surely it is
wrong for us to be deallocating Python objects after the interpreter
has been shut down.

The reason why connpy_connection_dealloc is called during GDB's exit
is that the global all_connection_objects map is still holding a
reference to the gdb.TargetConnection object.  When the map is
destroyed during GDB's exit, the gdb.TargetConnection objects within
the map can finally be deallocated.

The reason why all_connection_objects has contents when GDB exits, and
the reason the assert fires, is that, when GDB exits, there are still
some connections that have not yet been removed from GDB, that is,
they have a non-zero reference count.

If we take a look at quit_force (top.c) you can see that, for each
inferior, we call pop_all_targets before we (later in the function)
call do_final_cleanups.  It is the do_final_cleanups call that is
responsible for shutting down the Python interpreter.  The
pop_all_targets calls should, in theory, cause all the connections to
be removed from GDB.

That this isn't working indicates that some targets have a non-zero
reference count even after this final pop_all_targets call, and
indeed, when I debug GDB, that is what I see.

I tracked the problem down to delete_inferior where we do some house
keeping, and then delete the inferior object, which calls
inferior::~inferior.

In neither delete_inferior or inferior::~inferior do we call
pop_all_targets, and it is this missing call that means we leak some
references to the target_ops objects on the inferior's target_stack.

In this commit I will provide a partial fix for the problem.  I say
partial fix, but this will actually be enough to resolve the crash.
In a later commit I will provide the final part of the fix.

As mentioned at the start of the commit message, this commit changes
the m_stack in target_stack to hold target_ops_ref objects.  This
means that when inferior::~inferior is called, and m_stack is
released, we automatically decrement the target_ops reference count.
With this change in place we no longer leak any references, and now,
in quit_force the final pop_all_targets calls will release the final
references.  This means that the targets will be correctly closed at
this point, which means the connections will be removed from GDB and
the Python objects deallocated before the Python interpreter shuts
down.

There's a slight oddity in target_stack::unpush, where we std::move
the reference out of m_stack like this:

  auto ref = std::move (m_stack[stratum]);

the `ref' isn't used explicitly, but it serves to hold the
target_ops_ref until the end of the scope while allowing the m_stack
entry to be reset back to nullptr.  The alternative would be to
directly set the m_stack entry to nullptr, like this:

  m_stack[stratum] = nullptr;

The problem here is that when we set the m_stack entry to nullptr we
first decrement the target_ops reference count, and then set the array
entry to nullptr.

If the decrement means that the target_ops object reaches a zero
reference count then the target_ops object will be closed by calling
target_close.  In target_close we ensure that the target being closed
is not in any inferiors target_stack.

As we decrement before clearing, then this check in target_close will
fail, and an assert will trigger.

By using std::move to move the reference out of m_stack, this clears
the m_stack entry, meaning the inferior no longer contains the
target_ops in its target_stack.  Now when the REF object goes out of
scope and the reference count is decremented, target_close can run
successfully.

I've made use of the Python connection_removed listener API to add a
test for this issue.  The test installs a listener and then causes
delete_inferior to be called, we can then see that the connection is
then correctly removed (because the listener triggers).
tromey pushed a commit that referenced this issue Jan 6, 2023
I spotted a problem with scoped_debug_start_end's move constructor.
When constructing a scoped_debug_start_end through it, it doesn't
disable the moved-from object, meaning there are now two objects that
will do the side-effects of decrementing the debug_print_depth global
and printing the "end" message.  Decrementing the debug_print_depth
global twice is actually problematic, because the increments and
decrements get out of sync, meaning we should hit this assertion, in
theory:

    gdb_assert (debug_print_depth > 0);

However, in practice, we don't see that.  This is because despite the
move constructor being required for this to compile:

    template<typename PT>
    static inline scoped_debug_start_end<PT &> ATTRIBUTE_NULL_PRINTF (6, 7)
    make_scoped_debug_start_end (PT &&pred, const char *module, const char *func,
    			     const char *start_prefix,
    			     const char *end_prefix, const char *fmt, ...)
    {
      va_list args;
      va_start (args, fmt);
      auto res = scoped_debug_start_end<PT &> (pred, module, func, start_prefix,
    					   end_prefix, fmt, args);
      va_end (args);

      return res;
    }

... it is never actually called, because compilers elide the move
constructors all the way (the scoped_debug_start_end gets constructed
directly in the instance of the top-level caller).  To confirm this, I
built GDB with -fno-elide-constructors, and now I see it:

    /home/simark/src/binutils-gdb/gdb/../gdbsupport/common-debug.h:147: internal-error: ~scoped_debug_start_end: Assertion `debug_print_depth > 0' failed.

    #9  0x00005614ba5f17c3 in internal_error_loc (file=0x5614b8749960 "/home/simark/src/binutils-gdb/gdb/../gdbsupport/common-debug.h", line=147, fmt=0x5614b8733fa0 "%s: Assertion `%s' failed.") at /home/simark/src/binutils-gdb/gdbsupport/errors.cc:58
    #10 0x00005614b8e1b2e5 in scoped_debug_start_end<bool&>::~scoped_debug_start_end (this=0x7ffc6c5e7b40, __in_chrg=<optimized out>) at /home/simark/src/binutils-gdb/gdb/../gdbsupport/common-debug.h:147
    #11 0x00005614b96dbe34 in make_scoped_debug_start_end<bool&> (pred=@0x5614baad7200: true, module=0x5614b891d840 "infrun", func=0x5614b891d800 "infrun_debug_show_threads", start_prefix=0x5614b891d7c0 "enter", end_prefix=0x5614b891d780 "exit", fmt=0x0) at /home/simark/src/binutils-gdb/gdb/../gdbsupport/common-debug.h:235

Fix this by adding an m_disabled field to scoped_debug_start_end, and
setting it in the move constructor.

Change-Id: Ie5213269c584837f751d2d11de831f45ae4a899f
tromey pushed a commit that referenced this issue Jan 21, 2023
I would like to improve frame_info_ptr to automatically grab the
information needed to reinflate a frame, and automatically reinflate it
as needed.  One thing that is in the way is the fact that some frames
can be created out of thin air by the create_new_frame function.  These
frames are not the fruit of unwinding from the target's current frame.
These frames are created by the "select-frame view" command.

These frames are not correctly handled by the frame save/restore
functions, save_selected_frame, restore_selected_frame and
lookup_selected_frame.  This can be observed here, using the test
included in this patch:

    $ ./gdb --data-directory=data-directory -nx -q testsuite/outputs/gdb.base/frame-view/frame-view
    Reading symbols from testsuite/outputs/gdb.base/frame-view/frame-view...
    (gdb) break thread_func
    Breakpoint 1 at 0x11a2: file /home/simark/src/binutils-gdb/gdb/testsuite/gdb.base/frame-view.c, line 42.
    (gdb) run
    Starting program: /home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.base/frame-view/frame-view

    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/usr/lib/../lib/libthread_db.so.1".
    [New Thread 0x7ffff7cc46c0 (LWP 4171134)]
    [Switching to Thread 0x7ffff7cc46c0 (LWP 4171134)]

    Thread 2 "frame-view" hit Breakpoint 1, thread_func (p=0x0) at /home/simark/src/binutils-gdb/gdb/testsuite/gdb.base/frame-view.c:42
    42        foo (11);
    (gdb) info frame
    Stack level 0, frame at 0x7ffff7cc3ee0:
     rip = 0x5555555551a2 in thread_func (/home/simark/src/binutils-gdb/gdb/testsuite/gdb.base/frame-view.c:42); saved rip = 0x7ffff7d4e8fd
     called by frame at 0x7ffff7cc3f80
     source language c.
     Arglist at 0x7ffff7cc3ed0, args: p=0x0
     Locals at 0x7ffff7cc3ed0, Previous frame's sp is 0x7ffff7cc3ee0
     Saved registers:
      rbp at 0x7ffff7cc3ed0, rip at 0x7ffff7cc3ed8
    (gdb) thread 1
    [Switching to thread 1 (Thread 0x7ffff7cc5740 (LWP 4171122))]
    #0  0x00007ffff7d4b4b6 in ?? () from /usr/lib/libc.so.6

Here, we create a custom frame for thread 1 (using the stack from thread
2, for convenience):

    (gdb) select-frame view 0x7ffff7cc3f80 0x5555555551a2

The first calls to "frame" looks good:

    (gdb) frame
    #0  thread_func (p=0x7ffff7d4e630) at /home/simark/src/binutils-gdb/gdb/testsuite/gdb.base/frame-view.c:42
    42        foo (11);

But not the second one:

    (gdb) frame
    #0  0x00007ffff7d4b4b6 in ?? () from /usr/lib/libc.so.6

This second "frame" command shows the current target frame instead of
the user-created frame.

It's not totally clear how the "select-frame view" feature is expected
to behave, especially since it's not tested.  I heard accounts that it
used to be possible to select a frame like this and do "up" and "down"
to navigate the backtrace starting from that frame.  The fact that
create_new_frame calls frame_unwind_find_by_frame to install the right
unwinder suggest that it used to be possible.  But that doesn't work
today:

    (gdb) select-frame view 0x7ffff7cc3f80 0x5555555551a2
    (gdb) up
    Initial frame selected; you cannot go up.
    (gdb) down
    Bottom (innermost) frame selected; you cannot go down.

and "backtrace" always shows the actual thread's backtrace, it ignores
the user-created frame:

    (gdb) bt
    #0  0x00007ffff7d4b4b6 in ?? () from /usr/lib/libc.so.6
    #1  0x00007ffff7d50403 in ?? () from /usr/lib/libc.so.6
    #2  0x000055555555521a in main () at /home/simark/src/binutils-gdb/gdb/testsuite/gdb.base/frame-view.c:56

I don't want to address all the `select-frame view` issues , but I think
we can agree that the "frame" command changing the selected frame, as
shown above, is a bug.  I would expect that command to show the
currently selected frame and not change it.

This happens because of the scoped_restore_selected_frame object in
print_frame_args.  The frame information is saved in the constructor
(the backtrace below), and restored in the destructor.

    #0  save_selected_frame (frame_id=0x7ffdc0020ad0, frame_level=0x7ffdc0020af0) at /home/simark/src/binutils-gdb/gdb/frame.c:1682
    #1  0x00005631390242f0 in scoped_restore_selected_frame::scoped_restore_selected_frame (this=0x7ffdc0020ad0) at /home/simark/src/binutils-gdb/gdb/frame.c:324
    #2  0x000056313993581e in print_frame_args (fp_opts=..., func=0x62100023bde0, frame=..., num=-1, stream=0x60b000000300) at /home/simark/src/binutils-gdb/gdb/stack.c:755
    #3  0x000056313993ad49 in print_frame (fp_opts=..., frame=..., print_level=1, print_what=SRC_AND_LOC, print_args=1, sal=...) at /home/simark/src/binutils-gdb/gdb/stack.c:1401
    #4  0x000056313993835d in print_frame_info (fp_opts=..., frame=..., print_level=1, print_what=SRC_AND_LOC, print_args=1, set_current_sal=1) at /home/simark/src/binutils-gdb/gdb/stack.c:1126
    #5  0x0000563139932e0b in print_stack_frame (frame=..., print_level=1, print_what=SRC_AND_LOC, set_current_sal=1) at /home/simark/src/binutils-gdb/gdb/stack.c:368
    #6  0x0000563139932bbe in print_stack_frame_to_uiout (uiout=0x611000016840, frame=..., print_level=1, print_what=SRC_AND_LOC, set_current_sal=1) at /home/simark/src/binutils-gdb/gdb/stack.c:346
    #7  0x0000563139b0641e in print_selected_thread_frame (uiout=0x611000016840, selection=...) at /home/simark/src/binutils-gdb/gdb/thread.c:1993
    #8  0x0000563139940b7f in frame_command_core (fi=..., ignored=true) at /home/simark/src/binutils-gdb/gdb/stack.c:1871
    #9  0x000056313994db9e in frame_command_helper<frame_command_core>::base_command (arg=0x0, from_tty=1) at /home/simark/src/binutils-gdb/gdb/stack.c:1976

Since the user-created frame has level 0 (identified by the saved level
-1), lookup_selected_frame just reselects the target's current frame,
and the user-created frame is lost.

My goal here is to fix this particular problem.

Currently, select_frame does not set selected_frame_id and
selected_frame_level for frames with level 0.  It leaves them at
null_frame_id / -1, indicating to restore_selected_frame to use the
target's current frame.  User-created frames also have level 0, so add a
special case them such that select_frame saves their selected id and
level.

save_selected_frame does not need any change.

Change the assertion in restore_selected_frame that checks `frame_level
!= 0` to account for the fact that we can restore user-created frames,
which have level 0.

Finally, change lookup_selected_frame to make it able to re-create
user-created frame_info objects from selected_frame_level and
selected_frame_id.

Add a minimal test case for the case described above, that is the
"select-frame view" command followed by the "frame" command twice.  In
order to have a known stack frame to switch to, the test spawns a second
thread, and tells the first thread to use the other thread's top frame.

Change-Id: Ifc77848dc465fbd21324b9d44670833e09fe98c7
Reviewed-By: Bruno Larsen <blarsen@redhat.com>
tromey pushed a commit that referenced this issue Feb 9, 2023
…ames to the frame cache

The test gdb.base/frame-view.exp fails like this on AArch64:

    frame^M
    #0  baz (z1=hahaha, /home/simark/src/binutils-gdb/gdb/value.c:4056: internal-error: value_fetch_lazy_register: Assertion `next_frame != NULL' failed.^M
    A problem internal to GDB has been detected,^M
    further debugging may prove unreliable.^M
    FAIL: gdb.base/frame-view.exp: with_pretty_printer=true: frame (GDB internal error)

The sequence of events leading to this is the following:

 - When we create the user frame (the "select-frame view" command), we
   create a sentinel frame just for our user-created frame, in
   create_new_frame.  This sentinel frame has the same id as the regular
   sentinel frame.

 - When printing the frame, after doing the "select-frame view" command,
   the argument's pretty printer is invoked, which does an inferior
   function call (this is the point of the test).  This clears the frame
   cache, including the "real" sentinel frame, which sets the
   sentinel_frame global to nullptr.

 - Later in the frame-printing process (when printing the second
   argument), the auto-reinflation mechanism re-creates the user frame
   by calling create_new_frame again, creating its own special sentinel
   frame again.  However, note that the "real" sentinel frame, the
   sentinel_frame global, is still nullptr.  If the selected frame had
   been a regular frame, we would have called get_current_frame at some
   point during the reinflation, which would have re-created the "real"
   sentinel frame.  But it's not the case when reinflating a user frame.

 - Deep down the stack, something wants to fill in the unwind stop
   reason for frame 0, which requires trying to unwind frame 1.  This
   leads us to trying to unwind the PC of frame 1:

     #0  gdbarch_unwind_pc (gdbarch=0xffff8d010080, next_frame=...) at /home/simark/src/binutils-gdb/gdb/gdbarch.c:2955
     #1  0x000000000134569c in dwarf2_tailcall_sniffer_first (this_frame=..., tailcall_cachep=0xffff773fcae0, entry_cfa_sp_offsetp=0xfffff7f7d450)
         at /home/simark/src/binutils-gdb/gdb/dwarf2/frame-tailcall.c:390
     #2  0x0000000001355d84 in dwarf2_frame_cache (this_frame=..., this_cache=0xffff773fc928) at /home/simark/src/binutils-gdb/gdb/dwarf2/frame.c:1089
     #3  0x00000000013562b0 in dwarf2_frame_unwind_stop_reason (this_frame=..., this_cache=0xffff773fc928) at /home/simark/src/binutils-gdb/gdb/dwarf2/frame.c:1101
     #4  0x0000000001990f64 in get_prev_frame_always_1 (this_frame=...) at /home/simark/src/binutils-gdb/gdb/frame.c:2281
     #5  0x0000000001993034 in get_prev_frame_always (this_frame=...) at /home/simark/src/binutils-gdb/gdb/frame.c:2376
     #6  0x000000000199b814 in get_frame_unwind_stop_reason (frame=...) at /home/simark/src/binutils-gdb/gdb/frame.c:3051
     #7  0x0000000001359cd8 in dwarf2_frame_cfa (this_frame=...) at /home/simark/src/binutils-gdb/gdb/dwarf2/frame.c:1356
     #8  0x000000000132122c in dwarf_expr_context::execute_stack_op (this=0xfffff7f80170, op_ptr=0xffff8d8883ee "\217\002", op_end=0xffff8d8883ee "\217\002")
         at /home/simark/src/binutils-gdb/gdb/dwarf2/expr.c:2110
     #9  0x0000000001317b30 in dwarf_expr_context::eval (this=0xfffff7f80170, addr=0xffff8d8883ed "\234\217\002", len=1) at /home/simark/src/binutils-gdb/gdb/dwarf2/expr.c:1239
     #10 0x000000000131d68c in dwarf_expr_context::execute_stack_op (this=0xfffff7f80170, op_ptr=0xffff8d88840e "", op_end=0xffff8d88840e "") at /home/simark/src/binutils-gdb/gdb/dwarf2/expr.c:1811
     #11 0x0000000001317b30 in dwarf_expr_context::eval (this=0xfffff7f80170, addr=0xffff8d88840c "\221p", len=2) at /home/simark/src/binutils-gdb/gdb/dwarf2/expr.c:1239
     #12 0x0000000001314c3c in dwarf_expr_context::evaluate (this=0xfffff7f80170, addr=0xffff8d88840c "\221p", len=2, as_lval=true, per_cu=0xffff90b03700, frame=..., addr_info=0x0,
         type=0xffff8f6c8400, subobj_type=0xffff8f6c8400, subobj_offset=0) at /home/simark/src/binutils-gdb/gdb/dwarf2/expr.c:1078
     #13 0x000000000149f9e0 in dwarf2_evaluate_loc_desc_full (type=0xffff8f6c8400, frame=..., data=0xffff8d88840c "\221p", size=2, per_cu=0xffff90b03700, per_objfile=0xffff9070b980,
         subobj_type=0xffff8f6c8400, subobj_byte_offset=0, as_lval=true) at /home/simark/src/binutils-gdb/gdb/dwarf2/loc.c:1513
     #14 0x00000000014a0100 in dwarf2_evaluate_loc_desc (type=0xffff8f6c8400, frame=..., data=0xffff8d88840c "\221p", size=2, per_cu=0xffff90b03700, per_objfile=0xffff9070b980, as_lval=true)
         at /home/simark/src/binutils-gdb/gdb/dwarf2/loc.c:1557
     #15 0x00000000014aa584 in locexpr_read_variable (symbol=0xffff8f6cd770, frame=...) at /home/simark/src/binutils-gdb/gdb/dwarf2/loc.c:3052

 - AArch64 defines a special "prev register" function,
   aarch64_dwarf2_prev_register, to handle unwinding the PC.  This
   function does

     frame_unwind_register_unsigned (this_frame, AARCH64_LR_REGNUM);

 - frame_unwind_register_unsigned ultimately creates a lazy register
   value, saving the frame id of this_frame->next.  this_frame is the
   user-created frame, to this_frame->next is the special sentinel frame
   we created for it.  So the saved ID is the sentinel frame ID.

 - When time comes to un-lazify the value, value_fetch_lazy_register
   calls frame_find_by_id, to find the frame with the ID we saved.

 - frame_find_by_id sees it's the sentinel frame ID, so returns the
   sentinel_frame global, which is, if you remember, nullptr.

 - We hit the `gdb_assert (next_frame != NULL)` assertion in
   value_fetch_lazy_register.

The issues I see here are:

 - The ID of the sentinel frame created for the user-created frame is
   not distinguishable from the ID of the regular sentinel frame.  So
   there's no way frame_find_by_id could find the right frame, in
   value_fetch_lazy_register.
 - Even if they had distinguishable IDs, sentinel frames created for
   user frames are not registered anywhere, so there's no easy way
   frame_find_by_id could find it.

This patch addresses these two issues:

 - Give sentinel frames created for user frames their own distinct IDs
 - Register sentinel frames in the frame cache, so they can be found
   with frame_find_by_id.

I initially had this split in two patches, but I then found that it was
easier to explain as a single patch.

Rergarding the first part of the change: with this patch, the sentinel
frames created for user frames (in create_new_frame) still have
stack_status == FID_STACK_SENTINEL, but their code_addr and stack_addr
fields are now filled with the addresses used to create the user frame.
This ensures this sentinel frame ID is different from the "target"
sentinel frame ID, as well as any other "user" sentinel frame ID.  If
the user tries to create the same frame, with the same addresses,
multiple times, create_sentinel_frame just reuses the existing frame.
So we won't end up with multiple user sentinels with the same ID.

Regular "target" sentinel frames remain with code_addr and stack_addr
unset.

The concrete changes for that part are:

 - Remove the sentinel_frame_id constant, since there isn't one
   "sentinel frame ID" now.  Add the frame_id_build_sentinel function
   for building sentinel frame IDs and a is_sentinel_frame_id function
   to check if a frame id represents a sentinel frame.
 - Replace the sentinel_frame_id check in frame_find_by_id with a
   comparison to `frame_id_build_sentinel (0, 0)`.  The sentinel_frame
   global is meant to contain a reference to the "target" sentinel, so
   the one with addresses (0, 0).
 - Add stack and code address parameters to create_sentinel_frame, to be
   able to create the various types of sentinel frames.
 - Adjust get_current_frame to create the regular "target" sentinel.
 - Adjust create_new_frame to create a sentinel with the ID specific to
   the created user frame.
 - Adjust sentinel_frame_prev_register to get the sentinel frame ID from
   the frame_info object, since there isn't a single "sentinel frame ID"
   now.
 - Change get_next_frame_sentinel_okay to check for a
   sentinel-frame-id-like frame ID, rather than for sentinel_frame
   specifically, since this function could be called with another
   sentinel frame (and we would want the assert to catch it).

The rest of the change is about registering the sentinel frame in the
frame cache:

 - Change frame_stash_add's assertion to allow sentinel frame levels
   (-1).
 - Make create_sentinel_frame add the frame to the frame cache.
 - Change the "sentinel_frame != NULL" check in reinit_frame_cache for a
   check that the frame stash is not empty.  The idea is that if we only
   have some user-created frames in the cache when reinit_frame_cache is
   called, we probably want to emit the frames invalid annotation.  The
   goal of that check is to avoid unnecessary repeated annotations, I
   suppose, so the "frame cache not empty" check should achieve that.

After this change, I think we could theoritically get rid of the
sentienl_frame global.  That sentinel frame could always be found by
looking up `frame_id_build_sentinel (0, 0)` in the frame cache.
However, I left the global there to avoid slowing the typical case down
for nothing.  I however, noted in its comment that it is an
optimization.

With this fix applied, the gdb.base/frame-view.exp now passes for me on
AArch64.  value_of_register_lazy now saves the special sentinel frame ID
in the value, and value_fetch_lazy_register is able to find that
sentinel frame after the frame cache reinit and after the user-created
frame was reinflated.

Tested-By: Alexandra Petlanova Hajkova <ahajkova@redhat.com>
Tested-By: Luis Machado <luis.machado@arm.com>
Change-Id: I8b77b3448822c8aab3e1c3dda76ec434eb62704f
tromey pushed a commit that referenced this issue Feb 15, 2023
Tom de Vries reported [1] a regression in gdb.btrace/record_goto.exp
caused by 6d3717d ("gdb: call frame unwinders' dealloc_cache methods
through destroying the frame cache").  This issue is caught by ASan.  On
a non-ASan build, it may or may not cause a crash or some other issue, I
haven't tried.

I managed to narrow it down to:

    $ ./gdb -nx -q --data-directory=data-directory testsuite/outputs/gdb.btrace/record_goto/record_goto -ex "start" -ex "record btrace" -ex "next"

... and then doing repeatedly "record goto 19" and "record goto 27".
Eventually, I get:

    (gdb) record goto 27
    =================================================================
    ==1527735==ERROR: AddressSanitizer: heap-use-after-free on address 0x6210003392a8 at pc 0x55e4c26eef86 bp 0x7ffd229f24e0 sp 0x7ffd229f24d8
    READ of size 8 at 0x6210003392a8 thread T0
        #0 0x55e4c26eef85 in bfcache_eq /home/simark/src/binutils-gdb/gdb/record-btrace.c:1639
        #1 0x55e4c37cdeff in htab_find_slot_with_hash /home/simark/src/binutils-gdb/libiberty/hashtab.c:659
        #2 0x55e4c37ce24a in htab_find_slot /home/simark/src/binutils-gdb/libiberty/hashtab.c:703
        #3 0x55e4c26ef0c6 in bfcache_new /home/simark/src/binutils-gdb/gdb/record-btrace.c:1653
        #4 0x55e4c26f1242 in record_btrace_frame_sniffer /home/simark/src/binutils-gdb/gdb/record-btrace.c:1820
        #5 0x55e4c1b926a1 in frame_unwind_try_unwinder /home/simark/src/binutils-gdb/gdb/frame-unwind.c:136
        #6 0x55e4c1b930d7 in frame_unwind_find_by_frame(frame_info_ptr, void**) /home/simark/src/binutils-gdb/gdb/frame-unwind.c:196
        #7 0x55e4c1bb867f in get_frame_type(frame_info_ptr) /home/simark/src/binutils-gdb/gdb/frame.c:2925
        #8 0x55e4c2ae6798 in print_frame_info(frame_print_options const&, frame_info_ptr, int, print_what, int, int) /home/simark/src/binutils-gdb/gdb/stack.c:1049
        #9 0x55e4c2ade3e1 in print_stack_frame(frame_info_ptr, int, print_what, int) /home/simark/src/binutils-gdb/gdb/stack.c:367
        #10 0x55e4c26fda03 in record_btrace_set_replay /home/simark/src/binutils-gdb/gdb/record-btrace.c:2779
        #11 0x55e4c26fddc3 in record_btrace_target::goto_record(unsigned long) /home/simark/src/binutils-gdb/gdb/record-btrace.c:2843
        #12 0x55e4c2de2bb2 in target_goto_record(unsigned long) /home/simark/src/binutils-gdb/gdb/target.c:4169
        #13 0x55e4c275ed98 in record_goto(char const*) /home/simark/src/binutils-gdb/gdb/record.c:372
        #14 0x55e4c275edba in cmd_record_goto /home/simark/src/binutils-gdb/gdb/record.c:383

    0x6210003392a8 is located 424 bytes inside of 4064-byte region [0x621000339100,0x62100033a0e0)
    freed by thread T0 here:
        #0 0x7f6ca34a5b6f in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:123
        #1 0x55e4c38a4c17 in rpl_free /home/simark/src/binutils-gdb/gnulib/import/free.c:44
        #2 0x55e4c1bbd378 in xfree<void> /home/simark/src/binutils-gdb/gdb/../gdbsupport/gdb-xfree.h:37
        #3 0x55e4c37d1b63 in call_freefun /home/simark/src/binutils-gdb/libiberty/obstack.c:103
        #4 0x55e4c37d25a2 in _obstack_free /home/simark/src/binutils-gdb/libiberty/obstack.c:280
        #5 0x55e4c1bad701 in reinit_frame_cache() /home/simark/src/binutils-gdb/gdb/frame.c:2112
        #6 0x55e4c27705a3 in registers_changed_ptid(process_stratum_target*, ptid_t) /home/simark/src/binutils-gdb/gdb/regcache.c:564
        #7 0x55e4c27708c7 in registers_changed_thread(thread_info*) /home/simark/src/binutils-gdb/gdb/regcache.c:573
        #8 0x55e4c26fd922 in record_btrace_set_replay /home/simark/src/binutils-gdb/gdb/record-btrace.c:2772
        #9 0x55e4c26fddc3 in record_btrace_target::goto_record(unsigned long) /home/simark/src/binutils-gdb/gdb/record-btrace.c:2843
        #10 0x55e4c2de2bb2 in target_goto_record(unsigned long) /home/simark/src/binutils-gdb/gdb/target.c:4169
        #11 0x55e4c275ed98 in record_goto(char const*) /home/simark/src/binutils-gdb/gdb/record.c:372
        #12 0x55e4c275edba in cmd_record_goto /home/simark/src/binutils-gdb/gdb/record.c:383

    previously allocated by thread T0 here:
        #0 0x7f6ca34a5e8f in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
        #1 0x55e4c0b55c60 in xmalloc /home/simark/src/binutils-gdb/gdb/alloc.c:57
        #2 0x55e4c37d1a6d in call_chunkfun /home/simark/src/binutils-gdb/libiberty/obstack.c:94
        #3 0x55e4c37d1c20 in _obstack_begin_worker /home/simark/src/binutils-gdb/libiberty/obstack.c:141
        #4 0x55e4c37d1ed7 in _obstack_begin /home/simark/src/binutils-gdb/libiberty/obstack.c:164
        #5 0x55e4c1bad728 in reinit_frame_cache() /home/simark/src/binutils-gdb/gdb/frame.c:2113
        #6 0x55e4c27705a3 in registers_changed_ptid(process_stratum_target*, ptid_t) /home/simark/src/binutils-gdb/gdb/regcache.c:564
        #7 0x55e4c27708c7 in registers_changed_thread(thread_info*) /home/simark/src/binutils-gdb/gdb/regcache.c:573
        #8 0x55e4c26fd922 in record_btrace_set_replay /home/simark/src/binutils-gdb/gdb/record-btrace.c:2772
        #9 0x55e4c26fddc3 in record_btrace_target::goto_record(unsigned long) /home/simark/src/binutils-gdb/gdb/record-btrace.c:2843
        #10 0x55e4c2de2bb2 in target_goto_record(unsigned long) /home/simark/src/binutils-gdb/gdb/target.c:4169
        #11 0x55e4c275ed98 in record_goto(char const*) /home/simark/src/binutils-gdb/gdb/record.c:372
        #12 0x55e4c275edba in cmd_record_goto /home/simark/src/binutils-gdb/gdb/record.c:383

The problem is a stale entry in the bfcache hash table (in
record-btrace.c), left across a reinit_frame_cache.  This entry points
to something that used to be allocated on the frame obstack, that has
since been wiped by reinit_frame_cache.

Before the aforementioned, unwinder deallocation functions were called
by iterating on the frame chain, starting with the sentinel frame, like
so:

  /* Tear down all frame caches.  */
  for (frame_info *fi = sentinel_frame; fi != NULL; fi = fi->prev)
    {
      if (fi->prologue_cache && fi->unwind->dealloc_cache)
	fi->unwind->dealloc_cache (fi, fi->prologue_cache);
      if (fi->base_cache && fi->base->unwind->dealloc_cache)
	fi->base->unwind->dealloc_cache (fi, fi->base_cache);
    }

After that patch, we relied on the fact that all frames are (supposedly)
in the frame_stash.  A deletion function was added to the frame_stash
hash table, so that dealloc functions would be called when emptying the
frame stash.  There is one case, however, where a frame_info is not in
the frame stash.  That is when we create the frame_info for the current
frame (level 0, unwound from the sentinel frame), but don't compute its
frame id.  The computation of the frame id for that frame (and only that
frame, AFAIK) is done lazily.  And putting a frame_info in the frame stash
requires knowing its id.  So a frame 0 whose frame id is not computed
yet is necessarily not in the frame stash.

When replaying with btrace, record_btrace_frame_sniffer insert entries
corresponding to frames in the "bfcache" hash table.  It then relies on
record_btrace_frame_dealloc_cache being called for each frame to remove
all those entries when the frames get invalidated.  If a frame reinit
happens while frame 0's id is not computed  (and therefore that frame is
not in frame_stash), record_btrace_frame_dealloc_cache does not get
called for it, and it leaves a stale entry in bfcache.  That then leads
to a use-after-free when that entry is accessed later, which ASan
catches.

The proposed solution is to explicitly call frame_info_del on frame 0,
if it exists, and if its frame id is not computed.  If its frame id is
computed, it is expected that it will be in the frame stash, so it will
be "deleted" through that.

[1] https://inbox.sourceware.org/gdb-patches/20230130200249.131155-1-simon.marchi@efficios.com/T/#mcf1340ce2906a72ec7ed535ec0c97dba11c3d977

Reported-By: Tom de Vries <tdevries@suse.de>
Tested-By: Tom de Vries <tdevries@suse.de>
Change-Id: I2351882dd511f3bbc01e4152e9db13b69b3ba384
tromey pushed a commit that referenced this issue Feb 15, 2023
I noticed that if Ctrl-C was typed just while GDB is evaluating a
breakpoint condition in the background, and GDB ends up reaching out
to the Python interpreter, then the breakpoint condition would still
fail, like:

  c&
  Continuing.
  (gdb) Error in testing breakpoint condition:
  Quit

That happens because while evaluating the breakpoint condition, we
enter Python, and end up calling PyErr_SetInterrupt (it's called by
gdbpy_set_quit_flag, in frame #0):

 (top-gdb) bt
 #0  gdbpy_set_quit_flag (extlang=0x558c68f81900 <extension_language_python>) at ../../src/gdb/python/python.c:288
 #1  0x0000558c6845f049 in set_quit_flag () at ../../src/gdb/extension.c:785
 #2  0x0000558c6845ef98 in set_active_ext_lang (now_active=0x558c68f81900 <extension_language_python>) at ../../src/gdb/extension.c:743
 #3  0x0000558c686d3e56 in gdbpy_enter::gdbpy_enter (this=0x7fff2b70bb90, gdbarch=0x558c6ab9eac0, language=0x0) at ../../src/gdb/python/python.c:212
 #4  0x0000558c68695d49 in python_on_memory_change (inferior=0x558c6a830b00, addr=0x555555558014, len=4, data=0x558c6af8a610 "") at ../../src/gdb/python/py-inferior.c:146
 #5  0x0000558c6823a071 in std::__invoke_impl<void, void (*&)(inferior*, unsigned long, long, unsigned char const*), inferior*, unsigned long, long, unsigned char const*> (__f=@0x558c6a8ecd98: 0x558c68695d01 <python_on_memory_change(inferior*, CORE_ADDR, ssize_t, bfd_byte const*)>) at /usr/include/c++/11/bits/invoke.h:61
 #6  0x0000558c68237591 in std::__invoke_r<void, void (*&)(inferior*, unsigned long, long, unsigned char const*), inferior*, unsigned long, long, unsigned char const*> (__fn=@0x558c6a8ecd98: 0x558c68695d01 <python_on_memory_change(inferior*, CORE_ADDR, ssize_t, bfd_byte const*)>) at /usr/include/c++/11/bits/invoke.h:111
 #7  0x0000558c68233e64 in std::_Function_handler<void (inferior*, unsigned long, long, unsigned char const*), void (*)(inferior*, unsigned long, long, unsigned char const*)>::_M_invoke(std::_Any_data const&, inferior*&&, unsigned long&&, long&&, unsigned char const*&&) (__functor=..., __args#0=@0x7fff2b70bd40: 0x558c6a830b00, __args#1=@0x7fff2b70bd38: 93824992247828, __args#2=@0x7fff2b70bd30: 4, __args#3=@0x7fff2b70bd28: 0x558c6af8a610 "") at /usr/include/c++/11/bits/std_function.h:290
 #8  0x0000558c6830a96e in std::function<void (inferior*, unsigned long, long, unsigned char const*)>::operator()(inferior*, unsigned long, long, unsigned char const*) const (this=0x558c6a8ecd98, __args#0=0x558c6a830b00, __args#1=93824992247828, __args#2=4, __args#3=0x558c6af8a610 "") at /usr/include/c++/11/bits/std_function.h:590
 #9  0x0000558c6830a620 in gdb::observers::observable<inferior*, unsigned long, long, unsigned char const*>::notify (this=0x558c690828c0 <gdb::observers::memory_changed>, args#0=0x558c6a830b00, args#1=93824992247828, args#2=4, args#3=0x558c6af8a610 "") at ../../src/gdb/../gdbsupport/observable.h:166
 #10 0x0000558c68309d95 in write_memory_with_notification (memaddr=0x555555558014, myaddr=0x558c6af8a610 "", len=4) at ../../src/gdb/corefile.c:363
 #11 0x0000558c68904224 in value_assign (toval=0x558c6afce910, fromval=0x558c6afba6c0) at ../../src/gdb/valops.c:1190
 #12 0x0000558c681e3869 in expr::assign_operation::evaluate (this=0x558c6af8e150, expect_type=0x0, exp=0x558c6afcfe60, noside=EVAL_NORMAL) at ../../src/gdb/expop.h:1902
 #13 0x0000558c68450c89 in expr::logical_or_operation::evaluate (this=0x558c6afab060, expect_type=0x0, exp=0x558c6afcfe60, noside=EVAL_NORMAL) at ../../src/gdb/eval.c:2330
 #14 0x0000558c6844a896 in expression::evaluate (this=0x558c6afcfe60, expect_type=0x0, noside=EVAL_NORMAL) at ../../src/gdb/eval.c:110
 #15 0x0000558c6844a95e in evaluate_expression (exp=0x558c6afcfe60, expect_type=0x0) at ../../src/gdb/eval.c:124
 #16 0x0000558c682061ef in breakpoint_cond_eval (exp=0x558c6afcfe60) at ../../src/gdb/breakpoint.c:4971
 ...


The fix is to disable cooperative SIGINT handling while handling
inferior events, so that SIGINT is saved in the global quit flag, and
not in the extension language, while handling an event.

This commit augments the testcase added by the previous commit to test
this scenario as well.

Approved-By: Tom Tromey <tom@tromey.com>
Change-Id: Idf8ab815774ee6f4b45ca2d0caaf30c9b9f127bb
tromey pushed a commit that referenced this issue Feb 21, 2023
…l/kernel mode addresses

At the moment GDB only handles pointer authentication (pauth) for userspace
addresses and if we're debugging a Linux-hosted program.

The Linux Kernel can be configured to use pauth instructions for some
additional security hardening, but GDB doesn't handle this well.

To overcome this limitation, GDB needs a couple things:

1 - The target needs to advertise pauth support.
2 - The hook to remove non-address bits from a pointer needs to be registered
    in aarch64-tdep.c as opposed to aarch64-linux-tdep.c.

There is a patch for QEMU that addresses the first point, and it makes
QEMU's gdbstub expose a couple more pauth mask registers, so overall we will
have up to 4 pauth masks (2 masks or 4 masks):

pauth_dmask
pauth_cmask
pauth_dmask_high
pauth_cmask_high

pauth_dmask and pauth_cmask are the masks used to remove pauth signatures
from userspace addresses. pauth_dmask_high and pauth_cmask_high masks are used
to remove pauth signatures from kernel addresses.

The second point is easily addressed by moving code around.

When debugging a Linux Kernel built with pauth with an unpatched GDB, we get
the following backtrace:

 #0  __fput (file=0xffff0000c17a6400) at /repos/linux/fs/file_table.c:296
 #1  0xffff8000082bd1f0 in ____fput (work=<optimized out>) at /repos/linux/fs/file_table.c:348
 #2  0x30008000080ade30 [PAC] in ?? ()
 #3  0x30d48000080ade30 in ?? ()
 Backtrace stopped: previous frame identical to this frame (corrupt stack?)

With a patched GDB, we get something a lot more meaningful:

 #0  __fput (file=0xffff0000c1bcfa00) at /repos/linux/fs/file_table.c:296
 #1  0xffff8000082bd1f0 in ____fput (work=<optimized out>) at /repos/linux/fs/file_table.c:348
 #2  0xffff8000080ade30 [PAC] in task_work_run () at /repos/linux/kernel/task_work.c:179
 #3  0xffff80000801db90 [PAC] in resume_user_mode_work (regs=0xffff80000a96beb0) at /repos/linux/include/linux/resume_user_mode.h:49
 #4  do_notify_resume (regs=regs@entry=0xffff80000a96beb0, thread_flags=4) at /repos/linux/arch/arm64/kernel/signal.c:1127
 #5  0xffff800008fb9974 [PAC] in prepare_exit_to_user_mode (regs=0xffff80000a96beb0) at /repos/linux/arch/arm64/kernel/entry-common.c:137
 #6  exit_to_user_mode (regs=0xffff80000a96beb0) at /repos/linux/arch/arm64/kernel/entry-common.c:142
 #7  el0_svc (regs=0xffff80000a96beb0) at /repos/linux/arch/arm64/kernel/entry-common.c:638
 #8  0xffff800008fb9d34 [PAC] in el0t_64_sync_handler (regs=<optimized out>) at /repos/linux/arch/arm64/kernel/entry-common.c:655
 #9  0xffff800008011548 [PAC] in el0t_64_sync () at /repos/linux/arch/arm64/kernel/entry.S:586
 Backtrace stopped: Cannot access memory at address 0xffff80000a96c0c8
tromey pushed a commit that referenced this issue Mar 20, 2023
In some cases GDB will fail when attempting to complete a command that
involves a rust symbol, the failure can manifest as a crash.

The problem is caused by the completion_match_for_lcd object being
left containing invalid data during calls to cp_symbol_name_matches_1.

The first question to address is why we are calling a C++ support
function when handling a rust symbol.  That's due to GDB's auto
language detection for msymbols, in some cases GDB can't tell if a
symbol is a rust symbol, or a C++ symbol.

The test application contains symbols for functions which are
statically linked in from various rust support libraries.  There's no
DWARF for these symbols, so all GDB has is the msymbols built from the
ELF symbol table.

Here's the problematic symbol that leads to our crash:

    mangled: _ZN4core3str21_$LT$impl$u20$str$GT$5parse17h5111d2d6a50d22bdE
  demangled: core::str::<impl str>::parse

As an msymbol this is initially created with language auto, then GDB
eventually calls symbol_find_demangled_name, which loops over all
languages calling language_defn::sniff_from_mangled_name, the first
language that can demangle the symbol gets assigned as the language
for that symbol.

Unfortunately, there's overlap in the mangled symbol names,
some (legacy) rust symbols can be demangled as both rust and C++, see
cplus_demangle in libiberty/cplus-dem.c where this is mentioned.

And so, because we check the C++ language before we check for rust,
then the msymbol is (incorrectly) given the C++ language.

Now it's true that is some cases we might be able to figure out that a
demangled symbol is not actually a valid C++ symbol, for example, in
our case, the construct '::<impl str>::' is not, I believe, valid in a
C++ symbol, we could look for ':<' and '>:' and refuse to accept this
as a C++ symbol.

However, I'm not sure it is always possible to tell that a demangled
symbol is rust or C++, so, I think, we have to accept that some times
we will get this language detection wrong.

If we accept that we can't fix the symbol language detection 100% of
the time, then we should make sure that GDB doesn't crash when it gets
the language wrong, that is what this commit addresses.

In our test case the user tries to complete a symbol name like this:

  (gdb) complete break pars

This results in GDB trying to find all symbols that match 'pars',
eventually we consider our problematic symbol, and we end up with a
call stack that looks like this:

  #0  0x0000000000f3c6bd in strncmp_iw_with_mode
  #1  0x0000000000706d8d in cp_symbol_name_matches_1
  #2  0x0000000000706fa4 in cp_symbol_name_matches
  #3  0x0000000000df3c45 in compare_symbol_name
  #4  0x0000000000df3c91 in completion_list_add_name
  #5  0x0000000000df3f1d in completion_list_add_msymbol
  #6  0x0000000000df4c94 in default_collect_symbol_completion_matches_break_on
  #7  0x0000000000658c08 in language_defn::collect_symbol_completion_matches
  #8  0x0000000000df54c9 in collect_symbol_completion_matches
  #9  0x00000000009d98fb in linespec_complete_function
  #10 0x00000000009d99f0 in complete_linespec_component
  #11 0x00000000009da200 in linespec_complete
  #12 0x00000000006e4132 in complete_address_and_linespec_locations
  #13 0x00000000006e4ac3 in location_completer

In cp_symbol_name_matches_1 we enter a loop, this loop repeatedly
tries to match the demangled problematic symbol name against the user
supplied text ('pars').  Each time around the loop another component
of the symbol name is stripped off, thus, we check 'pars' against
these options:

  core::str::<impl str>::parse
  str::<impl str>::parse
  <impl str>::parse
  parse

As soon as we get a match the cp_symbol_name_matches_1 exits its loop
and returns.  In our case, when we're looking for 'pars', the match
occurs on the last iteration of the loop, when we are comparing to
'parse'.

Now the problem here is that cp_symbol_name_matches_1 uses the
strncmp_iw_with_mode, and inside strncmp_iw_with_mode we allow for
skipping over template parameters.  This allows GDB to match the
symbol name 'foo<int>(int,int)' if the user supplies 'foo(int,'.
Inside strncmp_iw_with_mode GDB will record any template arguments
that it has skipped over inside the completion_match_for_lcd object
that is passed in as an argument.

And so, when GDB tries to match against '<impl str>::parse', the first
thing it sees is '<impl str>', GDB assumes this is a template argument
and records this as a skipped region within the
completion_match_for_lcd object.  After '<impl str>' GDB sees a ':'
character, which doesn't match with the 'pars' the user supplied, so
strncmp_iw_with_mode returns a value indicating a non-match.  GDB then
removes the '<impl str>' component from the symbol name and tries
again, this time comparing to 'parse', which does match.

Having found a match, then in cp_symbol_name_matches_1 we record the
match string, and the full symbol name within the
completion_match_result object, and return.

The problem here is that the skipped region, the '<impl str>' that we
recorded in the penultimate loop iteration was never discarded, its
still there in our returned result.

If we look at what the pointers held in the completion_match_result
that cp_symbol_name_matches_1 returns, this is what we see:

  core::str::<impl str>::parse
  |          \________/  |
  |               |      '--- completion match string
  |               '---skip range
  '--- full symbol name

When GDB calls completion_match_for_lcd::finish, GDB tries to create a
string using the completion match string (parse), but excluding the
skip range, as the stored skip range is before the start of the
completion match string, then GDB tries to do some weird string
creation, which will cause GDB to crash.

The reason we don't often see this problem in C++ is that for C++
symbols there is always some non-template text before the template
argument.  This non-template text means GDB is likely to either match
the symbol, or reject the symbol without storing a skip range.

However, notice, I did say, we don't often see this problem.  Once I
understood the issue, I was able to reproduce the crash using a pure
C++ example:

  template<typename S>
  struct foo
  {
    template<typename T>
    foo (int p1, T a)
    {
      s = 0;
    }

    S s;
  };

  int
  main ()
  {
    foo<int> obj (2.3, 0);
    return 0;
  }

Then in GDB:

  (gdb) complete break foo(int

The problem here is that the C++ symbol for the constructor looks like
this:

  foo<int>::foo<double>(int, double)

When GDB enters cp_symbol_name_matches_1 the symbols it examines are:

  foo<int>::foo<double>(int, double)
  foo<double>(int, double)

The first iteration of the loop will match the 'foo', then add the
'<int>' template argument will be added as a skip range.  When GDB
find the ':' after the '<int>' the first iteration of the loop fails
to match, GDB removes the 'foo<int>::' component, and starts the
second iteration of the loop.

Again, GDB matches the 'foo', and now adds '<double>' as a skip
region.  After that the '(int' successfully matches, and so the second
iteration of the loop succeeds, but, once again we left the '<int>' in
place as a skip region, even though this occurs before the start of
our match string, and this will cause GDB to crash.

This problem was reported to the mailing list, and a solution
discussed in this thread:

  https://sourceware.org/pipermail/gdb-patches/2023-January/195166.html

The solution proposed here is similar to one proposed by the original
bug reported, but implemented in a different location within GDB.
Instead of placing the fix in strncmp_iw_with_mode, I place the fix in
cp_symbol_name_matches_1.  I believe this is a better location as it
is this function that implements the loop, and it is this loop, which
repeatedly calls strncmp_iw_with_mode, that should be resetting the
result object state (I believe).

What I have done is add an assert to strncmp_iw_with_mode that the
incoming result object is empty.

I've also added some other asserts in related code, in
completion_match_for_lcd::mark_ignored_range, I make some basic
assertions about the incoming range pointers, and in
completion_match_for_lcd::finish I also make some assertions about how
the skip ranges relate to the match pointer.

There's two new tests.  The original rust example that was used in the
initial bug report, and a C++ test.  The rust example depends on which
symbols are pulled in from the rust libraries, so it is possible that,
at some future date, the problematic symbol will disappear from this
test program.  The C++ test should be more reliable, as this only
depends on symbols from within the C++ source code.

Since I originally posted this patch to the mailing list, the
following patch has been merged:

  commit 6e7eef7
  Date:   Sun Mar 19 09:13:10 2023 -0600

      Use rust_demangle to fix a crash

This solves the problem of a rust symbol ending up in the C++ specific
code by changing the order languages are sorted.  However, this new
commit doesn't address the issue in the C++ code which was fixed with
this commit.

Given that the C++ issue is real, and has a reproducer, I'm still
going to merge this fix.  I've left the discussion of rust in this
commit message as I originally wrote it, but it should be read within
the context of GDB prior to commit 6e7eef7.

Co-Authored-By:  Zheng Zhan <zzlossdev@163.com>
tromey pushed a commit that referenced this issue May 10, 2023
Commit 7a8de0c ("Remove ALL_BREAKPOINTS_SAFE") introduced a
use-after-free in the breakpoints iterations (see below for full ASan
report).  This makes gdb.base/stale-infcall.exp fail when GDB is build
with ASan.

check_longjmp_breakpoint_for_call_dummy iterates on all breakpoints,
possibly deleting the current breakpoint as well as related breakpoints.
The problem arises when a breakpoint in the B->related_breakpoint chain
is also B->next.  In that case, deleting that related breakpoint frees
the breakpoint that all_breakpoints_safe has saved.

The old code worked around that by manually changing B_TMP, which was
the next breakpoint saved by the "safe iterator":

	while (b->related_breakpoint != b)
	  {
	    if (b_tmp == b->related_breakpoint)
	      b_tmp = b->related_breakpoint->next;
	    delete_breakpoint (b->related_breakpoint);
	  }

(Note that this seemed to assume that b->related_breakpoint->next was
the same as b->next->next, not sure this is guaranteed.)

The new code kept the B_TMP variable, but it's not useful in that
context.  We can't go change the next breakpoint as saved by the safe
iterator, like we did before.  I suggest fixing that by saving the
breakpoints to delete in a map and deleting them all at the end.

Here's the full ASan report:

    (gdb) PASS: gdb.base/stale-infcall.exp: continue to breakpoint: break-run1
    print infcall ()
    =================================================================
    ==47472==ERROR: AddressSanitizer: heap-use-after-free on address 0x611000034980 at pc 0x563f7012c7bc bp 0x7ffdf3804d70 sp 0x7ffdf3804d60
    READ of size 8 at 0x611000034980 thread T0
        #0 0x563f7012c7bb in next_iterator<breakpoint>::operator++() /home/smarchi/src/binutils-gdb/gdb/../gdbsupport/next-iterator.h:66
        #1 0x563f702ce8c0 in basic_safe_iterator<next_iterator<breakpoint> >::operator++() /home/smarchi/src/binutils-gdb/gdb/../gdbsupport/safe-iterator.h:84
        #2 0x563f7021522a in check_longjmp_breakpoint_for_call_dummy(thread_info*) /home/smarchi/src/binutils-gdb/gdb/breakpoint.c:7611
        #3 0x563f714567b1 in process_event_stop_test /home/smarchi/src/binutils-gdb/gdb/infrun.c:6881
        #4 0x563f71454e07 in handle_signal_stop /home/smarchi/src/binutils-gdb/gdb/infrun.c:6769
        #5 0x563f7144b680 in handle_inferior_event /home/smarchi/src/binutils-gdb/gdb/infrun.c:6023
        #6 0x563f71436165 in fetch_inferior_event() /home/smarchi/src/binutils-gdb/gdb/infrun.c:4387
        #7 0x563f7136ff51 in inferior_event_handler(inferior_event_type) /home/smarchi/src/binutils-gdb/gdb/inf-loop.c:42
        #8 0x563f7168038d in handle_target_event /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:4219
        #9 0x563f72fccb6d in handle_file_event /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:573
        #10 0x563f72fcd503 in gdb_wait_for_event /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:694
        #11 0x563f72fcaf2b in gdb_do_one_event(int) /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:217
        #12 0x563f7262b9bb in wait_sync_command_done() /home/smarchi/src/binutils-gdb/gdb/top.c:426
        #13 0x563f7137a7c3 in run_inferior_call /home/smarchi/src/binutils-gdb/gdb/infcall.c:650
        #14 0x563f71381295 in call_function_by_hand_dummy(value*, type*, gdb::array_view<value*>, void (*)(void*, int), void*) /home/smarchi/src/binutils-gdb/gdb/infcall.c:1332
        #15 0x563f7137c0e2 in call_function_by_hand(value*, type*, gdb::array_view<value*>) /home/smarchi/src/binutils-gdb/gdb/infcall.c:780
        #16 0x563f70fe5960 in evaluate_subexp_do_call(expression*, noside, value*, gdb::array_view<value*>, char const*, type*) /home/smarchi/src/binutils-gdb/gdb/eval.c:649
        #17 0x563f70fe6617 in expr::operation::evaluate_funcall(type*, expression*, noside, char const*, std::__debug::vector<std::unique_ptr<expr::operation, std::default_delete<expr::operation> >, std::allocator<std::unique_ptr<expr::operation, std::default_delete<expr::operation> > > > const&) /home/smarchi/src/binutils-gdb/gdb/eval.c:677
        #18 0x563f6fd19668 in expr::operation::evaluate_funcall(type*, expression*, noside, std::__debug::vector<std::unique_ptr<expr::operation, std::default_delete<expr::operation> >, std::allocator<std::unique_ptr<expr::operation, std::default_delete<expr::operation> > > > const&) /home/smarchi/src/binutils-gdb/gdb/expression.h:136
        #19 0x563f70fe6bba in expr::var_value_operation::evaluate_funcall(type*, expression*, noside, std::__debug::vector<std::unique_ptr<expr::operation, std::default_delete<expr::operation> >, std::allocator<std::unique_ptr<expr::operation, std::default_delete<expr::operation> > > > const&) /home/smarchi/src/binutils-gdb/gdb/eval.c:689
        #20 0x563f704b71dc in expr::funcall_operation::evaluate(type*, expression*, noside) /home/smarchi/src/binutils-gdb/gdb/expop.h:2219
        #21 0x563f70fe0f02 in expression::evaluate(type*, noside) /home/smarchi/src/binutils-gdb/gdb/eval.c:110
        #22 0x563f71b1373e in process_print_command_args /home/smarchi/src/binutils-gdb/gdb/printcmd.c:1319
        #23 0x563f71b1391b in print_command_1 /home/smarchi/src/binutils-gdb/gdb/printcmd.c:1332
        #24 0x563f71b147ec in print_command /home/smarchi/src/binutils-gdb/gdb/printcmd.c:1465
        #25 0x563f706029b8 in do_simple_func /home/smarchi/src/binutils-gdb/gdb/cli/cli-decode.c:95
        #26 0x563f7061972a in cmd_func(cmd_list_element*, char const*, int) /home/smarchi/src/binutils-gdb/gdb/cli/cli-decode.c:2735
        #27 0x563f7262d0ef in execute_command(char const*, int) /home/smarchi/src/binutils-gdb/gdb/top.c:572
        #28 0x563f7100ed9c in command_handler(char const*) /home/smarchi/src/binutils-gdb/gdb/event-top.c:543
        #29 0x563f7101014b in command_line_handler(std::unique_ptr<char, gdb::xfree_deleter<char> >&&) /home/smarchi/src/binutils-gdb/gdb/event-top.c:779
        #30 0x563f72777942 in tui_command_line_handler /home/smarchi/src/binutils-gdb/gdb/tui/tui-interp.c:104
        #31 0x563f7100d059 in gdb_rl_callback_handler /home/smarchi/src/binutils-gdb/gdb/event-top.c:250
        #32 0x7f5a80418246 in rl_callback_read_char (/usr/lib/libreadline.so.8+0x3b246) (BuildId: 092e91fc4361b0ef94561e3ae03a75f69398acbb)
        #33 0x563f7100ca06 in gdb_rl_callback_read_char_wrapper_noexcept /home/smarchi/src/binutils-gdb/gdb/event-top.c:192
        #34 0x563f7100cc5e in gdb_rl_callback_read_char_wrapper /home/smarchi/src/binutils-gdb/gdb/event-top.c:225
        #35 0x563f728c70db in stdin_event_handler /home/smarchi/src/binutils-gdb/gdb/ui.c:155
        #36 0x563f72fccb6d in handle_file_event /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:573
        #37 0x563f72fcd503 in gdb_wait_for_event /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:694
        #38 0x563f72fcb15c in gdb_do_one_event(int) /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:264
        #39 0x563f7177ec1c in start_event_loop /home/smarchi/src/binutils-gdb/gdb/main.c:412
        #40 0x563f7177f12e in captured_command_loop /home/smarchi/src/binutils-gdb/gdb/main.c:476
        #41 0x563f717846e4 in captured_main /home/smarchi/src/binutils-gdb/gdb/main.c:1320
        #42 0x563f71784821 in gdb_main(captured_main_args*) /home/smarchi/src/binutils-gdb/gdb/main.c:1339
        #43 0x563f6fcedfbd in main /home/smarchi/src/binutils-gdb/gdb/gdb.c:32
        #44 0x7f5a7e43984f  (/usr/lib/libc.so.6+0x2384f) (BuildId: 2f005a79cd1a8e385972f5a102f16adba414d75e)
        #45 0x7f5a7e439909 in __libc_start_main (/usr/lib/libc.so.6+0x23909) (BuildId: 2f005a79cd1a8e385972f5a102f16adba414d75e)
        #46 0x563f6fcedd84 in _start (/home/smarchi/build/binutils-gdb/gdb/gdb+0xafb0d84) (BuildId: 50bd32e6e9d5e84543e9897b8faca34858ca3995)

    0x611000034980 is located 0 bytes inside of 208-byte region [0x611000034980,0x611000034a50)
    freed by thread T0 here:
        #0 0x7f5a7fce312a in operator delete(void*, unsigned long) /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_new_delete.cpp:164
        #1 0x563f702bd1fa in momentary_breakpoint::~momentary_breakpoint() /home/smarchi/src/binutils-gdb/gdb/breakpoint.c:304
        #2 0x563f702771c5 in delete_breakpoint(breakpoint*) /home/smarchi/src/binutils-gdb/gdb/breakpoint.c:12404
        #3 0x563f702150a7 in check_longjmp_breakpoint_for_call_dummy(thread_info*) /home/smarchi/src/binutils-gdb/gdb/breakpoint.c:7673
        #4 0x563f714567b1 in process_event_stop_test /home/smarchi/src/binutils-gdb/gdb/infrun.c:6881
        #5 0x563f71454e07 in handle_signal_stop /home/smarchi/src/binutils-gdb/gdb/infrun.c:6769
        #6 0x563f7144b680 in handle_inferior_event /home/smarchi/src/binutils-gdb/gdb/infrun.c:6023
        #7 0x563f71436165 in fetch_inferior_event() /home/smarchi/src/binutils-gdb/gdb/infrun.c:4387
        #8 0x563f7136ff51 in inferior_event_handler(inferior_event_type) /home/smarchi/src/binutils-gdb/gdb/inf-loop.c:42
        #9 0x563f7168038d in handle_target_event /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:4219
        #10 0x563f72fccb6d in handle_file_event /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:573
        #11 0x563f72fcd503 in gdb_wait_for_event /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:694
        #12 0x563f72fcaf2b in gdb_do_one_event(int) /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:217
        #13 0x563f7262b9bb in wait_sync_command_done() /home/smarchi/src/binutils-gdb/gdb/top.c:426
        #14 0x563f7137a7c3 in run_inferior_call /home/smarchi/src/binutils-gdb/gdb/infcall.c:650
        #15 0x563f71381295 in call_function_by_hand_dummy(value*, type*, gdb::array_view<value*>, void (*)(void*, int), void*) /home/smarchi/src/binutils-gdb/gdb/infcall.c:1332
        #16 0x563f7137c0e2 in call_function_by_hand(value*, type*, gdb::array_view<value*>) /home/smarchi/src/binutils-gdb/gdb/infcall.c:780
        #17 0x563f70fe5960 in evaluate_subexp_do_call(expression*, noside, value*, gdb::array_view<value*>, char const*, type*) /home/smarchi/src/binutils-gdb/gdb/eval.c:649
        #18 0x563f70fe6617 in expr::operation::evaluate_funcall(type*, expression*, noside, char const*, std::__debug::vector<std::unique_ptr<expr::operation, std::default_delete<expr::operation> >, std::allocator<std::unique_ptr<expr::operation, std::default_delete<expr::operation> > > > const&) /home/smarchi/src/binutils-gdb/gdb/eval.c:677
        #19 0x563f6fd19668 in expr::operation::evaluate_funcall(type*, expression*, noside, std::__debug::vector<std::unique_ptr<expr::operation, std::default_delete<expr::operation> >, std::allocator<std::unique_ptr<expr::operation, std::default_delete<expr::operation> > > > const&) /home/smarchi/src/binutils-gdb/gdb/expression.h:136
        #20 0x563f70fe6bba in expr::var_value_operation::evaluate_funcall(type*, expression*, noside, std::__debug::vector<std::unique_ptr<expr::operation, std::default_delete<expr::operation> >, std::allocator<std::unique_ptr<expr::operation, std::default_delete<expr::operation> > > > const&) /home/smarchi/src/binutils-gdb/gdb/eval.c:689
        #21 0x563f704b71dc in expr::funcall_operation::evaluate(type*, expression*, noside) /home/smarchi/src/binutils-gdb/gdb/expop.h:2219
        #22 0x563f70fe0f02 in expression::evaluate(type*, noside) /home/smarchi/src/binutils-gdb/gdb/eval.c:110
        #23 0x563f71b1373e in process_print_command_args /home/smarchi/src/binutils-gdb/gdb/printcmd.c:1319
        #24 0x563f71b1391b in print_command_1 /home/smarchi/src/binutils-gdb/gdb/printcmd.c:1332
        #25 0x563f71b147ec in print_command /home/smarchi/src/binutils-gdb/gdb/printcmd.c:1465
        #26 0x563f706029b8 in do_simple_func /home/smarchi/src/binutils-gdb/gdb/cli/cli-decode.c:95
        #27 0x563f7061972a in cmd_func(cmd_list_element*, char const*, int) /home/smarchi/src/binutils-gdb/gdb/cli/cli-decode.c:2735
        #28 0x563f7262d0ef in execute_command(char const*, int) /home/smarchi/src/binutils-gdb/gdb/top.c:572
        #29 0x563f7100ed9c in command_handler(char const*) /home/smarchi/src/binutils-gdb/gdb/event-top.c:543

    previously allocated by thread T0 here:
        #0 0x7f5a7fce2012 in operator new(unsigned long) /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_new_delete.cpp:95
        #1 0x563f7029a9a3 in new_momentary_breakpoint<program_space*&, frame_id&, int&> /home/smarchi/src/binutils-gdb/gdb/breakpoint.c:8129
        #2 0x563f702212f6 in momentary_breakpoint_from_master /home/smarchi/src/binutils-gdb/gdb/breakpoint.c:8169
        #3 0x563f70212db1 in set_longjmp_breakpoint_for_call_dummy() /home/smarchi/src/binutils-gdb/gdb/breakpoint.c:7582
        #4 0x563f713804db in call_function_by_hand_dummy(value*, type*, gdb::array_view<value*>, void (*)(void*, int), void*) /home/smarchi/src/binutils-gdb/gdb/infcall.c:1260
        #5 0x563f7137c0e2 in call_function_by_hand(value*, type*, gdb::array_view<value*>) /home/smarchi/src/binutils-gdb/gdb/infcall.c:780
        #6 0x563f70fe5960 in evaluate_subexp_do_call(expression*, noside, value*, gdb::array_view<value*>, char const*, type*) /home/smarchi/src/binutils-gdb/gdb/eval.c:649
        #7 0x563f70fe6617 in expr::operation::evaluate_funcall(type*, expression*, noside, char const*, std::__debug::vector<std::unique_ptr<expr::operation, std::default_delete<expr::operation> >, std::allocator<std::unique_ptr<expr::operation, std::default_delete<expr::operation> > > > const&) /home/smarchi/src/binutils-gdb/gdb/eval.c:677
        #8 0x563f6fd19668 in expr::operation::evaluate_funcall(type*, expression*, noside, std::__debug::vector<std::unique_ptr<expr::operation, std::default_delete<expr::operation> >, std::allocator<std::unique_ptr<expr::operation, std::default_delete<expr::operation> > > > const&) /home/smarchi/src/binutils-gdb/gdb/expression.h:136
        #9 0x563f70fe6bba in expr::var_value_operation::evaluate_funcall(type*, expression*, noside, std::__debug::vector<std::unique_ptr<expr::operation, std::default_delete<expr::operation> >, std::allocator<std::unique_ptr<expr::operation, std::default_delete<expr::operation> > > > const&) /home/smarchi/src/binutils-gdb/gdb/eval.c:689
        #10 0x563f704b71dc in expr::funcall_operation::evaluate(type*, expression*, noside) /home/smarchi/src/binutils-gdb/gdb/expop.h:2219
        #11 0x563f70fe0f02 in expression::evaluate(type*, noside) /home/smarchi/src/binutils-gdb/gdb/eval.c:110
        #12 0x563f71b1373e in process_print_command_args /home/smarchi/src/binutils-gdb/gdb/printcmd.c:1319
        #13 0x563f71b1391b in print_command_1 /home/smarchi/src/binutils-gdb/gdb/printcmd.c:1332
        #14 0x563f71b147ec in print_command /home/smarchi/src/binutils-gdb/gdb/printcmd.c:1465
        #15 0x563f706029b8 in do_simple_func /home/smarchi/src/binutils-gdb/gdb/cli/cli-decode.c:95
        #16 0x563f7061972a in cmd_func(cmd_list_element*, char const*, int) /home/smarchi/src/binutils-gdb/gdb/cli/cli-decode.c:2735
        #17 0x563f7262d0ef in execute_command(char const*, int) /home/smarchi/src/binutils-gdb/gdb/top.c:572
        #18 0x563f7100ed9c in command_handler(char const*) /home/smarchi/src/binutils-gdb/gdb/event-top.c:543
        #19 0x563f7101014b in command_line_handler(std::unique_ptr<char, gdb::xfree_deleter<char> >&&) /home/smarchi/src/binutils-gdb/gdb/event-top.c:779
        #20 0x563f72777942 in tui_command_line_handler /home/smarchi/src/binutils-gdb/gdb/tui/tui-interp.c:104
        #21 0x563f7100d059 in gdb_rl_callback_handler /home/smarchi/src/binutils-gdb/gdb/event-top.c:250
        #22 0x7f5a80418246 in rl_callback_read_char (/usr/lib/libreadline.so.8+0x3b246) (BuildId: 092e91fc4361b0ef94561e3ae03a75f69398acbb)

Change-Id: Id00c17ab677f847fbf4efdf0f4038373668d3d88
Approved-By: Tom Tromey <tom@tromey.com>
tromey pushed a commit that referenced this issue Jun 9, 2023
After this commit:

  commit baab375
  Date:   Tue Jul 13 14:44:27 2021 -0400

      gdb: building inferior strings from within GDB

It was pointed out that a new ASan failure had been introduced which
was triggered by gdb.base/internal-string-values.exp:

  (gdb) PASS: gdb.base/internal-string-values.exp: test_setting: all langs: lang=ada: ptype "foo"
  print $_gdb_maint_setting("test-settings string")
  =================================================================
  ==80377==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000068034 at pc 0x564785cba682 bp 0x7ffd20644620 sp 0x7ffd20644610
  READ of size 1 at 0x603000068034 thread T0
      #0 0x564785cba681 in find_command_name_length(char const*) /tmp/src/binutils-gdb/gdb/cli/cli-decode.c:2129
      #1 0x564785cbacb2 in lookup_cmd_1(char const**, cmd_list_element*, cmd_list_element**, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, int, bool) /tmp/src/binutils-gdb/gdb/cli/cli-decode.c:2186
      #2 0x564785cbb539 in lookup_cmd_1(char const**, cmd_list_element*, cmd_list_element**, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, int, bool) /tmp/src/binutils-gdb/gdb/cli/cli-decode.c:2248
      #3 0x564785cbbcf3 in lookup_cmd(char const**, cmd_list_element*, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, int, int) /tmp/src/binutils-gdb/gdb/cli/cli-decode.c:2339
      #4 0x564785c82df2 in setting_cmd /tmp/src/binutils-gdb/gdb/cli/cli-cmds.c:2219
      #5 0x564785c84274 in gdb_maint_setting_internal_fn /tmp/src/binutils-gdb/gdb/cli/cli-cmds.c:2348
      #6 0x564788167b3b in call_internal_function(gdbarch*, language_defn const*, value*, int, value**) /tmp/src/binutils-gdb/gdb/value.c:2321
      #7 0x5647854b6ebd in expr::ada_funcall_operation::evaluate(type*, expression*, noside) /tmp/src/binutils-gdb/gdb/ada-lang.c:11254
      #8 0x564786658266 in expression::evaluate(type*, noside) /tmp/src/binutils-gdb/gdb/eval.c:111
      #9 0x5647871242d6 in process_print_command_args /tmp/src/binutils-gdb/gdb/printcmd.c:1322
      #10 0x5647871244b3 in print_command_1 /tmp/src/binutils-gdb/gdb/printcmd.c:1335
      #11 0x564787125384 in print_command /tmp/src/binutils-gdb/gdb/printcmd.c:1468
      #12 0x564785caac44 in do_simple_func /tmp/src/binutils-gdb/gdb/cli/cli-decode.c:95
      #13 0x564785cc18f0 in cmd_func(cmd_list_element*, char const*, int) /tmp/src/binutils-gdb/gdb/cli/cli-decode.c:2735
      #14 0x564787c70c68 in execute_command(char const*, int) /tmp/src/binutils-gdb/gdb/top.c:574
      #15 0x564786686180 in command_handler(char const*) /tmp/src/binutils-gdb/gdb/event-top.c:543
      #16 0x56478668752f in command_line_handler(std::unique_ptr<char, gdb::xfree_deleter<char> >&&) /tmp/src/binutils-gdb/gdb/event-top.c:779
      #17 0x564787dcb29a in tui_command_line_handler /tmp/src/binutils-gdb/gdb/tui/tui-interp.c:104
      #18 0x56478668443d in gdb_rl_callback_handler /tmp/src/binutils-gdb/gdb/event-top.c:250
      #19 0x7f4efd506246 in rl_callback_read_char (/usr/lib/libreadline.so.8+0x3b246) (BuildId: 092e91fc4361b0ef94561e3ae03a75f69398acbb)
      #20 0x564786683dea in gdb_rl_callback_read_char_wrapper_noexcept /tmp/src/binutils-gdb/gdb/event-top.c:192
      #21 0x564786684042 in gdb_rl_callback_read_char_wrapper /tmp/src/binutils-gdb/gdb/event-top.c:225
      #22 0x564787f1b119 in stdin_event_handler /tmp/src/binutils-gdb/gdb/ui.c:155
      #23 0x56478862438d in handle_file_event /tmp/src/binutils-gdb/gdbsupport/event-loop.cc:573
      #24 0x564788624d23 in gdb_wait_for_event /tmp/src/binutils-gdb/gdbsupport/event-loop.cc:694
      #25 0x56478862297c in gdb_do_one_event(int) /tmp/src/binutils-gdb/gdbsupport/event-loop.cc:264
      #26 0x564786df99f0 in start_event_loop /tmp/src/binutils-gdb/gdb/main.c:412
      #27 0x564786dfa069 in captured_command_loop /tmp/src/binutils-gdb/gdb/main.c:476
      #28 0x564786dff61f in captured_main /tmp/src/binutils-gdb/gdb/main.c:1320
      #29 0x564786dff75c in gdb_main(captured_main_args*) /tmp/src/binutils-gdb/gdb/main.c:1339
      #30 0x564785381b6d in main /tmp/src/binutils-gdb/gdb/gdb.c:32
      #31 0x7f4efbc3984f  (/usr/lib/libc.so.6+0x2384f) (BuildId: 2f005a79cd1a8e385972f5a102f16adba414d75e)
      #32 0x7f4efbc39909 in __libc_start_main (/usr/lib/libc.so.6+0x23909) (BuildId: 2f005a79cd1a8e385972f5a102f16adba414d75e)
      #33 0x564785381934 in _start (/tmp/build/binutils-gdb/gdb/gdb+0xabc5934) (BuildId: 90de353ac158646e7dab501b76a18a76628fca33)

  0x603000068034 is located 0 bytes after 20-byte region [0x603000068020,0x603000068034) allocated by thread T0 here:
      #0 0x7f4efcee0cd1 in __interceptor_calloc /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_malloc_linux.cpp:77
      #1 0x5647856265d8 in xcalloc /tmp/src/binutils-gdb/gdb/alloc.c:97
      #2 0x564788610c6b in xzalloc(unsigned long) /tmp/src/binutils-gdb/gdbsupport/common-utils.cc:29
      #3 0x56478815721a in value::allocate_contents(bool) /tmp/src/binutils-gdb/gdb/value.c:929
      #4 0x564788157285 in value::allocate(type*, bool) /tmp/src/binutils-gdb/gdb/value.c:941
      #5 0x56478815733a in value::allocate(type*) /tmp/src/binutils-gdb/gdb/value.c:951
      #6 0x5647854ae81c in expr::ada_string_operation::evaluate(type*, expression*, noside) /tmp/src/binutils-gdb/gdb/ada-lang.c:10675
      #7 0x5647854b63b8 in expr::ada_funcall_operation::evaluate(type*, expression*, noside) /tmp/src/binutils-gdb/gdb/ada-lang.c:11184
      #8 0x564786658266 in expression::evaluate(type*, noside) /tmp/src/binutils-gdb/gdb/eval.c:111
      #9 0x5647871242d6 in process_print_command_args /tmp/src/binutils-gdb/gdb/printcmd.c:1322
      #10 0x5647871244b3 in print_command_1 /tmp/src/binutils-gdb/gdb/printcmd.c:1335
      #11 0x564787125384 in print_command /tmp/src/binutils-gdb/gdb/printcmd.c:1468
      #12 0x564785caac44 in do_simple_func /tmp/src/binutils-gdb/gdb/cli/cli-decode.c:95
      #13 0x564785cc18f0 in cmd_func(cmd_list_element*, char const*, int) /tmp/src/binutils-gdb/gdb/cli/cli-decode.c:2735
      #14 0x564787c70c68 in execute_command(char const*, int) /tmp/src/binutils-gdb/gdb/top.c:574
      #15 0x564786686180 in command_handler(char const*) /tmp/src/binutils-gdb/gdb/event-top.c:543
      #16 0x56478668752f in command_line_handler(std::unique_ptr<char, gdb::xfree_deleter<char> >&&) /tmp/src/binutils-gdb/gdb/event-top.c:779
      #17 0x564787dcb29a in tui_command_line_handler /tmp/src/binutils-gdb/gdb/tui/tui-interp.c:104
      #18 0x56478668443d in gdb_rl_callback_handler /tmp/src/binutils-gdb/gdb/event-top.c:250
      #19 0x7f4efd506246 in rl_callback_read_char (/usr/lib/libreadline.so.8+0x3b246) (BuildId: 092e91fc4361b0ef94561e3ae03a75f69398acbb)

The problem is in cli/cli-cmds.c, in the function setting_cmd, where
we do this:

  const char *a0 = (const char *) argv[0]->contents ().data ();

Here argv[0] is a value* which we know is either a TYPE_CODE_ARRAY or
a TYPE_CODE_STRING.  The problem is that the above line is casting the
value contents directly to a C-string, i.e. one that is assumed to
have a null-terminator at the end.

After the above commit this can no longer be assumed to be true.  A
string value will be represented just as it would be in the current
language, so for Ada and Fortran the string will be an array of
characters with no null-terminator at the end.

My proposed solution is to copy the string contents into a std::string
object, and then use the std::string::c_str() value, this will ensure
that a null-terminator has been added.

I had a check through GDB at places TYPE_CODE_STRING was used and
couldn't see any other obvious places where this type of assumption
was being made, so hopefully this is the only offender.

Running the above test with ASan compiled in no longer gives an error.

Reviewed-By: Tom Tromey <tom@tromey.com>
tromey pushed a commit that referenced this issue Jul 17, 2023
While working on a later patch, which changes gdb.base/foll-vfork.exp,
I noticed that sometimes I would hit this assert:

  x86_linux_update_debug_registers: Assertion `lwp_is_stopped (lwp)' failed.

I eventually tracked it down to a combination of schedule-multiple
mode being on, target-non-stop being off, follow-fork-mode being set
to child, and some bad timing.  The failing case is pretty simple, a
single threaded application performs a vfork, the child process then
execs some other application while the parent process (once the vfork
child has completed its exec) just exits.  As best I understand
things, here's what happens when things go wrong:

  1. The parent process performs a vfork, GDB sees the VFORKED event
  and creates an inferior and thread for the vfork child,

  2. GDB resumes the vfork child process.  As schedule-multiple is on
  and target-non-stop is off, this is translated into a request to
  start all processes (see user_visible_resume_ptid),

  3. In the linux-nat layer we spot that one of the threads we are
  about to start is a vfork parent, and so don't start that
  thread (see resume_lwp), the vfork child thread is resumed,

  4. GDB waits for the next event, eventually entering
  linux_nat_target::wait, which in turn calls linux_nat_wait_1,

  5. In linux_nat_wait_1 we eventually call
  resume_stopped_resumed_lwps, this should restart threads that have
  stopped but don't actually have anything interesting to report.

  6. Unfortunately, resume_stopped_resumed_lwps doesn't check for
  vfork parents like resume_lwp does, so at this point the vfork
  parent is resumed.  This feels like the start of the bug, and this
  is where I'm proposing to fix things, but, resuming the vfork parent
  isn't the worst thing in the world because....

  7. As the vfork child is still alive the kernel holds the vfork
  parent stopped,

  8. Eventually the child performs its exec and GDB is sent and EXECD
  event.  However, because the parent is resumed, as soon as the child
  performs its exec the vfork parent also sends a VFORK_DONE event to
  GDB,

  9. Depending on timing both of these events might seem to arrive in
  GDB at the same time.  Normally GDB expects to see the EXECD or
  EXITED/SIGNALED event from the vfork child before getting the
  VFORK_DONE in the parent.  We know this because it is as a result of
  the EXECD/EXITED/SIGNALED that GDB detaches from the parent (see
  handle_vfork_child_exec_or_exit for details).  Further the comment
  in target/waitstatus.h on TARGET_WAITKIND_VFORK_DONE indicates that
  when we remain attached to the child (not the parent) we should not
  expect to see a VFORK_DONE,

  10. If both events arrive at the same time then GDB will randomly
  choose one event to handle first, in some cases this will be the
  VFORK_DONE.  As described above, upon seeing a VFORK_DONE GDB
  expects that (a) the vfork child has finished, however, in this case
  this is not completely true, the child has finished, but GDB has not
  processed the event associated with the completion yet, and (b) upon
  seeing a VFORK_DONE GDB assumes we are remaining attached to the
  parent, and so resumes the parent process,

  11. GDB now handles the EXECD event.  In our case we are detaching
  from the parent, so GDB calls target_detach (see
  handle_vfork_child_exec_or_exit),

  12. While this has been going on the vfork parent is executing, and
  might even exit,

  13. In linux_nat_target::detach the first thing we do is stop all
  threads in the process we're detaching from, the result of the stop
  request will be cached on the lwp_info object,

  14. In our case the vfork parent has exited though, so when GDB
  waits for the thread, instead of a stop due to signal, we instead
  get a thread exited status,

  15. Later in the detach process we try to resume the threads just
  prior to making the ptrace call to actually detach (see
  detach_one_lwp), as part of the process to resume a thread we try to
  touch some registers within the thread, and before doing this GDB
  asserts that the thread is stopped,

  16. An exited thread is not classified as stopped, and so the assert
  triggers!

So there's two bugs I see here.  The first, and most critical one here
is in step #6.  I think that resume_stopped_resumed_lwps should not
resume a vfork parent, just like resume_lwp doesn't resume a vfork
parent.

With this change in place the vfork parent will remain stopped in step
instead GDB will only see the EXECD/EXITED/SIGNALLED event.  The
problems in #9 and #10 are therefore skipped and we arrive at #11,
handling the EXECD event.  As the parent is still stopped #12 doesn't
apply, and in #13 when we try to stop the process we will see that it
is already stopped, there's no risk of the vfork parent exiting before
we get to this point.  And finally, in #15 we are safe to poke the
process registers because it will not have exited by this point.

However, I did mention two bugs.

The second bug I've not yet managed to actually trigger, but I'm
convinced it must exist: if we forget vforks for a moment, in step #13
above, when linux_nat_target::detach is called, we first try to stop
all threads in the process GDB is detaching from.  If we imagine a
multi-threaded inferior with many threads, and GDB running in non-stop
mode, then, if the user tries to detach there is a chance that thread
could exit just as linux_nat_target::detach is entered, in which case
we should be able to trigger the same assert.

But, like I said, I've not (yet) managed to trigger this second bug,
and even if I could, the fix would not belong in this commit, so I'm
pointing this out just for completeness.

There's no test included in this commit.  In a couple of commits time
I will expand gdb.base/foll-vfork.exp which is when this bug would be
exposed.  Unfortunately there are at least two other bugs (separate
from the ones discussed above) that need fixing first, these will be
fixed in the next commits before the gdb.base/foll-vfork.exp test is
expanded.

If you do want to reproduce this failure then you will for certainly
need to run the gdb.base/foll-vfork.exp test in a loop as the failures
are all very timing sensitive.  I've found that running multiple
copies in parallel makes the failure more likely to appear, I usually
run ~6 copies in parallel and expect to see a failure after within
10mins.
tromey pushed a commit that referenced this issue Aug 4, 2023
With gdb build with -fsanitize=thread and test-case gdb.base/index-cache.exp I
run into:
...
(gdb) file build/gdb/testsuite/outputs/gdb.base/index-cache/index-cache
Reading symbols from build/gdb/testsuite/outputs/gdb.base/index-cache/index-cache...
(gdb) show index-cache enabled
The index cache is off.
(gdb) PASS: gdb.base/index-cache.exp: test_basic_stuff: index-cache is disabled by default
set index-cache enabled on
==================
WARNING: ThreadSanitizer: data race (pid=32248)
  Write of size 1 at 0x00000321f540 by main thread:
    #0 index_cache::enable() gdb/dwarf2/index-cache.c:76 (gdb+0x82cfdd)
    #1 set_index_cache_enabled_command gdb/dwarf2/index-cache.c:270 (gdb+0x82d9af)
    #2 bool setting::set<bool>(bool const&) gdb/command.h:353 (gdb+0x6fe5f2)
    #3 do_set_command(char const*, int, cmd_list_element*) gdb/cli/cli-setshow.c:414 (gdb+0x6fcd21)
    #4 execute_command(char const*, int) gdb/top.c:567 (gdb+0xff2e64)
    #5 command_handler(char const*) gdb/event-top.c:552 (gdb+0x94acc0)
    #6 command_line_handler(std::unique_ptr<char, gdb::xfree_deleter<char> >&&) gdb/event-top.c:788 (gdb+0x94b37d)
    #7 tui_command_line_handler gdb/tui/tui-interp.c:104 (gdb+0x103467e)
    #8 gdb_rl_callback_handler gdb/event-top.c:259 (gdb+0x94a265)
    #9 rl_callback_read_char readline/readline/callback.c:290 (gdb+0x11bdd3f)
    #10 gdb_rl_callback_read_char_wrapper_noexcept gdb/event-top.c:195 (gdb+0x94a064)
    #11 gdb_rl_callback_read_char_wrapper gdb/event-top.c:234 (gdb+0x94a125)
    #12 stdin_event_handler gdb/ui.c:155 (gdb+0x1074922)
    #13 handle_file_event gdbsupport/event-loop.cc:573 (gdb+0x1d94de4)
    #14 gdb_wait_for_event gdbsupport/event-loop.cc:694 (gdb+0x1d9551c)
    #15 gdb_do_one_event(int) gdbsupport/event-loop.cc:264 (gdb+0x1d93908)
    #16 start_event_loop gdb/main.c:412 (gdb+0xb5a256)
    #17 captured_command_loop gdb/main.c:476 (gdb+0xb5a445)
    #18 captured_main gdb/main.c:1320 (gdb+0xb5c5c5)
    #19 gdb_main(captured_main_args*) gdb/main.c:1339 (gdb+0xb5c674)
    #20 main gdb/gdb.c:32 (gdb+0x416776)

  Previous read of size 1 at 0x00000321f540 by thread T12:
    #0 index_cache::enabled() const gdb/dwarf2/index-cache.h:48 (gdb+0x82e1a6)
    #1 index_cache::store(dwarf2_per_bfd*) gdb/dwarf2/index-cache.c:94 (gdb+0x82d0bc)
    #2 cooked_index::maybe_write_index(dwarf2_per_bfd*) gdb/dwarf2/cooked-index.c:638 (gdb+0x7f1b97)
    #3 operator() gdb/dwarf2/cooked-index.c:468 (gdb+0x7f0f24)
    #4 _M_invoke /usr/include/c++/7/bits/std_function.h:316 (gdb+0x7f285b)
    #5 std::function<void ()>::operator()() const /usr/include/c++/7/bits/std_function.h:706 (gdb+0x700952)
    #6 void std::__invoke_impl<void, std::function<void ()>&>(std::__invoke_other, std::function<void ()>&) /usr/include/c++/7/bits/invoke.h:60 (gdb+0x7381a0)
    #7 std::__invoke_result<std::function<void ()>&>::type std::__invoke<std::function<void ()>&>(std::function<void ()>&) /usr/include/c++/7/bits/invoke.h:95 (gdb+0x737e91)
    #8 std::__future_base::_Task_state<std::function<void ()>, std::allocator<int>, void ()>::_M_run()::{lambda()#1}::operator()() const /usr/include/c++/7/future:1421 (gdb+0x737b59)
    #9 std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, std::__future_base::_Task_state<std::function<void ()>, std::allocator<int>, void ()>::_M_run()::{lambda()#1}, void>::operator()() const /usr/include/c++/7/future:1362 (gdb+0x738660)
    #10 std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, std::__future_base::_Task_state<std::function<void ()>, std::allocator<int>, void ()>::_M_run()::{lambda()#1}, void> >::_M_invoke(std::_Any_data const&) /usr/include/c++/7/bits/std_function.h:302 (gdb+0x73825c)
    #11 std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>::operator()() const /usr/include/c++/7/bits/std_function.h:706 (gdb+0x733623)
    #12 std::__future_base::_State_baseV2::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*) /usr/include/c++/7/future:561 (gdb+0x732bdf)
    #13 void std::__invoke_impl<void, void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(std::__invoke_memfun_deref, void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&) /usr/include/c++/7/bits/invoke.h:73 (gdb+0x734c4f)
    #14 std::__invoke_result<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>::type std::__invoke<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&) /usr/include/c++/7/bits/invoke.h:95 (gdb+0x733bc5)
    #15 std::call_once<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(std::once_flag&, void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&)::{lambda()#1}::operator()() const /usr/include/c++/7/mutex:672 (gdb+0x73300d)
    #16 std::call_once<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(std::once_flag&, void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&)::{lambda()#2}::operator()() const /usr/include/c++/7/mutex:677 (gdb+0x7330b2)
    #17 std::call_once<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(std::once_flag&, void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&)::{lambda()#2}::_FUN() /usr/include/c++/7/mutex:677 (gdb+0x7330f2)
    #18 pthread_once <null> (libtsan.so.0+0x4457c)
    #19 __gthread_once /usr/include/c++/7/x86_64-suse-linux/bits/gthr-default.h:699 (gdb+0x72f5dd)
    #20 void std::call_once<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(std::once_flag&, void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&) /usr/include/c++/7/mutex:684 (gdb+0x733224)
    #21 std::__future_base::_State_baseV2::_M_set_result(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>, bool) /usr/include/c++/7/future:401 (gdb+0x732852)
    #22 std::__future_base::_Task_state<std::function<void ()>, std::allocator<int>, void ()>::_M_run() /usr/include/c++/7/future:1423 (gdb+0x737bef)
    #23 std::packaged_task<void ()>::operator()() /usr/include/c++/7/future:1556 (gdb+0x1dac492)
    #24 gdb::thread_pool::thread_function() gdbsupport/thread-pool.cc:242 (gdb+0x1dabdb4)
    #25 void std::__invoke_impl<void, void (gdb::thread_pool::*)(), gdb::thread_pool*>(std::__invoke_memfun_deref, void (gdb::thread_pool::*&&)(), gdb::thread_pool*&&) /usr/include/c++/7/bits/invoke.h:73 (gdb+0x1dace63)
    #26 std::__invoke_result<void (gdb::thread_pool::*)(), gdb::thread_pool*>::type std::__invoke<void (gdb::thread_pool::*)(), gdb::thread_pool*>(void (gdb::thread_pool::*&&)(), gdb::thread_pool*&&) /usr/include/c++/7/bits/invoke.h:95 (gdb+0x1dac294)
    #27 decltype (__invoke((_S_declval<0ul>)(), (_S_declval<1ul>)())) std::thread::_Invoker<std::tuple<void (gdb::thread_pool::*)(), gdb::thread_pool*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) /usr/include/c++/7/thread:234 (gdb+0x1daf5c6)
    #28 std::thread::_Invoker<std::tuple<void (gdb::thread_pool::*)(), gdb::thread_pool*> >::operator()() /usr/include/c++/7/thread:243 (gdb+0x1daf551)
    #29 std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (gdb::thread_pool::*)(), gdb::thread_pool*> > >::_M_run() /usr/include/c++/7/thread:186 (gdb+0x1daf506)
    #30 <null> <null> (libstdc++.so.6+0xdcac2)

  Location is global 'global_index_cache' of size 48 at 0x00000321f520 (gdb+0x00000321f540)
  ...
SUMMARY: ThreadSanitizer: data race gdb/dwarf2/index-cache.c:76 in index_cache::enable()
...

The race happens when issuing a "file $exec" command followed by a
"set index-cache enabled on" command.

The race is between:
- a worker thread reading index_cache::m_enabled to determine whether an
  index-cache entry for $exec needs to be written
  (due to command "file $exec"), and
- the main thread setting index_cache::m_enabled
  (due to command "set index-cache enabled on").

Fix this by capturing the value of index_cache::m_enabled in the main thread,
and using the captured value in the worker thread.

Tested on x86_64-linux.

PR symtab/30392
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30392
tromey pushed a commit that referenced this issue Aug 4, 2023
With gdb build with -fsanitize=thread and test-case gdb.base/index-cache.exp I
run into:
...
(gdb) file build/gdb/testsuite/outputs/gdb.base/index-cache/index-cache
Reading symbols from build/gdb/testsuite/outputs/gdb.base/index-cache/index-cache...
==================
WARNING: ThreadSanitizer: data race (pid=12261)
  Write of size 4 at 0x7b4400097d08 by main thread:
    #0 bfd_open_file bfd/cache.c:584 (gdb+0x148bb92)
    #1 bfd_cache_lookup_worker bfd/cache.c:261 (gdb+0x148b12a)
    #2 cache_bseek bfd/cache.c:289 (gdb+0x148b324)
    #3 bfd_seek bfd/bfdio.c:459 (gdb+0x1489c31)
    #4 _bfd_generic_get_section_contents bfd/libbfd.c:1069 (gdb+0x14977a4)
    #5 bfd_get_section_contents bfd/section.c:1606 (gdb+0x149cc7c)
    #6 gdb_bfd_scan_elf_dyntag(int, bfd*, unsigned long*, unsigned long*) gdb/solib.c:1601 (gdb+0xed8eca)
    #7 elf_locate_base gdb/solib-svr4.c:705 (gdb+0xec28ac)
    #8 svr4_iterate_over_objfiles_in_search_order gdb/solib-svr4.c:3430 (gdb+0xeca55d)
    #9 gdbarch_iterate_over_objfiles_in_search_order(gdbarch*, gdb::function_view<bool (objfile*)>, objfile*) gdb/gdbarch.c:5041 (gdb+0x537cad)
    #10 find_main_name gdb/symtab.c:6270 (gdb+0xf743a5)
    #11 main_language() gdb/symtab.c:6313 (gdb+0xf74499)
    #12 set_initial_language() gdb/symfile.c:1700 (gdb+0xf4285c)
    #13 symbol_file_add_main_1 gdb/symfile.c:1212 (gdb+0xf40e2a)
    #14 symbol_file_command(char const*, int) gdb/symfile.c:1681 (gdb+0xf427d1)
    #15 file_command gdb/exec.c:554 (gdb+0x94f74b)
    #16 do_simple_func gdb/cli/cli-decode.c:95 (gdb+0x6d9528)
    #17 cmd_func(cmd_list_element*, char const*, int) gdb/cli/cli-decode.c:2735 (gdb+0x6e0f69)
    #18 execute_command(char const*, int) gdb/top.c:575 (gdb+0xff303c)
    #19 command_handler(char const*) gdb/event-top.c:552 (gdb+0x94adde)
    #20 command_line_handler(std::unique_ptr<char, gdb::xfree_deleter<char> >&&) gdb/event-top.c:788 (gdb+0x94b49b)
    #21 tui_command_line_handler gdb/tui/tui-interp.c:104 (gdb+0x103479c)
    #22 gdb_rl_callback_handler gdb/event-top.c:259 (gdb+0x94a383)
    #23 rl_callback_read_char readline/readline/callback.c:290 (gdb+0x11bde5d)
    #24 gdb_rl_callback_read_char_wrapper_noexcept gdb/event-top.c:195 (gdb+0x94a182)
    #25 gdb_rl_callback_read_char_wrapper gdb/event-top.c:234 (gdb+0x94a243)
    #26 stdin_event_handler gdb/ui.c:155 (gdb+0x1074a40)
    #27 handle_file_event gdbsupport/event-loop.cc:573 (gdb+0x1d94f02)
    #28 gdb_wait_for_event gdbsupport/event-loop.cc:694 (gdb+0x1d9563a)
    #29 gdb_do_one_event(int) gdbsupport/event-loop.cc:264 (gdb+0x1d93a26)
    #30 start_event_loop gdb/main.c:412 (gdb+0xb5a374)
    #31 captured_command_loop gdb/main.c:476 (gdb+0xb5a563)
    #32 captured_main gdb/main.c:1320 (gdb+0xb5c6e3)
    #33 gdb_main(captured_main_args*) gdb/main.c:1339 (gdb+0xb5c792)
    #34 main gdb/gdb.c:32 (gdb+0x416776)

  Previous read of size 1 at 0x7b4400097d08 by thread T12:
    #0 bfd_check_format_matches bfd/format.c:323 (gdb+0x1492db4)
    #1 bfd_check_format bfd/format.c:94 (gdb+0x1492104)
    #2 build_id_bfd_get(bfd*) gdb/build-id.c:42 (gdb+0x6648f7)
    #3 index_cache::store(dwarf2_per_bfd*, index_cache_store_context*) gdb/dwarf2/index-cache.c:110 (gdb+0x82d205)
    #4 cooked_index::maybe_write_index(dwarf2_per_bfd*) gdb/dwarf2/cooked-index.c:640 (gdb+0x7f1bf1)
    #5 operator() gdb/dwarf2/cooked-index.c:470 (gdb+0x7f0f40)
    #6 _M_invoke /usr/include/c++/7/bits/std_function.h:316 (gdb+0x7f28f7)
    #7 std::function<void ()>::operator()() const /usr/include/c++/7/bits/std_function.h:706 (gdb+0x700952)
    #8 void std::__invoke_impl<void, std::function<void ()>&>(std::__invoke_other, std::function<void ()>&) /usr/include/c++/7/bits/invoke.h:60 (gdb+0x7381a0)
    #9 std::__invoke_result<std::function<void ()>&>::type std::__invoke<std::function<void ()>&>(std::function<void ()>&) /usr/include/c++/7/bits/invoke.h:95 (gdb+0x737e91)
    #10 std::__future_base::_Task_state<std::function<void ()>, std::allocator<int>, void ()>::_M_run()::{lambda()#1}::operator()() const /usr/include/c++/7/future:1421 (gdb+0x737b59)
    #11 std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, std::__future_base::_Task_state<std::function<void ()>, std::allocator<int>, void ()>::_M_run()::{lambda()#1}, void>::operator()() const /usr/include/c++/7/future:1362 (gdb+0x738660)
    #12 std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, std::__future_base::_Task_state<std::function<void ()>, std::allocator<int>, void ()>::_M_run()::{lambda()#1}, void> >::_M_invoke(std::_Any_data const&) /usr/include/c++/7/bits/std_function.h:302 (gdb+0x73825c)
    #13 std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>::operator()() const /usr/include/c++/7/bits/std_function.h:706 (gdb+0x733623)
    #14 std::__future_base::_State_baseV2::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*) /usr/include/c++/7/future:561 (gdb+0x732bdf)
    #15 void std::__invoke_impl<void, void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(std::__invoke_memfun_deref, void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&) /usr/include/c++/7/bits/invoke.h:73 (gdb+0x734c4f)
    #16 std::__invoke_result<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>::type std::__invoke<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&) /usr/include/c++/7/bits/invoke.h:95 (gdb+0x733bc5)
    #17 std::call_once<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(std::once_flag&, void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&)::{lambda()#1}::operator()() const /usr/include/c++/7/mutex:672 (gdb+0x73300d)
    #18 std::call_once<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(std::once_flag&, void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&)::{lambda()#2}::operator()() const /usr/include/c++/7/mutex:677 (gdb+0x7330b2)
    #19 std::call_once<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(std::once_flag&, void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&)::{lambda()#2}::_FUN() /usr/include/c++/7/mutex:677 (gdb+0x7330f2)
    #20 pthread_once <null> (libtsan.so.0+0x4457c)
    #21 __gthread_once /usr/include/c++/7/x86_64-suse-linux/bits/gthr-default.h:699 (gdb+0x72f5dd)
    #22 void std::call_once<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(std::once_flag&, void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&) /usr/include/c++/7/mutex:684 (gdb+0x733224)
    #23 std::__future_base::_State_baseV2::_M_set_result(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>, bool) /usr/include/c++/7/future:401 (gdb+0x732852)
    #24 std::__future_base::_Task_state<std::function<void ()>, std::allocator<int>, void ()>::_M_run() /usr/include/c++/7/future:1423 (gdb+0x737bef)
    #25 std::packaged_task<void ()>::operator()() /usr/include/c++/7/future:1556 (gdb+0x1dac5b0)
    #26 gdb::thread_pool::thread_function() gdbsupport/thread-pool.cc:242 (gdb+0x1dabed2)
    #27 void std::__invoke_impl<void, void (gdb::thread_pool::*)(), gdb::thread_pool*>(std::__invoke_memfun_deref, void (gdb::thread_pool::*&&)(), gdb::thread_pool*&&) /usr/include/c++/7/bits/invoke.h:73 (gdb+0x1dacf81)
    #28 std::__invoke_result<void (gdb::thread_pool::*)(), gdb::thread_pool*>::type std::__invoke<void (gdb::thread_pool::*)(), gdb::thread_pool*>(void (gdb::thread_pool::*&&)(), gdb::thread_pool*&&) /usr/include/c++/7/bits/invoke.h:95 (gdb+0x1dac3b2)
    #29 decltype (__invoke((_S_declval<0ul>)(), (_S_declval<1ul>)())) std::thread::_Invoker<std::tuple<void (gdb::thread_pool::*)(), gdb::thread_pool*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) /usr/include/c++/7/thread:234 (gdb+0x1daf6e4)
    #30 std::thread::_Invoker<std::tuple<void (gdb::thread_pool::*)(), gdb::thread_pool*> >::operator()() /usr/include/c++/7/thread:243 (gdb+0x1daf66f)
    #31 std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (gdb::thread_pool::*)(), gdb::thread_pool*> > >::_M_run() /usr/include/c++/7/thread:186 (gdb+0x1daf624)
    #32 <null> <null> (libstdc++.so.6+0xdcac2)
  ...
SUMMARY: ThreadSanitizer: data race bfd/cache.c:584 in bfd_open_file
...

The race happens when issuing the "file $exec" command.

The race is between:
- a worker thread getting the build id while writing the index cache, and in
  the process reading bfd::format, and
- the main thread calling find_main_name, and in the process setting
  bfd::cacheable.

The two bitfields bfd::cacheable and bfd::format share the same bitfield
container.

Fix this by capturing the build id in the main thread, and using the captured
value in the worker thread.

Likewise for the dwz build id, which likely suffers from the same issue.

While we're at it, also move the creation of the cache directory to
the index_cache_store_context constructor, to:
- make sure there's no race between subsequent file commands, and
- issue any related warning or error messages during the file command.

Tested on x86_64-linux.

Approved-By: Tom Tromey <tom@tromey.com>

PR symtab/30392
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30392
tromey pushed a commit that referenced this issue Aug 4, 2023
With gdb build with -fsanitize=thread and test-case gdb.base/index-cache.exp I
run into:
...
(gdb) file build/gdb/testsuite/outputs/gdb.base/index-cache/index-cache
Reading symbols from build/gdb/testsuite/outputs/gdb.base/index-cache/index-cache...
==================
WARNING: ThreadSanitizer: data race (pid=24296)
  Write of size 1 at 0x7b200000420d by main thread:
    #0 queue_comp_unit gdb/dwarf2/read.c:5564 (gdb+0x8939ce)
    #1 dw2_do_instantiate_symtab gdb/dwarf2/read.c:1754 (gdb+0x885b96)
    #2 dw2_instantiate_symtab gdb/dwarf2/read.c:1792 (gdb+0x885d86)
    #3 dw2_expand_symtabs_matching_one(dwarf2_per_cu_data*, dwarf2_per_objfile*, gdb::function_view<bool (char const*, bool)>, gdb::function_view<bool (compunit_symtab*)>) gdb/dwarf2/read.c:3042 (gdb+0x88ac77)
    #4 cooked_index_functions::expand_symtabs_matching(objfile*, gdb::function_view<bool (char const*, bool)>, lookup_name_info const*, gdb::function_view<bool (char const*)>, gdb::function_view<bool (compunit_symtab*)>, enum_flags<block_search_flag_values>, domain_enum, search_domain) gdb/dwarf2/read.c:16915 (gdb+0x8c1c8a)
    #5 objfile::lookup_symbol(block_enum, char const*, domain_enum) gdb/symfile-debug.c:288 (gdb+0xf389a1)
    #6 lookup_symbol_via_quick_fns gdb/symtab.c:2385 (gdb+0xf66403)
    #7 lookup_symbol_in_objfile gdb/symtab.c:2516 (gdb+0xf66a67)
    #8 operator() gdb/symtab.c:2562 (gdb+0xf66bbe)
    #9 operator() gdb/../gdbsupport/function-view.h:305 (gdb+0xf76ffd)
    #10 _FUN gdb/../gdbsupport/function-view.h:299 (gdb+0xf77054)
    #11 gdb::function_view<bool (objfile*)>::operator()(objfile*) const gdb/../gdbsupport/function-view.h:289 (gdb+0xc3f5e3)
    #12 svr4_iterate_over_objfiles_in_search_order gdb/solib-svr4.c:3455 (gdb+0xeca793)
    #13 gdbarch_iterate_over_objfiles_in_search_order(gdbarch*, gdb::function_view<bool (objfile*)>, objfile*) gdb/gdbarch.c:5041 (gdb+0x537cad)
    #14 lookup_global_or_static_symbol gdb/symtab.c:2559 (gdb+0xf66e47)
    #15 lookup_global_symbol(char const*, block const*, domain_enum) gdb/symtab.c:2615 (gdb+0xf670cc)
    #16 language_defn::lookup_symbol_nonlocal(char const*, block const*, domain_enum) const gdb/symtab.c:2447 (gdb+0xf666ba)
    #17 lookup_symbol_aux gdb/symtab.c:2123 (gdb+0xf655ff)
    #18 lookup_symbol_in_language(char const*, block const*, domain_enum, language, field_of_this_result*) gdb/symtab.c:1931 (gdb+0xf646f7)
    #19 set_initial_language() gdb/symfile.c:1708 (gdb+0xf429c0)
    #20 symbol_file_add_main_1 gdb/symfile.c:1212 (gdb+0xf40f54)
    #21 symbol_file_command(char const*, int) gdb/symfile.c:1681 (gdb+0xf428fb)
    #22 file_command gdb/exec.c:554 (gdb+0x94f875)
    #23 do_simple_func gdb/cli/cli-decode.c:95 (gdb+0x6d9528)
    #24 cmd_func(cmd_list_element*, char const*, int) gdb/cli/cli-decode.c:2735 (gdb+0x6e0f69)
    #25 execute_command(char const*, int) gdb/top.c:575 (gdb+0xff3166)
    #26 command_handler(char const*) gdb/event-top.c:552 (gdb+0x94af08)
    #27 command_line_handler(std::unique_ptr<char, gdb::xfree_deleter<char> >&&) gdb/event-top.c:788 (gdb+0x94b5c5)
    #28 tui_command_line_handler gdb/tui/tui-interp.c:104 (gdb+0x10348c6)
    #29 gdb_rl_callback_handler gdb/event-top.c:259 (gdb+0x94a4ad)
    #30 rl_callback_read_char readline/readline/callback.c:290 (gdb+0x11bdf87)
    #31 gdb_rl_callback_read_char_wrapper_noexcept gdb/event-top.c:195 (gdb+0x94a2ac)
    #32 gdb_rl_callback_read_char_wrapper gdb/event-top.c:234 (gdb+0x94a36d)
    #33 stdin_event_handler gdb/ui.c:155 (gdb+0x1074b6a)
    #34 handle_file_event gdbsupport/event-loop.cc:573 (gdb+0x1d9502c)
    #35 gdb_wait_for_event gdbsupport/event-loop.cc:694 (gdb+0x1d95764)
    #36 gdb_do_one_event(int) gdbsupport/event-loop.cc:264 (gdb+0x1d93b50)
    #37 start_event_loop gdb/main.c:412 (gdb+0xb5a49e)
    #38 captured_command_loop gdb/main.c:476 (gdb+0xb5a68d)
    #39 captured_main gdb/main.c:1320 (gdb+0xb5c80d)
    #40 gdb_main(captured_main_args*) gdb/main.c:1339 (gdb+0xb5c8bc)
    #41 main gdb/gdb.c:32 (gdb+0x416776)

  Previous read of size 1 at 0x7b200000420d by thread T12:
    #0 write_gdbindex gdb/dwarf2/index-write.c:1229 (gdb+0x8310c8)
    #1 write_dwarf_index(dwarf2_per_bfd*, char const*, char const*, char const*, dw_index_kind) gdb/dwarf2/index-write.c:1484 (gdb+0x83232f)
    #2 index_cache::store(dwarf2_per_bfd*, index_cache_store_context*) gdb/dwarf2/index-cache.c:177 (gdb+0x82d62b)
    #3 cooked_index::maybe_write_index(dwarf2_per_bfd*) gdb/dwarf2/cooked-index.c:640 (gdb+0x7f1bf7)
    #4 operator() gdb/dwarf2/cooked-index.c:470 (gdb+0x7f0f40)
    #5 _M_invoke /usr/include/c++/7/bits/std_function.h:316 (gdb+0x7f2909)
    #6 std::function<void ()>::operator()() const /usr/include/c++/7/bits/std_function.h:706 (gdb+0x700952)
    #7 void std::__invoke_impl<void, std::function<void ()>&>(std::__invoke_other, std::function<void ()>&) /usr/include/c++/7/bits/invoke.h:60 (gdb+0x7381a0)
    #8 std::__invoke_result<std::function<void ()>&>::type std::__invoke<std::function<void ()>&>(std::function<void ()>&) /usr/include/c++/7/bits/invoke.h:95 (gdb+0x737e91)
    #9 std::__future_base::_Task_state<std::function<void ()>, std::allocator<int>, void ()>::_M_run()::{lambda()#1}::operator()() const /usr/include/c++/7/future:1421 (gdb+0x737b59)
    #10 std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, std::__future_base::_Task_state<std::function<void ()>, std::allocator<int>, void ()>::_M_run()::{lambda()#1}, void>::operator()() const /usr/include/c++/7/future:1362 (gdb+0x738660)
    #11 std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, std::__future_base::_Task_state<std::function<void ()>, std::allocator<int>, void ()>::_M_run()::{lambda()#1}, void> >::_M_invoke(std::_Any_data const&) /usr/include/c++/7/bits/std_function.h:302 (gdb+0x73825c)
    #12 std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>::operator()() const /usr/include/c++/7/bits/std_function.h:706 (gdb+0x733623)
    #13 std::__future_base::_State_baseV2::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*) /usr/include/c++/7/future:561 (gdb+0x732bdf)
    #14 void std::__invoke_impl<void, void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(std::__invoke_memfun_deref, void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&) /usr/include/c++/7/bits/invoke.h:73 (gdb+0x734c4f)
    #15 std::__invoke_result<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>::type std::__invoke<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&) /usr/include/c++/7/bits/invoke.h:95 (gdb+0x733bc5)
    #16 std::call_once<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(std::once_flag&, void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&)::{lambda()#1}::operator()() const /usr/include/c++/7/mutex:672 (gdb+0x73300d)
    #17 std::call_once<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(std::once_flag&, void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&)::{lambda()#2}::operator()() const /usr/include/c++/7/mutex:677 (gdb+0x7330b2)
    #18 std::call_once<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(std::once_flag&, void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&)::{lambda()#2}::_FUN() /usr/include/c++/7/mutex:677 (gdb+0x7330f2)
    #19 pthread_once <null> (libtsan.so.0+0x4457c)
    #20 __gthread_once /usr/include/c++/7/x86_64-suse-linux/bits/gthr-default.h:699 (gdb+0x72f5dd)
    #21 void std::call_once<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(std::once_flag&, void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&) /usr/include/c++/7/mutex:684 (gdb+0x733224)
    #22 std::__future_base::_State_baseV2::_M_set_result(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>, bool) /usr/include/c++/7/future:401 (gdb+0x732852)
    #23 std::__future_base::_Task_state<std::function<void ()>, std::allocator<int>, void ()>::_M_run() /usr/include/c++/7/future:1423 (gdb+0x737bef)
    #24 std::packaged_task<void ()>::operator()() /usr/include/c++/7/future:1556 (gdb+0x1dac6da)
    #25 gdb::thread_pool::thread_function() gdbsupport/thread-pool.cc:242 (gdb+0x1dabffc)
    #26 void std::__invoke_impl<void, void (gdb::thread_pool::*)(), gdb::thread_pool*>(std::__invoke_memfun_deref, void (gdb::thread_pool::*&&)(), gdb::thread_pool*&&) /usr/include/c++/7/bits/invoke.h:73 (gdb+0x1dad0ab)
    #27 std::__invoke_result<void (gdb::thread_pool::*)(), gdb::thread_pool*>::type std::__invoke<void (gdb::thread_pool::*)(), gdb::thread_pool*>(void (gdb::thread_pool::*&&)(), gdb::thread_pool*&&) /usr/include/c++/7/bits/invoke.h:95 (gdb+0x1dac4dc)
    #28 decltype (__invoke((_S_declval<0ul>)(), (_S_declval<1ul>)())) std::thread::_Invoker<std::tuple<void (gdb::thread_pool::*)(), gdb::thread_pool*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) /usr/include/c++/7/thread:234 (gdb+0x1daf80e)
    #29 std::thread::_Invoker<std::tuple<void (gdb::thread_pool::*)(), gdb::thread_pool*> >::operator()() /usr/include/c++/7/thread:243 (gdb+0x1daf799)
    #30 std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (gdb::thread_pool::*)(), gdb::thread_pool*> > >::_M_run() /usr/include/c++/7/thread:186 (gdb+0x1daf74e)
    #31 <null> <null> (libstdc++.so.6+0xdcac2)
 ...
SUMMARY: ThreadSanitizer: data race gdb/dwarf2/read.c:5564 in queue_comp_unit
...

The race happens when issuing the "file $exec" command.

The race is between:
- a worker thread writing the index cache, and in the process reading
  dwarf2_per_cu_data::is_debug_type, and
- the main thread expanding the CU containing main, and in the process setting
  dwarf2_per_cu_data::queued.

The two bitfields dwarf2_per_cu_data::queue and
dwarf2_per_cu_data::is_debug_type share the same bitfield container.

Fix this by making dwarf2_per_cu_data::queued a packed<bool, 1>.

Tested on x86_64-linux.

Approved-By: Tom Tromey <tom@tromey.com>

PR symtab/30392
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30392
tromey pushed a commit that referenced this issue Aug 4, 2023
…s_debug_type}

With gdb build with -fsanitize=thread and test-case gdb.base/index-cache.exp
and target board debug-types, I run into:
...
(gdb) file build/gdb/testsuite/outputs/gdb.base/index-cache/index-cache
Reading symbols from build/gdb/testsuite/outputs/gdb.base/index-cache/index-cache...
==================
WARNING: ThreadSanitizer: data race (pid=9654)
  Write of size 1 at 0x7b200000420d by main thread:
    #0 dwarf2_per_cu_data::get_header() const gdb/dwarf2/read.c:21513 (gdb+0x8d1eee)
    #1 dwarf2_per_cu_data::addr_size() const gdb/dwarf2/read.c:21524 (gdb+0x8d1f4e)
    #2 dwarf2_cu::addr_type() const gdb/dwarf2/cu.c:112 (gdb+0x806327)
    #3 set_die_type gdb/dwarf2/read.c:21932 (gdb+0x8d3870)
    #4 read_base_type gdb/dwarf2/read.c:15448 (gdb+0x8bcacb)
    #5 read_type_die_1 gdb/dwarf2/read.c:19832 (gdb+0x8cc0a5)
    #6 read_type_die gdb/dwarf2/read.c:19767 (gdb+0x8cbe6d)
    #7 lookup_die_type gdb/dwarf2/read.c:19739 (gdb+0x8cbdc7)
    #8 die_type gdb/dwarf2/read.c:19593 (gdb+0x8cb68a)
    #9 read_subroutine_type gdb/dwarf2/read.c:14648 (gdb+0x8b998e)
    #10 read_type_die_1 gdb/dwarf2/read.c:19792 (gdb+0x8cbf2f)
    #11 read_type_die gdb/dwarf2/read.c:19767 (gdb+0x8cbe6d)
    #12 read_func_scope gdb/dwarf2/read.c:10154 (gdb+0x8a4f36)
    #13 process_die gdb/dwarf2/read.c:6667 (gdb+0x898daa)
    #14 read_file_scope gdb/dwarf2/read.c:7682 (gdb+0x89bad8)
    #15 process_die gdb/dwarf2/read.c:6654 (gdb+0x898ced)
    #16 process_full_comp_unit gdb/dwarf2/read.c:6418 (gdb+0x8981de)
    #17 process_queue gdb/dwarf2/read.c:5690 (gdb+0x894433)
    #18 dw2_do_instantiate_symtab gdb/dwarf2/read.c:1770 (gdb+0x88623a)
    #19 dw2_instantiate_symtab gdb/dwarf2/read.c:1792 (gdb+0x886300)
    #20 dw2_expand_symtabs_matching_one(dwarf2_per_cu_data*, dwarf2_per_objfile*, gdb::function_view<bool (char const*, bool)>, gdb::function_view<bool (compunit_symtab*)>) gdb/dwarf2/read.c:3042 (gdb+0x88b1f1)
    #21 cooked_index_functions::expand_symtabs_matching(objfile*, gdb::function_view<bool (char const*, bool)>, lookup_name_info const*, gdb::function_view<bool (char const*)>, gdb::function_view<bool (compunit_symtab*)>, enum_flags<block_search_flag_values>, domain_enum, search_domain) gdb/dwarf2/read.c:16917 (gdb+0x8c228e)
    #22 objfile::lookup_symbol(block_enum, char const*, domain_enum) gdb/symfile-debug.c:288 (gdb+0xf39055)
    #23 lookup_symbol_via_quick_fns gdb/symtab.c:2385 (gdb+0xf66ab7)
    #24 lookup_symbol_in_objfile gdb/symtab.c:2516 (gdb+0xf6711b)
    #25 operator() gdb/symtab.c:2562 (gdb+0xf67272)
    #26 operator() gdb/../gdbsupport/function-view.h:305 (gdb+0xf776b1)
    #27 _FUN gdb/../gdbsupport/function-view.h:299 (gdb+0xf77708)
    #28 gdb::function_view<bool (objfile*)>::operator()(objfile*) const gdb/../gdbsupport/function-view.h:289 (gdb+0xc3fc97)
    #29 svr4_iterate_over_objfiles_in_search_order gdb/solib-svr4.c:3455 (gdb+0xecae47)
    #30 gdbarch_iterate_over_objfiles_in_search_order(gdbarch*, gdb::function_view<bool (objfile*)>, objfile*) gdb/gdbarch.c:5041 (gdb+0x537cad)
    #31 lookup_global_or_static_symbol gdb/symtab.c:2559 (gdb+0xf674fb)
    #32 lookup_global_symbol(char const*, block const*, domain_enum) gdb/symtab.c:2615 (gdb+0xf67780)
    #33 language_defn::lookup_symbol_nonlocal(char const*, block const*, domain_enum) const gdb/symtab.c:2447 (gdb+0xf66d6e)
    #34 lookup_symbol_aux gdb/symtab.c:2123 (gdb+0xf65cb3)
    #35 lookup_symbol_in_language(char const*, block const*, domain_enum, language, field_of_this_result*) gdb/symtab.c:1931 (gdb+0xf64dab)
    #36 set_initial_language() gdb/symfile.c:1708 (gdb+0xf43074)
    #37 symbol_file_add_main_1 gdb/symfile.c:1212 (gdb+0xf41608)
    #38 symbol_file_command(char const*, int) gdb/symfile.c:1681 (gdb+0xf42faf)
    #39 file_command gdb/exec.c:554 (gdb+0x94ff29)
    #40 do_simple_func gdb/cli/cli-decode.c:95 (gdb+0x6d9528)
    #41 cmd_func(cmd_list_element*, char const*, int) gdb/cli/cli-decode.c:2735 (gdb+0x6e0f69)
    #42 execute_command(char const*, int) gdb/top.c:575 (gdb+0xff379c)
    #43 command_handler(char const*) gdb/event-top.c:552 (gdb+0x94b5bc)
    #44 command_line_handler(std::unique_ptr<char, gdb::xfree_deleter<char> >&&) gdb/event-top.c:788 (gdb+0x94bc79)
    #45 tui_command_line_handler gdb/tui/tui-interp.c:104 (gdb+0x1034efc)
    #46 gdb_rl_callback_handler gdb/event-top.c:259 (gdb+0x94ab61)
    #47 rl_callback_read_char readline/readline/callback.c:290 (gdb+0x11be4ef)
    #48 gdb_rl_callback_read_char_wrapper_noexcept gdb/event-top.c:195 (gdb+0x94a960)
    #49 gdb_rl_callback_read_char_wrapper gdb/event-top.c:234 (gdb+0x94aa21)
    #50 stdin_event_handler gdb/ui.c:155 (gdb+0x10751a0)
    #51 handle_file_event gdbsupport/event-loop.cc:573 (gdb+0x1d95bac)
    #52 gdb_wait_for_event gdbsupport/event-loop.cc:694 (gdb+0x1d962e4)
    #53 gdb_do_one_event(int) gdbsupport/event-loop.cc:264 (gdb+0x1d946d0)
    #54 start_event_loop gdb/main.c:412 (gdb+0xb5ab52)
    #55 captured_command_loop gdb/main.c:476 (gdb+0xb5ad41)
    #56 captured_main gdb/main.c:1320 (gdb+0xb5cec1)
    #57 gdb_main(captured_main_args*) gdb/main.c:1339 (gdb+0xb5cf70)
    #58 main gdb/gdb.c:32 (gdb+0x416776)

  Previous read of size 1 at 0x7b200000420d by thread T11:
    #0 write_gdbindex gdb/dwarf2/index-write.c:1229 (gdb+0x831630)
    #1 write_dwarf_index(dwarf2_per_bfd*, char const*, char const*, char const*, dw_index_kind) gdb/dwarf2/index-write.c:1484 (gdb+0x832897)
    #2 index_cache::store(dwarf2_per_bfd*, index_cache_store_context const&) gdb/dwarf2/index-cache.c:173 (gdb+0x82db8d)
    #3 cooked_index::maybe_write_index(dwarf2_per_bfd*, index_cache_store_context const&) gdb/dwarf2/cooked-index.c:645 (gdb+0x7f1d49)
    #4 operator() gdb/dwarf2/cooked-index.c:474 (gdb+0x7f0f31)
    #5 _M_invoke /usr/include/c++/7/bits/std_function.h:316 (gdb+0x7f2a13)
    #6 std::function<void ()>::operator()() const /usr/include/c++/7/bits/std_function.h:706 (gdb+0x700952)
    #7 void std::__invoke_impl<void, std::function<void ()>&>(std::__invoke_other, std::function<void ()>&) /usr/include/c++/7/bits/invoke.h:60 (gdb+0x7381a0)
    #8 std::__invoke_result<std::function<void ()>&>::type std::__invoke<std::function<void ()>&>(std::function<void ()>&) /usr/include/c++/7/bits/invoke.h:95 (gdb+0x737e91)
    #9 std::__future_base::_Task_state<std::function<void ()>, std::allocator<int>, void ()>::_M_run()::{lambda()#1}::operator()() const /usr/include/c++/7/future:1421 (gdb+0x737b59)
    #10 std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, std::__future_base::_Task_state<std::function<void ()>, std::allocator<int>, void ()>::_M_run()::{lambda()#1}, void>::operator()() const /usr/include/c++/7/future:1362 (gdb+0x738660)
    #11 std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, std::__future_base::_Task_state<std::function<void ()>, std::allocator<int>, void ()>::_M_run()::{lambda()#1}, void> >::_M_invoke(std::_Any_data const&) /usr/include/c++/7/bits/std_function.h:302 (gdb+0x73825c)
    #12 std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>::operator()() const /usr/include/c++/7/bits/std_function.h:706 (gdb+0x733623)
    #13 std::__future_base::_State_baseV2::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*) /usr/include/c++/7/future:561 (gdb+0x732bdf)
    #14 void std::__invoke_impl<void, void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(std::__invoke_memfun_deref, void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&) /usr/include/c++/7/bits/invoke.h:73 (gdb+0x734c4f)
    #15 std::__invoke_result<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>::type std::__invoke<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&) /usr/include/c++/7/bits/invoke.h:95 (gdb+0x733bc5)
    #16 std::call_once<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(std::once_flag&, void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&)::{lambda()#1}::operator()() const /usr/include/c++/7/mutex:672 (gdb+0x73300d)
    #17 std::call_once<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(std::once_flag&, void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&)::{lambda()#2}::operator()() const /usr/include/c++/7/mutex:677 (gdb+0x7330b2)
    #18 std::call_once<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(std::once_flag&, void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&)::{lambda()#2}::_FUN() /usr/include/c++/7/mutex:677 (gdb+0x7330f2)
    #19 pthread_once <null> (libtsan.so.0+0x4457c)
    #20 __gthread_once /usr/include/c++/7/x86_64-suse-linux/bits/gthr-default.h:699 (gdb+0x72f5dd)
    #21 void std::call_once<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(std::once_flag&, void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&) /usr/include/c++/7/mutex:684 (gdb+0x733224)
    #22 std::__future_base::_State_baseV2::_M_set_result(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>, bool) /usr/include/c++/7/future:401 (gdb+0x732852)
    #23 std::__future_base::_Task_state<std::function<void ()>, std::allocator<int>, void ()>::_M_run() /usr/include/c++/7/future:1423 (gdb+0x737bef)
    #24 std::packaged_task<void ()>::operator()() /usr/include/c++/7/future:1556 (gdb+0x1dad25a)
    #25 gdb::thread_pool::thread_function() gdbsupport/thread-pool.cc:242 (gdb+0x1dacb7c)
    #26 void std::__invoke_impl<void, void (gdb::thread_pool::*)(), gdb::thread_pool*>(std::__invoke_memfun_deref, void (gdb::thread_pool::*&&)(), gdb::thread_pool*&&) /usr/include/c++/7/bits/invoke.h:73 (gdb+0x1dadc2b)
    #27 std::__invoke_result<void (gdb::thread_pool::*)(), gdb::thread_pool*>::type std::__invoke<void (gdb::thread_pool::*)(), gdb::thread_pool*>(void (gdb::thread_pool::*&&)(), gdb::thread_pool*&&) /usr/include/c++/7/bits/invoke.h:95 (gdb+0x1dad05c)
    #28 decltype (__invoke((_S_declval<0ul>)(), (_S_declval<1ul>)())) std::thread::_Invoker<std::tuple<void (gdb::thread_pool::*)(), gdb::thread_pool*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) /usr/include/c++/7/thread:234 (gdb+0x1db038e)
    #29 std::thread::_Invoker<std::tuple<void (gdb::thread_pool::*)(), gdb::thread_pool*> >::operator()() /usr/include/c++/7/thread:243 (gdb+0x1db0319)
    #30 std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (gdb::thread_pool::*)(), gdb::thread_pool*> > >::_M_run() /usr/include/c++/7/thread:186 (gdb+0x1db02ce)
    #31 <null> <null> (libstdc++.so.6+0xdcac2)
  ...
SUMMARY: ThreadSanitizer: data race gdb/dwarf2/read.c:21513 in dwarf2_per_cu_data::get_header() const
...

The race happens when issuing the "file $exec" command.

The race is between:
- a worker thread writing the index cache, and in the process reading
   dwarf2_per_cu_data::is_debug_type, and
- the main thread writing to dwarf2_per_cu_data::m_header_read_in.

The two bitfields dwarf2_per_cu_data::m_header_read_in and
dwarf2_per_cu_data::is_debug_type share the same bitfield container.

Fix this by making dwarf2_per_cu_data::m_header_read_in a packed<bool, 1>.

Tested on x86_64-linux.

Approved-By: Tom Tromey <tom@tromey.com>

PR symtab/30392
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30392
tromey pushed a commit that referenced this issue Aug 4, 2023
With gdb build with -fsanitize=thread, and the exec from test-case
gdb.base/index-cache.exp, I run into:
...
$ rm -f ~/.cache/gdb/*; \
  gdb -q -batch -iex "set index-cache enabled on" index-cache \
    -ex "print foobar"
  ...
WARNING: ThreadSanitizer: data race (pid=23970)
  Write of size 1 at 0x7b200000410d by main thread:
    #0 dw_expand_symtabs_matching_file_matcher(dwarf2_per_objfile*, gdb::function_view<bool (char const*, bool)>) gdb/dwarf2/read.c:3077 (gdb+0x7ac54e)
    #1 cooked_index_functions::expand_symtabs_matching(objfile*, gdb::function_view<bool (char const*, bool)>, lookup_name_info const*, gdb::function_view<bool (char const*)>, gdb::function_view<bool (compunit_symtab*)>, enum_flags<block_search_flag_values>, domain_enum, search_domain) gdb/dwarf2/read.c:16812 (gdb+0x7d039f)
    #2 objfile::map_symtabs_matching_filename(char const*, char const*, gdb::function_view<bool (symtab*)>) gdb/symfile-debug.c:219 (gdb+0xda5aee)
    #3 iterate_over_symtabs(char const*, gdb::function_view<bool (symtab*)>) gdb/symtab.c:648 (gdb+0xdc439d)
    #4 lookup_symtab(char const*) gdb/symtab.c:662 (gdb+0xdc44a2)
    #5 classify_name gdb/c-exp.y:3083 (gdb+0x61afec)
    #6 c_yylex gdb/c-exp.y:3251 (gdb+0x61dd13)
    #7 c_yyparse() build/gdb/c-exp.c.tmp:1988 (gdb+0x61f07e)
    #8 c_parse(parser_state*) gdb/c-exp.y:3417 (gdb+0x62d864)
    #9 language_defn::parser(parser_state*) const gdb/language.c:598 (gdb+0x9771c5)
    #10 parse_exp_in_context gdb/parse.c:414 (gdb+0xb10a9b)
    #11 parse_expression(char const*, innermost_block_tracker*, enum_flags<parser_flag>) gdb/parse.c:462 (gdb+0xb110ae)
    #12 process_print_command_args gdb/printcmd.c:1321 (gdb+0xb4bf0c)
    #13 print_command_1 gdb/printcmd.c:1335 (gdb+0xb4ca2a)
    #14 print_command gdb/printcmd.c:1468 (gdb+0xb4cd5a)
    #15 do_simple_func gdb/cli/cli-decode.c:95 (gdb+0x65b078)
    #16 cmd_func(cmd_list_element*, char const*, int) gdb/cli/cli-decode.c:2735 (gdb+0x65ed53)
    #17 execute_command(char const*, int) gdb/top.c:575 (gdb+0xe3a76a)
    #18 catch_command_errors gdb/main.c:518 (gdb+0xa1837d)
    #19 execute_cmdargs gdb/main.c:617 (gdb+0xa1853f)
    #20 captured_main_1 gdb/main.c:1289 (gdb+0xa1aa58)
    #21 captured_main gdb/main.c:1310 (gdb+0xa1b95a)
    #22 gdb_main(captured_main_args*) gdb/main.c:1339 (gdb+0xa1b95a)
    #23 main gdb/gdb.c:39 (gdb+0x42506a)

  Previous read of size 1 at 0x7b200000410d by thread T1:
    #0 write_gdbindex gdb/dwarf2/index-write.c:1214 (gdb+0x75bb30)
    #1 write_dwarf_index(dwarf2_per_bfd*, char const*, char const*, char const*, dw_index_kind) gdb/dwarf2/index-write.c:1469 (gdb+0x75f803)
    #2 index_cache::store(dwarf2_per_bfd*, index_cache_store_context const&) gdb/dwarf2/index-cache.c:173 (gdb+0x755a36)
    #3 cooked_index::maybe_write_index(dwarf2_per_bfd*, index_cache_store_context const&) gdb/dwarf2/cooked-index.c:642 (gdb+0x71c96d)
    #4 operator() gdb/dwarf2/cooked-index.c:471 (gdb+0x71c96d)
    #5 _M_invoke /usr/include/c++/7/bits/std_function.h:316 (gdb+0x71c96d)
    #6 std::function<void ()>::operator()() const /usr/include/c++/7/bits/std_function.h:706 (gdb+0x72a57c)
    #7 void std::__invoke_impl<void, std::function<void ()>&>(std::__invoke_other, std::function<void ()>&) /usr/include/c++/7/bits/invoke.h:60 (gdb+0x72a5db)
    #8 std::__invoke_result<std::function<void ()>&>::type std::__invoke<std::function<void ()>&>(std::function<void ()>&) /usr/include/c++/7/bits/invoke.h:95 (gdb+0x72a5db)
    #9 std::__future_base::_Task_state<std::function<void ()>, std::allocator<int>, void ()>::_M_run()::{lambda()#1}::operator()() const /usr/include/c++/7/future:1421 (gdb+0x72a5db)
    #10 std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, std::__future_base::_Task_state<std::function<void ()>, std::allocator<int>, void ()>::_M_run()::{lambda()#1}, void>::operator()() const /usr/include/c++/7/future:1362 (gdb+0x72a5db)
    #11 std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, std::__future_base::_Task_state<std::function<void ()>, std::allocator<int>, void ()>::_M_run()::{lambda()#1}, void> >::_M_invoke(std::_Any_data const&) /usr/include/c++/7/bits/std_function.h:302 (gdb+0x72a5db)
    #12 std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>::operator()() const /usr/include/c++/7/bits/std_function.h:706 (gdb+0x724954)
    #13 std::__future_base::_State_baseV2::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*) /usr/include/c++/7/future:561 (gdb+0x724954)
    #14 void std::__invoke_impl<void, void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(std::__invoke_memfun_deref, void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&) /usr/include/c++/7/bits/invoke.h:73 (gdb+0x72434a)
    #15 std::__invoke_result<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>::type std::__invoke<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&) /usr/include/c++/7/bits/invoke.h:95 (gdb+0x72434a)
    #16 std::call_once<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(std::once_flag&, void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&)::{lambda()#1}::operator()() const /usr/include/c++/7/mutex:672 (gdb+0x72434a)
    #17 std::call_once<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(std::once_flag&, void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&)::{lambda()#2}::operator()() const /usr/include/c++/7/mutex:677 (gdb+0x72434a)
    #18 std::call_once<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(std::once_flag&, void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&)::{lambda()#2}::_FUN() /usr/include/c++/7/mutex:677 (gdb+0x72434a)
    #19 pthread_once <null> (libtsan.so.0+0x4457c)
    #20 __gthread_once /usr/include/c++/7/x86_64-suse-linux/bits/gthr-default.h:699 (gdb+0x72532b)
    #21 void std::call_once<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(std::once_flag&, void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&) /usr/include/c++/7/mutex:684 (gdb+0x72532b)
    #22 std::__future_base::_State_baseV2::_M_set_result(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>, bool) /usr/include/c++/7/future:401 (gdb+0x174568d)
    #23 std::__future_base::_Task_state<std::function<void ()>, std::allocator<int>, void ()>::_M_run() /usr/include/c++/7/future:1423 (gdb+0x174568d)
    #24 std::packaged_task<void ()>::operator()() /usr/include/c++/7/future:1556 (gdb+0x174568d)
    #25 gdb::thread_pool::thread_function() gdbsupport/thread-pool.cc:242 (gdb+0x174568d)
    #26 void std::__invoke_impl<void, void (gdb::thread_pool::*)(), gdb::thread_pool*>(std::__invoke_memfun_deref, void (gdb::thread_pool::*&&)(), gdb::thread_pool*&&) /usr/include/c++/7/bits/invoke.h:73 (gdb+0x1748040)
    #27 std::__invoke_result<void (gdb::thread_pool::*)(), gdb::thread_pool*>::type std::__invoke<void (gdb::thread_pool::*)(), gdb::thread_pool*>(void (gdb::thread_pool::*&&)(), gdb::thread_pool*&&) /usr/include/c++/7/bits/invoke.h:95 (gdb+0x1748040)
    #28 decltype (__invoke((_S_declval<0ul>)(), (_S_declval<1ul>)())) std::thread::_Invoker<std::tuple<void (gdb::thread_pool::*)(), gdb::thread_pool*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) /usr/include/c++/7/thread:234 (gdb+0x1748040)
    #29 std::thread::_Invoker<std::tuple<void (gdb::thread_pool::*)(), gdb::thread_pool*> >::operator()() /usr/include/c++/7/thread:243 (gdb+0x1748040)
    #30 std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (gdb::thread_pool::*)(), gdb::thread_pool*> > >::_M_run() /usr/include/c++/7/thread:186 (gdb+0x1748040)
    #31 <null> <null> (libstdc++.so.6+0xdcac2)
  ...
SUMMARY: ThreadSanitizer: data race gdb/dwarf2/read.c:3077 in dw_expand_symtabs_matching_file_matcher(dwarf2_per_objfile*, gdb::function_view<bool (char const*, bool)>)
...

The race happens when issuing the "file $exec" command.

The race is between:
- a worker thread writing the index cache, and in the process reading
  dwarf2_per_cu_data::is_debug_type, and
- the main thread writing to dwarf2_per_cu_data::mark.

The two bitfields dwarf2_per_cu_data::mark and
dwarf2_per_cu_data::is_debug_type share the same bitfield container.

Fix this by making dwarf2_per_cu_data::mark a packed<unsigned int, 1>.

Tested on x86_64-linux.

PR symtab/30718
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30718
tromey pushed a commit that referenced this issue Aug 4, 2023
…g_types}

With gdb build with -fsanitize=thread, and the exec from test-case
gdb.base/index-cache.exp, I run into:
...
$ rm -f ~/.cache/gdb/*; \
  gdb -q -batch -iex "set index-cache enabled on" index-cache \
    -ex "print foobar"
  ...
WARNING: ThreadSanitizer: data race (pid=25018)
  Write of size 1 at 0x7b200000410d by main thread:
    #0 dw2_get_file_names_reader gdb/dwarf2/read.c:2033 (gdb+0x7ab023)
    #1 dw2_get_file_names gdb/dwarf2/read.c:2130 (gdb+0x7ab023)
    #2 dw_expand_symtabs_matching_file_matcher(dwarf2_per_objfile*, gdb::function_view<bool (char const*, bool)>) gdb/dwarf2/read.c:3105 (gdb+0x7ac6e9)
    #3 cooked_index_functions::expand_symtabs_matching(objfile*, gdb::function_view<bool (char const*, bool)>, lookup_name_info const*, gdb::function_view<bool (char const*)>, gdb::function_view<bool (compunit_symtab*)>, enum_flags<block_search_flag_values>, domain_enum, search_domain) gdb/dwarf2/read.c:16812 (gdb+0x7d040f)
    #4 objfile::map_symtabs_matching_filename(char const*, char const*, gdb::function_view<bool (symtab*)>) gdb/symfile-debug.c:219 (gdb+0xda5b6e)
    #5 iterate_over_symtabs(char const*, gdb::function_view<bool (symtab*)>) gdb/symtab.c:648 (gdb+0xdc441d)
    #6 lookup_symtab(char const*) gdb/symtab.c:662 (gdb+0xdc4522)
    #7 classify_name gdb/c-exp.y:3083 (gdb+0x61afec)
    #8 c_yylex gdb/c-exp.y:3251 (gdb+0x61dd13)
    #9 c_yyparse() build/gdb/c-exp.c.tmp:1988 (gdb+0x61f07e)
    #10 c_parse(parser_state*) gdb/c-exp.y:3417 (gdb+0x62d864)
    #11 language_defn::parser(parser_state*) const gdb/language.c:598 (gdb+0x977245)
    #12 parse_exp_in_context gdb/parse.c:414 (gdb+0xb10b1b)
    #13 parse_expression(char const*, innermost_block_tracker*, enum_flags<parser_flag>) gdb/parse.c:462 (gdb+0xb1112e)
    #14 process_print_command_args gdb/printcmd.c:1321 (gdb+0xb4bf8c)
    #15 print_command_1 gdb/printcmd.c:1335 (gdb+0xb4caaa)
    #16 print_command gdb/printcmd.c:1468 (gdb+0xb4cdda)
    #17 do_simple_func gdb/cli/cli-decode.c:95 (gdb+0x65b078)
    #18 cmd_func(cmd_list_element*, char const*, int) gdb/cli/cli-decode.c:2735 (gdb+0x65ed53)
    #19 execute_command(char const*, int) gdb/top.c:575 (gdb+0xe3a7ea)
    #20 catch_command_errors gdb/main.c:518 (gdb+0xa183fd)
    #21 execute_cmdargs gdb/main.c:617 (gdb+0xa185bf)
    #22 captured_main_1 gdb/main.c:1289 (gdb+0xa1aad8)
    #23 captured_main gdb/main.c:1310 (gdb+0xa1b9da)
    #24 gdb_main(captured_main_args*) gdb/main.c:1339 (gdb+0xa1b9da)
    #25 main gdb/gdb.c:39 (gdb+0x42506a)

  Previous read of size 1 at 0x7b200000410d by thread T2:
    #0 write_gdbindex gdb/dwarf2/index-write.c:1214 (gdb+0x75bb30)
    #1 write_dwarf_index(dwarf2_per_bfd*, char const*, char const*, char const*, dw_index_kind) gdb/dwarf2/index-write.c:1469 (gdb+0x75f803)
    #2 index_cache::store(dwarf2_per_bfd*, index_cache_store_context const&) gdb/dwarf2/index-cache.c:173 (gdb+0x755a36)
    #3 cooked_index::maybe_write_index(dwarf2_per_bfd*, index_cache_store_context const&) gdb/dwarf2/cooked-index.c:642 (gdb+0x71c96d)
    #4 operator() gdb/dwarf2/cooked-index.c:471 (gdb+0x71c96d)
    #5 _M_invoke /usr/include/c++/7/bits/std_function.h:316 (gdb+0x71c96d)
    #6 std::function<void ()>::operator()() const /usr/include/c++/7/bits/std_function.h:706 (gdb+0x72a57c)
    #7 void std::__invoke_impl<void, std::function<void ()>&>(std::__invoke_other, std::function<void ()>&) /usr/include/c++/7/bits/invoke.h:60 (gdb+0x72a5db)
    #8 std::__invoke_result<std::function<void ()>&>::type std::__invoke<std::function<void ()>&>(std::function<void ()>&) /usr/include/c++/7/bits/invoke.h:95 (gdb+0x72a5db)
    #9 std::__future_base::_Task_state<std::function<void ()>, std::allocator<int>, void ()>::_M_run()::{lambda()#1}::operator()() const /usr/include/c++/7/future:1421 (gdb+0x72a5db)
    #10 std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, std::__future_base::_Task_state<std::function<void ()>, std::allocator<int>, void ()>::_M_run()::{lambda()#1}, void>::operator()() const /usr/include/c++/7/future:1362 (gdb+0x72a5db)
    #11 std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, std::__future_base::_Task_state<std::function<void ()>, std::allocator<int>, void ()>::_M_run()::{lambda()#1}, void> >::_M_invoke(std::_Any_data const&) /usr/include/c++/7/bits/std_function.h:302 (gdb+0x72a5db)
    #12 std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>::operator()() const /usr/include/c++/7/bits/std_function.h:706 (gdb+0x724954)
    #13 std::__future_base::_State_baseV2::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*) /usr/include/c++/7/future:561 (gdb+0x724954)
    #14 void std::__invoke_impl<void, void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(std::__invoke_memfun_deref, void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&) /usr/include/c++/7/bits/invoke.h:73 (gdb+0x72434a)
    #15 std::__invoke_result<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>::type std::__invoke<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&) /usr/include/c++/7/bits/invoke.h:95 (gdb+0x72434a)
    #16 std::call_once<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(std::once_flag&, void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&)::{lambda()#1}::operator()() const /usr/include/c++/7/mutex:672 (gdb+0x72434a)
    #17 std::call_once<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(std::once_flag&, void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&)::{lambda()#2}::operator()() const /usr/include/c++/7/mutex:677 (gdb+0x72434a)
    #18 std::call_once<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(std::once_flag&, void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&)::{lambda()#2}::_FUN() /usr/include/c++/7/mutex:677 (gdb+0x72434a)
    #19 pthread_once <null> (libtsan.so.0+0x4457c)
    #20 __gthread_once /usr/include/c++/7/x86_64-suse-linux/bits/gthr-default.h:699 (gdb+0x72532b)
    #21 void std::call_once<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(std::once_flag&, void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&) /usr/include/c++/7/mutex:684 (gdb+0x72532b)
    #22 std::__future_base::_State_baseV2::_M_set_result(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>, bool) /usr/include/c++/7/future:401 (gdb+0x174570d)
    #23 std::__future_base::_Task_state<std::function<void ()>, std::allocator<int>, void ()>::_M_run() /usr/include/c++/7/future:1423 (gdb+0x174570d)
    #24 std::packaged_task<void ()>::operator()() /usr/include/c++/7/future:1556 (gdb+0x174570d)
    #25 gdb::thread_pool::thread_function() gdbsupport/thread-pool.cc:242 (gdb+0x174570d)
    #26 void std::__invoke_impl<void, void (gdb::thread_pool::*)(), gdb::thread_pool*>(std::__invoke_memfun_deref, void (gdb::thread_pool::*&&)(), gdb::thread_pool*&&) /usr/include/c++/7/bits/invoke.h:73 (gdb+0x17480c0)
    #27 std::__invoke_result<void (gdb::thread_pool::*)(), gdb::thread_pool*>::type std::__invoke<void (gdb::thread_pool::*)(), gdb::thread_pool*>(void (gdb::thread_pool::*&&)(), gdb::thread_pool*&&) /usr/include/c++/7/bits/invoke.h:95 (gdb+0x17480c0)
    #28 decltype (__invoke((_S_declval<0ul>)(), (_S_declval<1ul>)())) std::thread::_Invoker<std::tuple<void (gdb::thread_pool::*)(), gdb::thread_pool*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) /usr/include/c++/7/thread:234 (gdb+0x17480c0)
    #29 std::thread::_Invoker<std::tuple<void (gdb::thread_pool::*)(), gdb::thread_pool*> >::operator()() /usr/include/c++/7/thread:243 (gdb+0x17480c0)
    #30 std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (gdb::thread_pool::*)(), gdb::thread_pool*> > >::_M_run() /usr/include/c++/7/thread:186 (gdb+0x17480c0)
    #31 <null> <null> (libstdc++.so.6+0xdcac2)
  ...
SUMMARY: ThreadSanitizer: data race gdb/dwarf2/read.c:2033 in dw2_get_file_names_reader
...

The race happens when issuing the "file $exec" command.

The race is between:
- a worker thread writing the index cache, and in the process reading
  dwarf2_per_cu_data::is_debug_type, and
- the main thread writing to dwarf2_per_cu_data::files_read.

The two bitfields dwarf2_per_cu_data::files_read and
dwarf2_per_cu_data::is_debug_type share the same bitfield container.

Fix this by making dwarf2_per_cu_data::files_read a packed<bool, 1>.

Tested on x86_64-linux.

PR symtab/30718
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30718
tromey pushed a commit that referenced this issue Oct 1, 2023
This commit fixes an issue that was discovered while writing the tests
for the previous commit.

I noticed that, when GDB restarts an inferior, the executable_changed
event would trigger twice.  The first notification would originate
from:

  #0  exec_file_attach (filename=0x4046680 "/tmp/hello.x", from_tty=0) at ../../src/gdb/exec.c:513
  #1  0x00000000006f3adb in reopen_exec_file () at ../../src/gdb/corefile.c:122
  #2  0x0000000000e6a3f2 in generic_mourn_inferior () at ../../src/gdb/target.c:3682
  #3  0x0000000000995121 in inf_child_target::mourn_inferior (this=0x2fe95c0 <the_amd64_linux_nat_target>) at ../../src/gdb/inf-child.c:192
  #4  0x0000000000995cff in inf_ptrace_target::mourn_inferior (this=0x2fe95c0 <the_amd64_linux_nat_target>) at ../../src/gdb/inf-ptrace.c:125
  #5  0x0000000000a32472 in linux_nat_target::mourn_inferior (this=0x2fe95c0 <the_amd64_linux_nat_target>) at ../../src/gdb/linux-nat.c:3609
  #6  0x0000000000e68a40 in target_mourn_inferior (ptid=...) at ../../src/gdb/target.c:2761
  #7  0x0000000000a323ec in linux_nat_target::kill (this=0x2fe95c0 <the_amd64_linux_nat_target>) at ../../src/gdb/linux-nat.c:3593
  #8  0x0000000000e64d1c in target_kill () at ../../src/gdb/target.c:924
  #9  0x00000000009a19bc in kill_if_already_running (from_tty=1) at ../../src/gdb/infcmd.c:328
  #10 0x00000000009a1a6f in run_command_1 (args=0x0, from_tty=1, run_how=RUN_STOP_AT_MAIN) at ../../src/gdb/infcmd.c:381
  #11 0x00000000009a20a5 in start_command (args=0x0, from_tty=1) at ../../src/gdb/infcmd.c:527
  #12 0x000000000068dc5d in do_simple_func (args=0x0, from_tty=1, c=0x35c7200) at ../../src/gdb/cli/cli-decode.c:95

While the second originates from:

  #0  exec_file_attach (filename=0x3d7a1d0 "/tmp/hello.x", from_tty=0) at ../../src/gdb/exec.c:513
  #1  0x0000000000dfe525 in reread_symbols (from_tty=1) at ../../src/gdb/symfile.c:2517
  #2  0x00000000009a1a98 in run_command_1 (args=0x0, from_tty=1, run_how=RUN_STOP_AT_MAIN) at ../../src/gdb/infcmd.c:398
  #3  0x00000000009a20a5 in start_command (args=0x0, from_tty=1) at ../../src/gdb/infcmd.c:527
  #4  0x000000000068dc5d in do_simple_func (args=0x0, from_tty=1, c=0x35c7200) at ../../src/gdb/cli/cli-decode.c:95

In the first case the call to exec_file_attach first passes through
reopen_exec_file.  The reopen_exec_file performs a modification time
check on the executable file, and only calls exec_file_attach if the
executable has changed on disk since it was last loaded.

However, in the second case things work a little differently.  In this
case GDB is really trying to reread the debug symbol.  As such, we
iterate over the objfiles list, and for each of those we check the
modification time, if the file on disk has changed then we reload the
debug symbols from that file.

However, there is an additional check, if the objfile has the same
name as the executable then we will call exec_file_attach, but we do
so without checking the cached modification time that indicates when
the executable was last reloaded, as a result, we reload the
executable twice.

In this commit I propose that reread_symbols be changed to
unconditionally call reopen_exec_file before performing the objfile
iteration.  This will ensure that, if the executable has changed, then
the executable will be reloaded, however, if the executable has
already been recently reloaded, we will not reload it for a second
time.

After handling the executable, GDB can then iterate over the objfiles
list and reload them in the normal way.

With this done I now see the executable reloaded only once when GDB
restarts an inferior, which means I can remove the kfail that I added
to the gdb.python/py-exec-file.exp test in the previous commit.

Approved-By: Tom Tromey <tom@tromey.com>
tromey pushed a commit that referenced this issue Oct 5, 2023
It was pointed out on the mailing list that a recently added
test (gdb.python/py-progspace-events.exp) was failing when run with
the native-extended-gdbserver board.  This test was added with this
commit:

  commit 59912fb
  Date:   Tue Sep 19 11:45:36 2023 +0100

      gdb: add Python events for program space addition and removal

It turns out though that the test is failing due to a existing bug
in GDB, the new test just exposes the problem.  Additionally, the
failure really doesn't even rely on the new functionality added in the
above commit.  I reduced the test to a simple set of steps that
reproduced the failure and tested against GDB 13, and the test passes;
so the bug was introduced since then.  In fact, the bug was introduced
with this commit:

  commit a282736
  Date:   Fri Sep 8 15:48:16 2023 +0100

      gdb: remove final user of the executable_changed observer

This commit changed how the per-inferior auxv data cache is managed,
specifically, when the cache is cleared, and it is this that leads to
the failure.

This bug is interesting because it exposes a number of issues with
GDB, I'll explain all of the problems I see, though ultimately, I only
propose fixing one problem in this commit, which is enough to resolve
the crash we are currently seeing.

The crash that we are seeing manifests like this:

  ...
  [Inferior 2 (process 3970384) exited normally]
  +inferior 1
  [Switching to inferior 1 [process 3970383] (/tmp/build/gdb/testsuite/outputs/gdb.python/py-progspace-events/py-progspace-events)]
  [Switching to thread 1.1 (Thread 3970383.3970383)]
  #0  breakpt () at /tmp/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.python/py-progspace-events.c:28
  28	{ /* Nothing.  */ }
  (gdb) step
  +step
  terminate called after throwing an instance of 'gdb_exception_error'

  Fatal signal: Aborted
  ... etc ...

What's happening is that GDB attempts to refill the auxv cache as a
result of the gdbarch_has_shared_address_space call in
program_space::~program_space, the backtrace looks like this:

  #0  0x00007fb4f419a9a5 in raise () from /lib64/libpthread.so.0
  #1  0x00000000008b635d in handle_fatal_signal (sig=6) at ../../src/gdb/event-top.c:912
  #2  <signal handler called>
  #3  0x00007fb4f38e3625 in raise () from /lib64/libc.so.6
  #4  0x00007fb4f38cc8d9 in abort () from /lib64/libc.so.6
  #5  0x00007fb4f3c70756 in __gnu_cxx::__verbose_terminate_handler() [clone .cold] () from /lib64/libstdc++.so.6
  #6  0x00007fb4f3c7c6dc in __cxxabiv1::__terminate(void (*)()) () from /lib64/libstdc++.so.6
  #7  0x00007fb4f3c7b6e9 in __cxa_call_terminate () from /lib64/libstdc++.so.6
  #8  0x00007fb4f3c7c094 in __gxx_personality_v0 () from /lib64/libstdc++.so.6
  #9  0x00007fb4f3a80c63 in _Unwind_RaiseException_Phase2 () from /lib64/libgcc_s.so.1
  #10 0x00007fb4f3a8154e in _Unwind_Resume () from /lib64/libgcc_s.so.1
  #11 0x0000000000e8832d in target_read_alloc_1<unsigned char> (ops=0x408a3a0, object=TARGET_OBJECT_AUXV, annex=0x0) at ../../src/gdb/target.c:2266
  #12 0x0000000000e73dea in target_read_alloc (ops=0x408a3a0, object=TARGET_OBJECT_AUXV, annex=0x0) at ../../src/gdb/target.c:2315
  #13 0x000000000058248c in target_read_auxv_raw (ops=0x408a3a0) at ../../src/gdb/auxv.c:379
  #14 0x000000000058243d in target_read_auxv () at ../../src/gdb/auxv.c:368
  #15 0x000000000058255c in target_auxv_search (match=0x0, valp=0x7ffdee17c598) at ../../src/gdb/auxv.c:415
  #16 0x0000000000a464bb in linux_is_uclinux () at ../../src/gdb/linux-tdep.c:433
  #17 0x0000000000a464f6 in linux_has_shared_address_space (gdbarch=0x409a2d0) at ../../src/gdb/linux-tdep.c:440
  #18 0x0000000000510eae in gdbarch_has_shared_address_space (gdbarch=0x409a2d0) at ../../src/gdb/gdbarch.c:4889
  #19 0x0000000000bc7558 in program_space::~program_space (this=0x4544aa0, __in_chrg=<optimized out>) at ../../src/gdb/progspace.c:124
  #20 0x00000000009b245d in delete_inferior (inf=0x47b3de0) at ../../src/gdb/inferior.c:290
  #21 0x00000000009b2c10 in prune_inferiors () at ../../src/gdb/inferior.c:480
  #22 0x00000000009c5e3e in fetch_inferior_event () at ../../src/gdb/infrun.c:4558
  #23 0x000000000099b4dc in inferior_event_handler (event_type=INF_REG_EVENT) at ../../src/gdb/inf-loop.c:42
  #24 0x0000000000cbc64f in remote_async_serial_handler (scb=0x4090a30, context=0x408a6b0) at ../../src/gdb/remote.c:14859
  #25 0x0000000000d83d3a in run_async_handler_and_reschedule (scb=0x4090a30) at ../../src/gdb/ser-base.c:138
  #26 0x0000000000d83e1f in fd_event (error=0, context=0x4090a30) at ../../src/gdb/ser-base.c:189

So this is problem #1, if we throw an exception while deleting a
program_space then this is not caught, and is going to crash GDB.

Problem #2 becomes evident when we ask why GDB is throwing an error in
this case; the error is thrown because the remote target, operating in
non-async mode, can't read the auxv data while an inferior is running
and GDB is waiting for a stop reply.  The problem here then, is why
does GDB get into a position where it tries to interact with the
remote target in this way, at this time?  The problem is caused by the
prune_inferiors call which can be seen in the above backtrace.

In prune_inferiors we check if the inferior is deletable, and if it
is, we delete it.  The problem is, I think, we should also check if
the target is currently in a state that would allow us to delete the
inferior.  We don't currently have such a check available, we'd need
to add one, but for the remote target, this would return false if the
remote is in async mode and the remote is currently waiting for a stop
reply.  With this change in place GDB would defer deleting the
inferior until the remote target has stopped, at which point GDB would
be able to refill the auxv cache successfully.

And then, problem #3 becomes evident when we ask why GDB is needing to
refill the auxv cache now when it didn't need to for GDB 13.  This is
where the second commit mentioned above (a282736) comes in.
Prior to this commit, the auxv cache was cleared by the
executable_changed observer, while after that commit the auxv cache
was cleared by the new_objfile observer -- but only when the
new_objfile observer is used in the special mode that actually means
that all objfiles have been unloaded (I know, the overloading of the
new_objfile observer is horrible, and unnecessary, but it's not really
important for this bug).

The difference arises because the new_objfile observer is triggered
from clear_symtab_users, which in turn is called from
program_space::~program_space.  The new_objfile observer for auxv does
this:

  static void
  auxv_new_objfile_observer (struct objfile *objfile)
  {
    if (objfile == nullptr)
      invalidate_auxv_cache_inf (current_inferior ());
  }

That is, when all the objfiles are unloaded, we clear the auxv cache
for the current inferior.

The problem is, then when we look at the prune_inferiors ->
delete_inferior -> ~program_space path, we see that the current
inferior is not going to be an inferior that exists within the
program_space being deleted; delete_inferior removes the deleted
inferior from the global inferior list, and then only deletes the
program_space if program_space::empty() returns true, which is only
the case if the current inferior isn't within the program_space to
delete, and no other inferior exists within that program_space
either.

What this means is that when the new_objfile observer is called we
can't rely on the current inferior having any relationship with the
program space in which the objfiles were removed.  This was an error
in the commit a282736, the only thing we can rely on is the
current program space.  As a result of this mistake, after commit
a282736, GDB was sometimes clearing the auxv cache for a random
inferior.  In the native target case this was harmless as we can
always refill the cache when needed, but in the remote target case, if
we need to refill the cache when the remote target is executing, then
we get the crash we observed.

And additionally, if we think about this a little more, we see that
commit a282736 made another mistake.  When all the objfiles are
removed, they are removed from a program_space, a program_space might
contain multiple inferiors, so surely, we should clear the auxv cache
for all of the matching inferiors?

Given these two insights, that the current_inferior is not relevant,
only the current_program_space, and that we should be clearing the
cache for all inferiors in the current_program_space, we can update
auxv_new_objfile_observer to:

  if (objfile == nullptr)
    {
      for (inferior *inf : all_inferiors ())
	{
	  if (inf->pspace == current_program_space)
	    invalidate_auxv_cache_inf (inf);
	}
    }

With this change we now correctly clear the auxv cache for the correct
inferiors, and GDB no longer needs to refill the cache at an
inconvenient time, this avoids the crash we were seeing.

And finally, we reach problem #4.  Inspired by the observation that
using the current_inferior from within the ~program_space function was
not correct, I added some debug to see if current_inferior() was
called anywhere else (below ~program_space), and the answer is yes,
it's called a often.  Mostly the culprit is GDB doing:

  current_inferior ()->top_target ()-> ....

But I think all of these calls are most likely doing the wrong thing,
and only work because the top target in all these cases is shared
between all inferiors, e.g. it's the native target, or the remote
target for all inferiors.  But if we had a truly multi-connection
setup, then we might start to see odd behaviour.

Problem #1 I'm just ignoring for now, I guess at some point we might
run into this again, and then we'd need to solve this.  But in this
case I wasn't sure what a "good" solution would look like.  We need
the auxv data in order to implement the linux_is_uclinux() function.
If we can't get the auxv data then what should we do, assume yes, or
assume no?  The right answer would probably be to propagate the error
back up the stack, but then we reach ~program_space, and throwing
exceptions from a destructor is problematic, so we'd need to catch and
deal at this point.  The linux_is_uclinux() call is made from within
gdbarch_has_shared_address_space(), which is used like:

  if (!gdbarch_has_shared_address_space (target_gdbarch ()))
    delete this->aspace;

So, we would have to choose; delete the address space or not.  If we
delete it on error, then we might delete an address space that is
shared within another program space.  If we don't delete the address
space, then we might leak it.  Neither choice is great.

A better solution might be to have the address spaces be reference
counted, then we could remove the gdbarch_has_shared_address_space
call completely, and just rely on the reference count to auto-delete
the address space when appropriate.

The solution for problem #2 I already hinted at above, we should have
a new target_can_delete_inferiors() call, which should be called from
prune_inferiors, this would prevent GDB from trying to delete
inferiors when a (remote) target is in a state where we know it can't
delete the inferior.  Deleting an inferior often (always?) requires
sending packets to the remote, and if the remote is waiting for a stop
reply then this will never work, so the pruning should be deferred in
this case.

The solution for problem #3 is included in this commit.

And, for problem #4, I'm not sure what the right solution is.  Maybe
delete_inferior should ensure the inferior to be deleted is in place
when ~program_space is called?  But that seems a little weird, as the
current inferior would, in theory, still be using the current
program_space...

Anyway, after this commit, the gdb.python/py-progspace-events.exp test
now passes when run with the native-extended-remote board.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30935
Approved-By: Simon Marchi <simon.marchi@efficios.com>
Change-Id: I41f0e6e2d7ecc1e5e55ec170f37acd4052f46eaf
tromey pushed a commit that referenced this issue Nov 14, 2023
Running the
gdb.threads/step-over-thread-exit-while-stop-all-threads.exp testcase
added later in the series against gdbserver, after the
TARGET_WAITKIND_NO_RESUMED fix from the following patch, would run
into an infinite loop in stop_all_threads, leading to a timeout:

  FAIL: gdb.threads/step-over-thread-exit-while-stop-all-threads.exp: displaced-stepping=off: target-non-stop=on: iter 0: continue (timeout)

The is really a latent bug, and it is about the fact that
stop_all_threads stops listening to events from a target as soon as it
sees a TARGET_WAITKIND_NO_RESUMED, ignoring that
TARGET_WAITKIND_NO_RESUMED may be delayed.  handle_no_resumed knows
how to handle delayed no-resumed events, but stop_all_threads was
never taught to.

In more detail, here's what happens with that testcase:

#1 - Multiple threads report breakpoint hits to gdb.

#2 - gdb picks one events, and it's for thread 1.  All other stops are
     left pending.  thread 1 needs to move past a breakpoint, so gdb
     stops all threads to start an inline step over for thread 1.
     While stopping threads, some of the threads that were still
     running report events that are also left pending.

#2 - gdb steps thread 1

#3 - Thread 1 exits while stepping (it steps over an exit syscall),
     gdbserver reports thread exit for thread 1

#4 - Thread 1 was the last resumed thread, so gdbserver also reports
     no-resumed:

    [remote]   Notification received: Stop:w0;p3445d0.3445d3
    [remote] Sending packet: $vStopped#55
    [remote] Packet received: N
    [remote] Sending packet: $vStopped#55
    [remote] Packet received: OK

#5 - gdb processes the thread exit for thread 1, finishes the step
     over and restarts threads.

#6 - gdb picks the next event to process out of one of the resumed
     threads with pending events:

    [infrun] random_resumed_with_pending_wait_status: Found 32 events, selecting #11

#7 - This is again a breakpoint hit and the breakpoint needs to be
     stepped over too, so gdb starts a step-over dance again.

#8 - We reach stop_all_threads, which finds that some threads need to
     be stopped.

#9 - wait_one finally consumes the no-resumed event queue by #4.
     Seeing this, wait_one disable target async, to stop listening for
     events out of the remote target.

#10 - We still haven't seen all the stops expected, so
      stop_all_threads tries another iteration.

#11 - Because the remote target is no longer async, and there are no
      other targets, wait_one return no-resumed immediately without
      polling the remote target.

#12 - We still haven't seen all the stops expected, so
      stop_all_threads tries another iteration.  goto #11, looping
      forever.

Fix this by explicitly enabling/re-enabling target async on targets
that can async, before waiting for stops.

Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: Ie3ffb0df89635585a6631aa842689cecc989e33f
tromey pushed a commit that referenced this issue Dec 4, 2023
When using GDB on native linux, it can happen that, while attempting
to detach an inferior, the inferior may have been exited or have been
killed, yet still be in the list of lwps.  Should that happen, the
assert in x86_linux_update_debug_registers in
gdb/nat/x86-linux-dregs.c will trigger.  The line in question looks
like this:

  gdb_assert (lwp_is_stopped (lwp));

For this case, the lwp isn't stopped - it's dead.

The bug which brought this problem to my attention is one in which the
pwntools library uses GDB to to debug a process; as the script is
shutting things down, it kills the process that GDB is debugging and
also sends GDB a SIGTERM signal, which causes GDB to detach all
inferiors prior to exiting.  Here's a link to the bug:

https://bugzilla.redhat.com/show_bug.cgi?id=2192169

The following shell command mimics part of what the pwntools
reproducer script does (with regard to shutting things down), but
reproduces the bug much less reliably.  I have found it necessary to
run the command a bunch of times before seeing the bug.  (I usually
see it within 5-10 repetitions.)  If you choose to try this command,
make sure that you have no running "cat" or "gdb" processes first!

  cat </dev/zero >/dev/null & \
  (sleep 5; (kill -KILL `pgrep cat` & kill -TERM `pgrep gdb`)) & \
  sleep 1 ; \
  gdb -q -iex 'set debuginfod enabled off' -ex 'set height 0' \
      -ex c /usr/bin/cat `pgrep cat`

So, basically, the idea here is to kill both gdb and cat at roughly
the same time.  If we happen to attempt the detach before the process
lwp has been deleted from GDB's (linux native) LWP data structures,
then the assert will trigger.  The relevant part of the backtrace
looks like this:

  #8  0x00000000008a83ae in x86_linux_update_debug_registers (lwp=0x1873280)
      at gdb/nat/x86-linux-dregs.c:146
  #9  0x00000000008a862f in x86_linux_prepare_to_resume (lwp=0x1873280)
      at gdb/nat/x86-linux.c:81
  #10 0x000000000048ea42 in x86_linux_nat_target::low_prepare_to_resume (
      this=0x121eee0 <the_amd64_linux_nat_target>, lwp=0x1873280)
      at gdb/x86-linux-nat.h:70
  #11 0x000000000081a452 in detach_one_lwp (lp=0x1873280, signo_p=0x7fff8ca3441c)
      at gdb/linux-nat.c:1374
  #12 0x000000000081a85f in linux_nat_target::detach (
      this=0x121eee0 <the_amd64_linux_nat_target>, inf=0x16e8f70, from_tty=0)
      at gdb/linux-nat.c:1450
  #13 0x000000000083a23b in thread_db_target::detach (
      this=0x1206ae0 <the_thread_db_target>, inf=0x16e8f70, from_tty=0)
      at gdb/linux-thread-db.c:1385
  #14 0x0000000000a66722 in target_detach (inf=0x16e8f70, from_tty=0)
      at gdb/target.c:2526
  #15 0x0000000000a8f0ad in kill_or_detach (inf=0x16e8f70, from_tty=0)
      at gdb/top.c:1659
  #16 0x0000000000a8f4fa in quit_force (exit_arg=0x0, from_tty=0)
      at gdb/top.c:1762
  #17 0x000000000070829c in async_sigterm_handler (arg=0x0)
      at gdb/event-top.c:1141

My colleague, Andrew Burgess, has done some recent work on other
problems with detach.  Upon hearing of this problem, he came up a test
case which reliably reproduces the problem and tests for a few other
problems as well.  In addition to testing detach when the inferior has
terminated due to a signal, it also tests detach when the inferior has
exited normally.  Andrew observed that the linux-native-only
"checkpoint" command would be affected too, so the test also tests
those cases when there's an active checkpoint.

For the LWP exit / termination case with no checkpoint, that's handled
via newly added checks of the waitstatus in detach_one_lwp in
linux-nat.c.

For the checkpoint detach problem, I chose to pass the lwp_info
to linux_fork_detach in linux-fork.c.  With that in place, suitable
tests were added before attempting a PTRACE_DETACH operation.

I added a few asserts at the beginning of linux_fork_detach and
modified the caller code so that the newly added asserts shouldn't
trigger.  (That's what the 'pid == inferior_ptid.pid' check is about
in gdb/linux-nat.c.)

Lastly, I'll note that the checkpoint code needs some work with regard
to background execution.  This patch doesn't attempt to fix that
problem, but it doesn't make it any worse.  It does slightly improve
the situation with detach because, due to the check noted above,
linux_fork_detach() won't be called for the wrong inferior when there
are multiple inferiors.  (There are at least two other problems with
the checkpoint code when there are multiple inferiors.  See:
https://sourceware.org/bugzilla/show_bug.cgi?id=31065)

This commit also adds a new test,
gdb.base/process-dies-while-detaching.exp.  Andrew Burgess is the
primary author of this test case.  Its design is similar to that of
gdb.threads/main-thread-exit-during-detach.exp, which was also written
by Andrew.

This test checks that GDB correctly handles several cases that can
occur when GDB attempts to detach an inferior process.  The process
can exit or be terminated (e.g.  via SIGKILL) prior to GDB's event
loop getting a chance to remove it from GDB's internal data
structures.  To complicate things even more, detach works differently
when a checkpoint (created via GDB's "checkpoint" command) exists for
the inferior.  This test checks all four possibilities: process exit
with no checkpoint, process termination with no checkpoint, process
exit with a checkpoint, and process termination with a checkpoint.

Co-Authored-By: Andrew Burgess <aburgess@redhat.com>
Approved-By: Andrew Burgess <aburgess@redhat.com>
tromey pushed a commit that referenced this issue Feb 13, 2024
When running test-case gdb.dap/eof.exp, it occasionally coredumps.

The thread triggering the coredump is:
...
 #0  0x0000ffff42bb2280 in __pthread_kill_implementation () from /lib64/libc.so.6
 #1  0x0000ffff42b65800 [PAC] in raise () from /lib64/libc.so.6
 #2  0x00000000007b03e8 [PAC] in handle_fatal_signal (sig=11)
     at gdb/event-top.c:926
 #3  0x00000000007b0470 in handle_sigsegv (sig=11)
     at gdb/event-top.c:976
 #4  <signal handler called>
 #5  0x0000000000606080 in cli_ui_out::do_message (this=0xffff2f7ed728, style=...,
     format=0xffff0c002af1 "%s", args=...) at gdb/cli-out.c:232
 #6  0x0000000000ce6358 in ui_out::call_do_message (this=0xffff2f7ed728, style=...,
     format=0xffff0c002af1 "%s") at gdb/ui-out.c:584
 #7  0x0000000000ce6610 in ui_out::vmessage (this=0xffff2f7ed728, in_style=...,
     format=0x16f93ea "", args=...) at gdb/ui-out.c:621
 #8  0x0000000000ce3a9c in ui_file::vprintf (this=0xfffffbea1b18, ...)
     at gdb/ui-file.c:74
 #9  0x0000000000d2b148 in gdb_vprintf (stream=0xfffffbea1b18, format=0x16f93e8 "%s",
     args=...) at gdb/utils.c:1898
 #10 0x0000000000d2b23c in gdb_printf (stream=0xfffffbea1b18, format=0x16f93e8 "%s")
     at gdb/utils.c:1913
 #11 0x0000000000ab5208 in gdbpy_write (self=0x33fe35d0, args=0x342ec280, kw=0x345c08b0)
     at gdb/python/python.c:1464
 #12 0x0000ffff434acedc in cfunction_call () from /lib64/libpython3.12.so.1.0
 #13 0x0000ffff4347c500 [PAC] in _PyObject_MakeTpCall ()
     from /lib64/libpython3.12.so.1.0
 #14 0x0000ffff43488b64 [PAC] in _PyEval_EvalFrameDefault ()
    from /lib64/libpython3.12.so.1.0
 #15 0x0000ffff434d8cd0 [PAC] in method_vectorcall () from /lib64/libpython3.12.so.1.0
 #16 0x0000ffff434b9824 [PAC] in PyObject_CallOneArg () from /lib64/libpython3.12.so.1.0
 #17 0x0000ffff43557674 [PAC] in PyFile_WriteObject () from /lib64/libpython3.12.so.1.0
 #18 0x0000ffff435577a0 [PAC] in PyFile_WriteString () from /lib64/libpython3.12.so.1.0
 #19 0x0000ffff43465354 [PAC] in thread_excepthook () from /lib64/libpython3.12.so.1.0
 #20 0x0000ffff434ac6e0 [PAC] in cfunction_vectorcall_O ()
    from /lib64/libpython3.12.so.1.0
 #21 0x0000ffff434a32d8 [PAC] in PyObject_Vectorcall () from /lib64/libpython3.12.so.1.0
 #22 0x0000ffff43488b64 [PAC] in _PyEval_EvalFrameDefault ()
    from /lib64/libpython3.12.so.1.0
 #23 0x0000ffff434d8d88 [PAC] in method_vectorcall () from /lib64/libpython3.12.so.1.0
 #24 0x0000ffff435e0ef4 [PAC] in thread_run () from /lib64/libpython3.12.so.1.0
 #25 0x0000ffff43591ec0 [PAC] in pythread_wrapper () from /lib64/libpython3.12.so.1.0
 #26 0x0000ffff42bb0584 [PAC] in start_thread () from /lib64/libc.so.6
 #27 0x0000ffff42c1fd4c [PAC] in thread_start () from /lib64/libc.so.6
...

The direct cause for the coredump seems to be that cli_ui_out::do_message
is trying to write to a stream variable which does not look sound:
...
(gdb) p *stream
$8 = {_vptr.ui_file = 0x0, m_applied_style = {m_foreground = {m_simple = true, {
        m_value = 0, {m_red = 0 '\000', m_green = 0 '\000', m_blue = 0 '\000'}}},
    m_background = {m_simple = 32, {m_value = 65535, {m_red = 255 '\377',
          m_green = 255 '\377', m_blue = 0 '\000'}}},
    m_intensity = (unknown: 0x438fe710), m_reverse = 255}}
...

The string that is being printed is:
...
(gdb) p str
$9 = "Exception in thread "
...
so AFAICT this is a DAP thread running into an exception and trying to print
it.

If we look at the state of gdb's main thread, we have:
...
 #0  0x0000ffff42bac914 in __futex_abstimed_wait_cancelable64 () from /lib64/libc.so.6
 #1  0x0000ffff42bafb44 [PAC] in pthread_cond_timedwait@@GLIBC_2.17 ()
    from /lib64/libc.so.6
 #2  0x0000ffff43466e9c [PAC] in take_gil () from /lib64/libpython3.12.so.1.0
 #3  0x0000ffff43484fe0 [PAC] in PyEval_RestoreThread ()
     from /lib64/libpython3.12.so.1.0
 #4  0x0000000000ab8698 [PAC] in gdbpy_allow_threads::~gdbpy_allow_threads (
     this=0xfffffbea1cf8, __in_chrg=<optimized out>)
     at gdb/python/python-internal.h:769
 #5  0x0000000000ab2fec in execute_gdb_command (self=0x33fe35d0, args=0x34297b60,
     kw=0x34553d20) at gdb/python/python.c:681
 #6  0x0000ffff434acedc in cfunction_call () from /lib64/libpython3.12.so.1.0
 #7  0x0000ffff4347c500 [PAC] in _PyObject_MakeTpCall ()
     from /lib64/libpython3.12.so.1.0
 #8  0x0000ffff43488b64 [PAC] in _PyEval_EvalFrameDefault ()
    from /lib64/libpython3.12.so.1.0
 #9  0x0000ffff4353bce8 [PAC] in _PyObject_VectorcallTstate.lto_priv.3 ()
    from /lib64/libpython3.12.so.1.0
 #10 0x0000000000ab87fc [PAC] in gdbpy_event::operator() (this=0xffff14005900)
     at gdb/python/python.c:1061
 #11 0x0000000000ab93e8 in std::__invoke_impl<void, gdbpy_event&> (__f=...)
     at /usr/include/c++/13/bits/invoke.h:61
 #12 0x0000000000ab9204 in std::__invoke_r<void, gdbpy_event&> (__fn=...)
     at /usr/include/c++/13/bits/invoke.h:111
 #13 0x0000000000ab8e90 in std::_Function_handler<..>::_M_invoke(...) (...)
     at /usr/include/c++/13/bits/std_function.h:290
 #14 0x000000000062e0d0 in std::function<void ()>::operator()() const (
     this=0xffff14005830) at /usr/include/c++/13/bits/std_function.h:591
 #15 0x0000000000b67f14 in run_events (error=0, client_data=0x0)
     at gdb/run-on-main-thread.c:76
 #16 0x000000000157e290 in handle_file_event (file_ptr=0x33dae3a0, ready_mask=1)
     at gdbsupport/event-loop.cc:573
 #17 0x000000000157e760 in gdb_wait_for_event (block=1)
     at gdbsupport/event-loop.cc:694
 #18 0x000000000157d464 in gdb_do_one_event (mstimeout=-1)
     at gdbsupport/event-loop.cc:264
 #19 0x0000000000943a84 in start_event_loop () at gdb/main.c:401
 #20 0x0000000000943bfc in captured_command_loop () at gdb/main.c:465
 #21 0x000000000094567c in captured_main (data=0xfffffbea23e8)
     at gdb/main.c:1335
 #22 0x0000000000945700 in gdb_main (args=0xfffffbea23e8)
     at gdb/main.c:1354
 #23 0x0000000000423ab4 in main (argc=14, argv=0xfffffbea2578)
     at gdb/gdb.c:39
...

AFAIU, there's a race between the two threads on gdb_stderr:
- the DAP thread samples the gdb_stderr value, and uses it a bit later to
  print to
- the gdb main thread changes the gdb_stderr value forth and back,
  using a temporary value for string capture purposes

The non-sound stream value is caused by gdb_stderr being sampled while
pointing to a str_file object, and used once the str_file object is already
destroyed.

The error here is that the DAP thread attempts to print to gdb_stderr.

Fix this by adding a thread_wrapper that:
- catches all exceptions and logs them to dap.log, and
- while we're at it, logs when exiting
and using the thread_wrapper for each DAP thread.

Tested on aarch64-linux.

Approved-By: Tom Tromey <tom@tromey.com>
tromey pushed a commit that referenced this issue Feb 15, 2024
When running test-case gdb.dap/eof.exp, we're likely to get a coredump due to
a segfault in new_threadstate.

At the point of the core dump, the gdb main thread looks like:
...
 (gdb) bt
 #0  0x0000fffee30d2280 in __pthread_kill_implementation () from /lib64/libc.so.6
 #1  0x0000fffee3085800 [PAC] in raise () from /lib64/libc.so.6
 #2  0x00000000007b03e8 [PAC] in handle_fatal_signal (sig=11)
     at gdb/event-top.c:926
 #3  0x00000000007b0470 in handle_sigsegv (sig=11)
     at gdb/event-top.c:976
 #4  <signal handler called>
 #5  0x0000fffee3a4db14 in new_threadstate () from /lib64/libpython3.12.so.1.0
 #6  0x0000fffee3ab0548 [PAC] in PyGILState_Ensure () from /lib64/libpython3.12.so.1.0
 #7  0x0000000000a6d034 [PAC] in gdbpy_gil::gdbpy_gil (this=0xffffcb279738)
     at gdb/python/python-internal.h:787
 #8  0x0000000000ab87ac in gdbpy_event::~gdbpy_event (this=0xfffea8001ee0,
     __in_chrg=<optimized out>) at gdb/python/python.c:1051
 #9  0x0000000000ab9460 in std::_Function_base::_Base_manager<...>::_M_destroy
     (__victim=...) at /usr/include/c++/13/bits/std_function.h:175
 #10 0x0000000000ab92dc in std::_Function_base::_Base_manager<...>::_M_manager
     (__dest=..., __source=..., __op=std::__destroy_functor)
     at /usr/include/c++/13/bits/std_function.h:203
 #11 0x0000000000ab8f14 in std::_Function_handler<...>::_M_manager(...) (...)
     at /usr/include/c++/13/bits/std_function.h:282
 #12 0x000000000042dd9c in std::_Function_base::~_Function_base (this=0xfffea8001c10,
     __in_chrg=<optimized out>) at /usr/include/c++/13/bits/std_function.h:244
 #13 0x000000000042e654 in std::function<void ()>::~function() (this=0xfffea8001c10,
     __in_chrg=<optimized out>) at /usr/include/c++/13/bits/std_function.h:334
 #14 0x0000000000b68e60 in std::_Destroy<std::function<void ()> >(...) (...)
     at /usr/include/c++/13/bits/stl_construct.h:151
 #15 0x0000000000b68cd0 in std::_Destroy_aux<false>::__destroy<...>(...) (...)
     at /usr/include/c++/13/bits/stl_construct.h:163
 #16 0x0000000000b689d8 in std::_Destroy<...>(...) (...)
     at /usr/include/c++/13/bits/stl_construct.h:196
 #17 0x0000000000b68414 in std::_Destroy<...>(...) (...)
     at /usr/include/c++/13/bits/alloc_traits.h:948
 #18 std::vector<...>::~vector() (this=0x2a183c8 <runnables>)
     at /usr/include/c++/13/bits/stl_vector.h:732
 #19 0x0000fffee3088370 in __run_exit_handlers () from /lib64/libc.so.6
 #20 0x0000fffee3088450 [PAC] in exit () from /lib64/libc.so.6
 #21 0x0000000000c95600 [PAC] in quit_force (exit_arg=0x0, from_tty=0)
     at gdb/top.c:1822
 #22 0x0000000000609140 in quit_command (args=0x0, from_tty=0)
     at gdb/cli/cli-cmds.c:508
 #23 0x0000000000c926a4 in quit_cover () at gdb/top.c:300
 #24 0x00000000007b09d4 in async_disconnect (arg=0x0)
     at gdb/event-top.c:1230
 #25 0x0000000000548acc in invoke_async_signal_handlers ()
     at gdb/async-event.c:234
 #26 0x000000000157d2d4 in gdb_do_one_event (mstimeout=-1)
     at gdbsupport/event-loop.cc:199
 #27 0x0000000000943a84 in start_event_loop () at gdb/main.c:401
 #28 0x0000000000943bfc in captured_command_loop () at gdb/main.c:465
 #29 0x000000000094567c in captured_main (data=0xffffcb279d08)
     at gdb/main.c:1335
 #30 0x0000000000945700 in gdb_main (args=0xffffcb279d08)
     at gdb/main.c:1354
 #31 0x0000000000423ab4 in main (argc=14, argv=0xffffcb279e98)
     at gdb/gdb.c:39
...

The direct cause of the segfault is calling PyGILState_Ensure after
calling Py_Finalize.

AFAICT the problem is a race between the gdb main thread and DAP's JSON writer
thread.

On one side, we have the following events:
- DAP's JSON reader thread reads an EOF, and lets DAP's main thread known
  by writing None into read_queue
- DAP's main thread lets DAP's JSON writer thread known by writing None into
  write_queue
- DAP's JSON writer thread sees the None in its queue, and calls
  send_gdb("quit")
- a corresponding gdbpy_event is deposited in the runnables vector, to be
  run by the gdb main thread

On the other side, we have the following events:
- the gdb main thread receives a SIGHUP
- the corresponding handler calls quit_force, which calls do_final_cleanups
- one of the final cleanups is finalize_python, which calls Py_Finalize
- quit_force calls exit, which triggers the exit handlers
- one of the exit handlers is the destructor of the runnables vector
- destruction of the vector triggers destruction of the remaining element
- the remaining element is a gdbpy_event, and the destructor (indirectly)
  calls PyGILState_Ensure

It's good to note that both events (EOF and SIGHUP) are caused by this line in
the test-case:
...
catch "close -i $gdb_spawn_id"
...
where "expect close" closes the stdin and stdout file descriptors, which
causes the SIGHUP to be send.

So, for the system I'm running this on, the send_gdb("quit") is actually not
needed.

I'm not sure if we support any systems where it's actually needed.

Fix this by removing the send_gdb("quit").

Tested on aarch64-linux.

PR dap/31306
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31306
tromey pushed a commit that referenced this issue Feb 20, 2024
When building gdb with -O0 -fsanitize=address, and running test-case
gdb.ada/uninitialized_vars.exp, I run into:
...
(gdb) info locals
a = 0
z = (a => 1, b => false, c => 2.0)
=================================================================
==66372==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000097f58 at pc 0xffff52c0da1c bp 0xffffc90a1d40 sp 0xffffc90a1d80
READ of size 4 at 0x602000097f58 thread T0
    #0 0xffff52c0da18 in memmove (/lib64/libasan.so.8+0x6da18)
    #1 0xbcab24 in unsigned char* std::__copy_move_backward<false, true, std::random_access_iterator_tag>::__copy_move_b<unsigned char const, unsigned char>(unsigned char const*, unsigned char const*, unsigned char*) /usr/include/c++/13/bits/stl_algobase.h:748
    #2 0xbc9bf4 in unsigned char* std::__copy_move_backward_a2<false, unsigned char const*, unsigned char*>(unsigned char const*, unsigned char const*, unsigned char*) /usr/include/c++/13/bits/stl_algobase.h:769
    #3 0xbc898c in unsigned char* std::__copy_move_backward_a1<false, unsigned char const*, unsigned char*>(unsigned char const*, unsigned char const*, unsigned char*) /usr/include/c++/13/bits/stl_algobase.h:778
    #4 0xbc715c in unsigned char* std::__copy_move_backward_a<false, unsigned char const*, unsigned char*>(unsigned char const*, unsigned char const*, unsigned char*) /usr/include/c++/13/bits/stl_algobase.h:807
    #5 0xbc4e6c in unsigned char* std::copy_backward<unsigned char const*, unsigned char*>(unsigned char const*, unsigned char const*, unsigned char*) /usr/include/c++/13/bits/stl_algobase.h:867
    #6 0xbc2934 in void gdb::copy<unsigned char const, unsigned char>(gdb::array_view<unsigned char const>, gdb::array_view<unsigned char>) gdb/../gdbsupport/array-view.h:223
    #7 0x20e0100 in value::contents_copy_raw(value*, long, long, long) gdb/value.c:1239
    #8 0x20e9830 in value::primitive_field(long, int, type*) gdb/value.c:3078
    #9 0x20e98f8 in value_field(value*, int) gdb/value.c:3095
    #10 0xcafd64 in print_field_values gdb/ada-valprint.c:658
    #11 0xcb0fa0 in ada_val_print_struct_union gdb/ada-valprint.c:857
    #12 0xcb1bb4 in ada_value_print_inner(value*, ui_file*, int, value_print_options const*) gdb/ada-valprint.c:1042
    #13 0xc66e04 in ada_language::value_print_inner(value*, ui_file*, int, value_print_options const*) const (/home/vries/gdb/build/gdb/gdb+0xc66e04)
    #14 0x20ca1e8 in common_val_print(value*, ui_file*, int, value_print_options const*, language_defn const*) gdb/valprint.c:1092
    #15 0x20caabc in common_val_print_checked(value*, ui_file*, int, value_print_options const*, language_defn const*) gdb/valprint.c:1184
    #16 0x196c524 in print_variable_and_value(char const*, symbol*, frame_info_ptr, ui_file*, int) gdb/printcmd.c:2355
    #17 0x1d99ca0 in print_variable_and_value_data::operator()(char const*, symbol*) gdb/stack.c:2308
    #18 0x1dabca0 in gdb::function_view<void (char const*, symbol*)>::bind<print_variable_and_value_data>(print_variable_and_value_data&)::{lambda(gdb::fv_detail::erased_callable, char const*, symbol*)#1}::operator()(gdb::fv_detail::erased_callable, char const*, symbol*) const gdb/../gdbsupport/function-view.h:305
    #19 0x1dabd14 in gdb::function_view<void (char const*, symbol*)>::bind<print_variable_and_value_data>(print_variable_and_value_data&)::{lambda(gdb::fv_detail::erased_callable, char const*, symbol*)#1}::_FUN(gdb::fv_detail::erased_callable, char const*, symbol*) gdb/../gdbsupport/function-view.h:299
    #20 0x1dab34c in gdb::function_view<void (char const*, symbol*)>::operator()(char const*, symbol*) const gdb/../gdbsupport/function-view.h:289
    #21 0x1d9963c in iterate_over_block_locals gdb/stack.c:2240
    #22 0x1d99790 in iterate_over_block_local_vars(block const*, gdb::function_view<void (char const*, symbol*)>) gdb/stack.c:2259
    #23 0x1d9a598 in print_frame_local_vars gdb/stack.c:2380
    #24 0x1d9afac in info_locals_command(char const*, int) gdb/stack.c:2458
    #25 0xfd7b30 in do_simple_func gdb/cli/cli-decode.c:95
    #26 0xfe5a2c in cmd_func(cmd_list_element*, char const*, int) gdb/cli/cli-decode.c:2735
    #27 0x1f03790 in execute_command(char const*, int) gdb/top.c:575
    #28 0x1384080 in command_handler(char const*) gdb/event-top.c:566
    #29 0x1384e2c in command_line_handler(std::unique_ptr<char, gdb::xfree_deleter<char> >&&) gdb/event-top.c:802
    #30 0x1f731e4 in tui_command_line_handler gdb/tui/tui-interp.c:104
    #31 0x1382a58 in gdb_rl_callback_handler gdb/event-top.c:259
    #32 0x21dbb80 in rl_callback_read_char readline/readline/callback.c:290
    #33 0x1382510 in gdb_rl_callback_read_char_wrapper_noexcept gdb/event-top.c:195
    #34 0x138277c in gdb_rl_callback_read_char_wrapper gdb/event-top.c:234
    #35 0x1fe9b40 in stdin_event_handler gdb/ui.c:155
    #36 0x35ff1bc in handle_file_event gdbsupport/event-loop.cc:573
    #37 0x35ff9d8 in gdb_wait_for_event gdbsupport/event-loop.cc:694
    #38 0x35fd284 in gdb_do_one_event(int) gdbsupport/event-loop.cc:264
    #39 0x1768080 in start_event_loop gdb/main.c:408
    #40 0x17684c4 in captured_command_loop gdb/main.c:472
    #41 0x176cfc8 in captured_main gdb/main.c:1342
    #42 0x176d088 in gdb_main(captured_main_args*) gdb/main.c:1361
    #43 0xb73edc in main gdb/gdb.c:39
    #44 0xffff519b09d8 in __libc_start_call_main (/lib64/libc.so.6+0x309d8)
    #45 0xffff519b0aac in __libc_start_main@@GLIBC_2.34 (/lib64/libc.so.6+0x30aac)
    #46 0xb73c2c in _start (/home/vries/gdb/build/gdb/gdb+0xb73c2c)

0x602000097f58 is located 0 bytes after 8-byte region [0x602000097f50,0x602000097f58)
allocated by thread T0 here:
    #0 0xffff52c65218 in calloc (/lib64/libasan.so.8+0xc5218)
    #1 0xcbc278 in xcalloc gdb/alloc.c:97
    #2 0x35f21e8 in xzalloc(unsigned long) gdbsupport/common-utils.cc:29
    #3 0x20de270 in value::allocate_contents(bool) gdb/value.c:937
    #4 0x20edc08 in value::fetch_lazy() gdb/value.c:4033
    #5 0x20dadc0 in value::entirely_covered_by_range_vector(std::vector<range, std::allocator<range> > const&) gdb/value.c:229
    #6 0xcb2298 in value::entirely_optimized_out() gdb/value.h:560
    #7 0x20ca6fc in value_check_printable gdb/valprint.c:1133
    #8 0x20caa8c in common_val_print_checked(value*, ui_file*, int, value_print_options const*, language_defn const*) gdb/valprint.c:1182
    #9 0x196c524 in print_variable_and_value(char const*, symbol*, frame_info_ptr, ui_file*, int) gdb/printcmd.c:2355
    #10 0x1d99ca0 in print_variable_and_value_data::operator()(char const*, symbol*) gdb/stack.c:2308
    #11 0x1dabca0 in gdb::function_view<void (char const*, symbol*)>::bind<print_variable_and_value_data>(print_variable_and_value_data&)::{lambda(gdb::fv_detail::erased_callable, char const*, symbol*)#1}::operator()(gdb::fv_detail::erased_callable, char const*, symbol*) const gdb/../gdbsupport/function-view.h:305
    #12 0x1dabd14 in gdb::function_view<void (char const*, symbol*)>::bind<print_variable_and_value_data>(print_variable_and_value_data&)::{lambda(gdb::fv_detail::erased_callable, char const*, symbol*)#1}::_FUN(gdb::fv_detail::erased_callable, char const*, symbol*) gdb/../gdbsupport/function-view.h:299
    #13 0x1dab34c in gdb::function_view<void (char const*, symbol*)>::operator()(char const*, symbol*) const gdb/../gdbsupport/function-view.h:289
    #14 0x1d9963c in iterate_over_block_locals gdb/stack.c:2240
    #15 0x1d99790 in iterate_over_block_local_vars(block const*, gdb::function_view<void (char const*, symbol*)>) gdb/stack.c:2259
    #16 0x1d9a598 in print_frame_local_vars gdb/stack.c:2380
    #17 0x1d9afac in info_locals_command(char const*, int) gdb/stack.c:2458
    #18 0xfd7b30 in do_simple_func gdb/cli/cli-decode.c:95
    #19 0xfe5a2c in cmd_func(cmd_list_element*, char const*, int) gdb/cli/cli-decode.c:2735
    #20 0x1f03790 in execute_command(char const*, int) gdb/top.c:575
    #21 0x1384080 in command_handler(char const*) gdb/event-top.c:566
    #22 0x1384e2c in command_line_handler(std::unique_ptr<char, gdb::xfree_deleter<char> >&&) gdb/event-top.c:802
    #23 0x1f731e4 in tui_command_line_handler gdb/tui/tui-interp.c:104
    #24 0x1382a58 in gdb_rl_callback_handler gdb/event-top.c:259
    #25 0x21dbb80 in rl_callback_read_char readline/readline/callback.c:290
    #26 0x1382510 in gdb_rl_callback_read_char_wrapper_noexcept gdb/event-top.c:195
    #27 0x138277c in gdb_rl_callback_read_char_wrapper gdb/event-top.c:234
    #28 0x1fe9b40 in stdin_event_handler gdb/ui.c:155
    #29 0x35ff1bc in handle_file_event gdbsupport/event-loop.cc:573

SUMMARY: AddressSanitizer: heap-buffer-overflow (/lib64/libasan.so.8+0x6da18) in memmove
...

The error happens when trying to print either variable y or y2:
...
   type Variable_Record (A : Boolean := True) is record
      case A is
         when True =>
            B : Integer;
         when False =>
            C : Float;
            D : Integer;
      end case;
   end record;
   Y  : Variable_Record := (A => True, B => 1);
   Y2 : Variable_Record := (A => False, C => 1.0, D => 2);
...
when the variables are uninitialized.

The error happens only when printing the entire variable:
...
(gdb) p y.a
$2 = 216
(gdb) p y.b
There is no member named b.
(gdb) p y.c
$3 = 9.18340949e-41
(gdb) p y.d
$4 = 1
(gdb) p y
<AddressSanitizer: heap-buffer-overflow>
...

The error happens as follows:
- field a functions as discriminant, choosing either the b, or c+d variant.
- when y.a happens to be set to 216, as above, gdb interprets this as the
  variable having the c+d variant (which is why trying to print y.b fails).
- when printing y, gdb allocates a value, copies the bytes into it from the
  target, and then prints the value.
- gdb allocates the value using the type size, which is 8.  It's 8 because
  that's what the DW_AT_byte_size indicates.  Note that for valid values of a,
  it gives correct results: if a is 0 (c+d variant), size is 12, if a is 1
  (b variant), size is 8.
- gdb tries to print field d, which is at an 8 byte offset, and that results
  in a out-of-bounds access for the allocated 8-byte value.

Fix this by handling this case in value::contents_copy_raw, such that we have:
...
(gdb) p y
$1 = (a => 24, c => 9.18340949e-41,
      d => <error reading variable: access outside bounds of object>)
...

An alternative (additional) fix could be this: in compute_variant_fields_inner
gdb reads the discriminant y.a to decide which variant is active.  It would be
nice to detect that the value (y.a == 24) is not a valid Boolean, and give up
on choosing a variant altoghether.  However, the situation regarding the
internal type CODE_TYPE_BOOL is currently ambiguous (see PR31282) and it's not
possible to reliably decide what valid values are.

The test-case source file gdb.ada/uninitialized-variable-record/parse.adb is
a reduced version of gdb.ada/uninitialized_vars/parse.adb, so it copies the
copyright years.

Note that the test-case needs gcc-12 or newer, it's unsupported for older gcc
versions. [ So, it would be nice to rewrite it into a dwarf assembly
test-case. ]

The test-case loops over all languages.  This is inherited from an earlier
attempt to fix this, which had language-specific fixes (in print_field_values,
cp_print_value_fields, pascal_object_print_value_fields and
f_language::value_print_inner).  I've left this in, but I suppose it's not
strictly necessary anymore.

Tested on x86_64-linux.

PR exp/31258
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31258
tromey pushed a commit that referenced this issue Feb 20, 2024
From the Python API, we can execute GDB commands via gdb.execute.  If
the command gives an exception, however, we need to recover the GDB
prompt and enable stdin, because the exception does not reach
top-level GDB or normal_stop.  This was done in commit

  commit 1ba1ac8
  Author: Andrew Burgess <andrew.burgess@embecosm.com>
  Date:   Tue Nov 19 11:17:20 2019 +0000

    gdb: Enable stdin on exception in execute_gdb_command

with the following code:

  catch (const gdb_exception &except)
    {
      /* If an exception occurred then we won't hit normal_stop (), or have
         an exception reach the top level of the event loop, which are the
         two usual places in which stdin would be re-enabled. So, before we
         convert the exception and continue back in Python, we should
         re-enable stdin here.  */
      async_enable_stdin ();
      GDB_PY_HANDLE_EXCEPTION (except);
    }

In this patch, we explain what happens when we run a GDB command in
the context of a synchronous command, e.g.  via Python observer
notifications.

As an example, suppose we have the following objfile event listener,
specified in a file named file.py:

~~~
import gdb

class MyListener:
    def __init__(self):
        gdb.events.new_objfile.connect(self.handle_new_objfile_event)
        self.processed_objfile = False

    def handle_new_objfile_event(self, event):
        if self.processed_objfile:
            return

        print("loading " + event.new_objfile.filename)
        self.processed_objfile = True
        gdb.execute('add-inferior -no-connection')
        gdb.execute('inferior 2')
        gdb.execute('target remote | gdbserver - /tmp/a.out')
        gdb.execute('inferior 1')

the_listener = MyListener()
~~~

Using this Python file, we see the behavior below:

  $ gdb -q -ex "source file.py" -ex "run" --args a.out
  Reading symbols from a.out...
  Starting program: /tmp/a.out
  loading /lib64/ld-linux-x86-64.so.2
  [New inferior 2]
  Added inferior 2
  [Switching to inferior 2 [<null>] (<noexec>)]
  stdin/stdout redirected
  Process /tmp/a.out created; pid = 3075406
  Remote debugging using stdio
  Reading /tmp/a.out from remote target...
  ...
  [Switching to inferior 1 [process 3075400] (/tmp/a.out)]
  [Switching to thread 1.1 (process 3075400)]
  #0  0x00007ffff7fe3290 in ?? () from /lib64/ld-linux-x86-64.so.2
  (gdb) [Thread debugging using libthread_db enabled]
  Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
  [Inferior 1 (process 3075400) exited normally]

Note how the GDB prompt comes in-between the debugger output.  We have this
obscure behavior, because the executed command, "target remote", triggers
an invocation of `normal_stop` that enables stdin.  After that, however,
the Python notification context completes and GDB continues with its normal
flow of executing the 'run' command.  This can be seen in the call stack
below:

  (top-gdb) bt
  #0  async_enable_stdin () at src/gdb/event-top.c:523
  #1  0x00005555561c3acd in normal_stop () at src/gdb/infrun.c:9432
  #2  0x00005555561b328e in start_remote (from_tty=0) at src/gdb/infrun.c:3801
  #3  0x0000555556441224 in remote_target::start_remote_1 (this=0x5555587882e0, from_tty=0, extended_p=0) at src/gdb/remote.c:5225
  #4  0x000055555644166c in remote_target::start_remote (this=0x5555587882e0, from_tty=0, extended_p=0) at src/gdb/remote.c:5316
  #5  0x00005555564430cf in remote_target::open_1 (name=0x55555878525e "| gdbserver - /tmp/a.out", from_tty=0, extended_p=0) at src/gdb/remote.c:6175
  #6  0x0000555556441707 in remote_target::open (name=0x55555878525e "| gdbserver - /tmp/a.out", from_tty=0) at src/gdb/remote.c:5338
  #7  0x00005555565ea63f in open_target (args=0x55555878525e "| gdbserver - /tmp/a.out", from_tty=0, command=0x555558589280)  at src/gdb/target.c:824
  #8  0x0000555555f0d89a in cmd_func (cmd=0x555558589280, args=0x55555878525e "| gdbserver - /tmp/a.out", from_tty=0) at src/gdb/cli/cli-decode.c:2735
  #9  0x000055555661fb42 in execute_command (p=0x55555878529e "t", from_tty=0) at src/gdb/top.c:575
  #10 0x0000555555f1a506 in execute_control_command_1 (cmd=0x555558756f00, from_tty=0) at src/gdb/cli/cli-script.c:529
  #11 0x0000555555f1abea in execute_control_command (cmd=0x555558756f00, from_tty=0) at src/gdb/cli/cli-script.c:701
  #12 0x0000555555f19fc7 in execute_control_commands (cmdlines=0x555558756f00, from_tty=0) at src/gdb/cli/cli-script.c:411
  #13 0x0000555556400d91 in execute_gdb_command (self=0x7ffff43b5d00, args=0x7ffff440ab60, kw=0x0) at src/gdb/python/python.c:700
  #14 0x00007ffff7a96023 in ?? () from /lib/x86_64-linux-gnu/libpython3.10.so.1.0
  #15 0x00007ffff7a4dadc in _PyObject_MakeTpCall () from /lib/x86_64-linux-gnu/libpython3.10.so.1.0
  #16 0x00007ffff79e9a1c in _PyEval_EvalFrameDefault () from /lib/x86_64-linux-gnu/libpython3.10.so.1.0
  #17 0x00007ffff7b303af in ?? () from /lib/x86_64-linux-gnu/libpython3.10.so.1.0
  #18 0x00007ffff7a50358 in ?? () from /lib/x86_64-linux-gnu/libpython3.10.so.1.0
  #19 0x00007ffff7a4f3f4 in ?? () from /lib/x86_64-linux-gnu/libpython3.10.so.1.0
  #20 0x00007ffff7a4f883 in PyObject_CallFunctionObjArgs () from /lib/x86_64-linux-gnu/libpython3.10.so.1.0
  #21 0x00005555563a9758 in evpy_emit_event (event=0x7ffff42b5430, registry=0x7ffff42b4690) at src/gdb/python/py-event.c:104
  #22 0x00005555563cb874 in emit_new_objfile_event (objfile=0x555558761700) at src/gdb/python/py-newobjfileevent.c:52
  #23 0x00005555563b53bc in python_new_objfile (objfile=0x555558761700) at src/gdb/python/py-inferior.c:195
  #24 0x0000555555d6dff0 in std::__invoke_impl<void, void (*&)(objfile*), objfile*> (__f=@0x5555585b5860: 0x5555563b5360 <python_new_objfile(objfile*)>) at /usr/include/c++/11/bits/invoke.h:61
  #25 0x0000555555d6be18 in std::__invoke_r<void, void (*&)(objfile*), objfile*> (__fn=@0x5555585b5860: 0x5555563b5360 <python_new_objfile(objfile*)>) at /usr/include/c++/11/bits/invoke.h:111
  #26 0x0000555555d69661 in std::_Function_handler<void (objfile*), void (*)(objfile*)>::_M_invoke(std::_Any_data const&, objfile*&&) (__functor=..., __args#0=@0x7fffffffd080: 0x555558761700) at /usr/include/c++/11/bits/std_function.h:290
  #27 0x0000555556314caf in std::function<void (objfile*)>::operator()(objfile*) const (this=0x5555585b5860, __args#0=0x555558761700) at /usr/include/c++/11/bits/std_function.h:590
  #28 0x000055555631444e in gdb::observers::observable<objfile*>::notify (this=0x55555836eea0 <gdb::observers::new_objfile>, args#0=0x555558761700) at src/gdb/../gdbsupport/observable.h:166
  #29 0x0000555556599b3f in symbol_file_add_with_addrs (abfd=..., name=0x55555875d310 "/lib64/ld-linux-x86-64.so.2", add_flags=..., addrs=0x7fffffffd2f0, flags=..., parent=0x0) at src/gdb/symfile.c:1125
  #30 0x0000555556599ca4 in symbol_file_add_from_bfd (abfd=..., name=0x55555875d310 "/lib64/ld-linux-x86-64.so.2", add_flags=..., addrs=0x7fffffffd2f0, flags=..., parent=0x0) at src/gdb/symfile.c:1160
  #31 0x0000555556546371 in solib_read_symbols (so=..., flags=...) at src/gdb/solib.c:692
  #32 0x0000555556546f0f in solib_add (pattern=0x0, from_tty=0, readsyms=1) at src/gdb/solib.c:1015
  #33 0x0000555556539891 in enable_break (info=0x55555874e180, from_tty=0) at src/gdb/solib-svr4.c:2416
  #34 0x000055555653b305 in svr4_solib_create_inferior_hook (from_tty=0) at src/gdb/solib-svr4.c:3058
  #35 0x0000555556547cee in solib_create_inferior_hook (from_tty=0) at src/gdb/solib.c:1217
  #36 0x0000555556196f6a in post_create_inferior (from_tty=0) at src/gdb/infcmd.c:275
  #37 0x0000555556197670 in run_command_1 (args=0x0, from_tty=1, run_how=RUN_NORMAL) at src/gdb/infcmd.c:486
  #38 0x000055555619783f in run_command (args=0x0, from_tty=1) at src/gdb/infcmd.c:512
  #39 0x0000555555f0798d in do_simple_func (args=0x0, from_tty=1, c=0x555558567510) at src/gdb/cli/cli-decode.c:95
  #40 0x0000555555f0d89a in cmd_func (cmd=0x555558567510, args=0x0, from_tty=1) at src/gdb/cli/cli-decode.c:2735
  #41 0x000055555661fb42 in execute_command (p=0x7fffffffe2c4 "", from_tty=1) at src/gdb/top.c:575
  #42 0x000055555626303b in catch_command_errors (command=0x55555661f4ab <execute_command(char const*, int)>, arg=0x7fffffffe2c1 "run", from_tty=1, do_bp_actions=true) at src/gdb/main.c:513
  #43 0x000055555626328a in execute_cmdargs (cmdarg_vec=0x7fffffffdaf0, file_type=CMDARG_FILE, cmd_type=CMDARG_COMMAND, ret=0x7fffffffda3c) at src/gdb/main.c:612
  #44 0x0000555556264849 in captured_main_1 (context=0x7fffffffdd40) at src/gdb/main.c:1293
  #45 0x0000555556264a7f in captured_main (data=0x7fffffffdd40) at src/gdb/main.c:1314
  #46 0x0000555556264b2e in gdb_main (args=0x7fffffffdd40) at src/gdb/main.c:1343
  #47 0x0000555555ceccab in main (argc=9, argv=0x7fffffffde78) at src/gdb/gdb.c:39
  (top-gdb)

The use of the "target remote" command here is just an example.  In
principle, we would reproduce the problem with any command that
triggers an invocation of `normal_stop`.

To omit enabling the stdin in `normal_stop`, we would have to check the
context we are in.  Since we cannot do that, we add a new field to
`struct ui` to track whether the prompt was already blocked, and set
the tracker flag in the Python context before executing a GDB command.

After applying this patch, the output becomes

  ...
  Reading symbols from a.out...
  Starting program: /tmp/a.out
  loading /lib64/ld-linux-x86-64.so.2
  [New inferior 2]
  Added inferior 2
  [Switching to inferior 2 [<null>] (<noexec>)]
  stdin/stdout redirected
  Process /tmp/a.out created; pid = 3032261
  Remote debugging using stdio
  Reading /tmp/a.out from remote target...
  ...
  [Switching to inferior 1 [process 3032255] (/tmp/a.out)]
  [Switching to thread 1.1 (process 3032255)]
  #0  0x00007ffff7fe3290 in ?? () from /lib64/ld-linux-x86-64.so.2
  [Thread debugging using libthread_db enabled]
  Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
  [Inferior 1 (process 3032255) exited normally]
  (gdb)

Let's now consider a secondary scenario, where the command executed from
the Python raises an error.  As an example, suppose we have the Python
file below:

    def handle_new_objfile_event(self, event):
        ...
        print("loading " + event.new_objfile.filename)
        self.processed_objfile = True
        gdb.execute('print a')

The executed command, "print a", gives an error because "a" is not
defined.  Without this patch, we see the behavior below, where the
prompt is again placed incorrectly:

  ...
  Reading symbols from /tmp/a.out...
  Starting program: /tmp/a.out
  loading /lib64/ld-linux-x86-64.so.2
  Python Exception <class 'gdb.error'>: No symbol "a" in current context.
  (gdb) [Inferior 1 (process 3980401) exited normally]

This time, `async_enable_stdin` is called from the 'catch' block in
`execute_gdb_command`:

  (top-gdb) bt
  #0  async_enable_stdin () at src/gdb/event-top.c:523
  #1  0x0000555556400f0a in execute_gdb_command (self=0x7ffff43b5d00, args=0x7ffff440ab60, kw=0x0) at src/gdb/python/python.c:713
  #2  0x00007ffff7a96023 in ?? () from /lib/x86_64-linux-gnu/libpython3.10.so.1.0
  #3  0x00007ffff7a4dadc in _PyObject_MakeTpCall () from /lib/x86_64-linux-gnu/libpython3.10.so.1.0
  #4  0x00007ffff79e9a1c in _PyEval_EvalFrameDefault () from /lib/x86_64-linux-gnu/libpython3.10.so.1.0
  #5  0x00007ffff7b303af in ?? () from /lib/x86_64-linux-gnu/libpython3.10.so.1.0
  #6  0x00007ffff7a50358 in ?? () from /lib/x86_64-linux-gnu/libpython3.10.so.1.0
  #7  0x00007ffff7a4f3f4 in ?? () from /lib/x86_64-linux-gnu/libpython3.10.so.1.0
  #8  0x00007ffff7a4f883 in PyObject_CallFunctionObjArgs () from /lib/x86_64-linux-gnu/libpython3.10.so.1.0
  #9  0x00005555563a9758 in evpy_emit_event (event=0x7ffff42b5430, registry=0x7ffff42b4690) at src/gdb/python/py-event.c:104
  #10 0x00005555563cb874 in emit_new_objfile_event (objfile=0x555558761410) at src/gdb/python/py-newobjfileevent.c:52
  #11 0x00005555563b53bc in python_new_objfile (objfile=0x555558761410) at src/gdb/python/py-inferior.c:195
  #12 0x0000555555d6dff0 in std::__invoke_impl<void, void (*&)(objfile*), objfile*> (__f=@0x5555585b5860: 0x5555563b5360 <python_new_objfile(objfile*)>) at /usr/include/c++/11/bits/invoke.h:61
  #13 0x0000555555d6be18 in std::__invoke_r<void, void (*&)(objfile*), objfile*> (__fn=@0x5555585b5860: 0x5555563b5360 <python_new_objfile(objfile*)>) at /usr/include/c++/11/bits/invoke.h:111
  #14 0x0000555555d69661 in std::_Function_handler<void (objfile*), void (*)(objfile*)>::_M_invoke(std::_Any_data const&, objfile*&&) (__functor=..., __args#0=@0x7fffffffd080: 0x555558761410) at /usr/include/c++/11/bits/std_function.h:290
  #15 0x0000555556314caf in std::function<void (objfile*)>::operator()(objfile*) const (this=0x5555585b5860, __args#0=0x555558761410) at /usr/include/c++/11/bits/std_function.h:590
  #16 0x000055555631444e in gdb::observers::observable<objfile*>::notify (this=0x55555836eea0 <gdb::observers::new_objfile>, args#0=0x555558761410) at src/gdb/../gdbsupport/observable.h:166
  #17 0x0000555556599b3f in symbol_file_add_with_addrs (abfd=..., name=0x55555875d020 "/lib64/ld-linux-x86-64.so.2", add_flags=..., addrs=0x7fffffffd2f0, flags=..., parent=0x0) at src/gdb/symfile.c:1125
  #18 0x0000555556599ca4 in symbol_file_add_from_bfd (abfd=..., name=0x55555875d020 "/lib64/ld-linux-x86-64.so.2", add_flags=..., addrs=0x7fffffffd2f0, flags=..., parent=0x0) at src/gdb/symfile.c:1160
  #19 0x0000555556546371 in solib_read_symbols (so=..., flags=...) at src/gdb/solib.c:692
  #20 0x0000555556546f0f in solib_add (pattern=0x0, from_tty=0, readsyms=1) at src/gdb/solib.c:1015
  #21 0x0000555556539891 in enable_break (info=0x55555874a670, from_tty=0) at src/gdb/solib-svr4.c:2416
  #22 0x000055555653b305 in svr4_solib_create_inferior_hook (from_tty=0) at src/gdb/solib-svr4.c:3058
  #23 0x0000555556547cee in solib_create_inferior_hook (from_tty=0) at src/gdb/solib.c:1217
  #24 0x0000555556196f6a in post_create_inferior (from_tty=0) at src/gdb/infcmd.c:275
  #25 0x0000555556197670 in run_command_1 (args=0x0, from_tty=1, run_how=RUN_NORMAL) at src/gdb/infcmd.c:486
  #26 0x000055555619783f in run_command (args=0x0, from_tty=1) at src/gdb/infcmd.c:512
  #27 0x0000555555f0798d in do_simple_func (args=0x0, from_tty=1, c=0x555558567510) at src/gdb/cli/cli-decode.c:95
  #28 0x0000555555f0d89a in cmd_func (cmd=0x555558567510, args=0x0, from_tty=1) at src/gdb/cli/cli-decode.c:2735
  #29 0x000055555661fb42 in execute_command (p=0x7fffffffe2c4 "", from_tty=1) at src/gdb/top.c:575
  #30 0x000055555626303b in catch_command_errors (command=0x55555661f4ab <execute_command(char const*, int)>, arg=0x7fffffffe2c1 "run", from_tty=1, do_bp_actions=true) at src/gdb/main.c:513
  #31 0x000055555626328a in execute_cmdargs (cmdarg_vec=0x7fffffffdaf0, file_type=CMDARG_FILE, cmd_type=CMDARG_COMMAND, ret=0x7fffffffda3c) at src/gdb/main.c:612
  #32 0x0000555556264849 in captured_main_1 (context=0x7fffffffdd40) at src/gdb/main.c:1293
  #33 0x0000555556264a7f in captured_main (data=0x7fffffffdd40) at src/gdb/main.c:1314
  #34 0x0000555556264b2e in gdb_main (args=0x7fffffffdd40) at src/gdb/main.c:1343
  #35 0x0000555555ceccab in main (argc=9, argv=0x7fffffffde78) at src/gdb/gdb.c:39
  (top-gdb)

Again, after we enable stdin, GDB continues with its normal flow
of the 'run' command and receives the inferior's exit event, where
it would have enabled stdin, if we had not done it prematurely.

  (top-gdb) bt
  #0  async_enable_stdin () at src/gdb/event-top.c:523
  #1  0x00005555561c3acd in normal_stop () at src/gdb/infrun.c:9432
  #2  0x00005555561b5bf1 in fetch_inferior_event () at src/gdb/infrun.c:4700
  #3  0x000055555618d6a7 in inferior_event_handler (event_type=INF_REG_EVENT) at src/gdb/inf-loop.c:42
  #4  0x000055555620ecdb in handle_target_event (error=0, client_data=0x0) at src/gdb/linux-nat.c:4316
  #5  0x0000555556f33035 in handle_file_event (file_ptr=0x5555587024e0, ready_mask=1) at src/gdbsupport/event-loop.cc:573
  #6  0x0000555556f3362f in gdb_wait_for_event (block=0) at src/gdbsupport/event-loop.cc:694
  #7  0x0000555556f322cd in gdb_do_one_event (mstimeout=-1) at src/gdbsupport/event-loop.cc:217
  #8  0x0000555556262df8 in start_event_loop () at src/gdb/main.c:407
  #9  0x0000555556262f85 in captured_command_loop () at src/gdb/main.c:471
  #10 0x0000555556264a84 in captured_main (data=0x7fffffffdd40) at src/gdb/main.c:1324
  #11 0x0000555556264b2e in gdb_main (args=0x7fffffffdd40) at src/gdb/main.c:1343
  #12 0x0000555555ceccab in main (argc=9, argv=0x7fffffffde78) at src/gdb/gdb.c:39
  (top-gdb)

The solution implemented by this patch addresses the problem.  After
applying the patch, the output becomes

  $ gdb -q -ex "source file.py" -ex "run" --args a.out
  Reading symbols from /tmp/a.out...
  Starting program: /tmp/a.out
  loading /lib64/ld-linux-x86-64.so.2
  Python Exception <class 'gdb.error'>: No symbol "a" in current context.
  [Inferior 1 (process 3984511) exited normally]
  (gdb)

Regression-tested on X86_64 Linux using the default board file (i.e.  unix).

Co-Authored-By: Oguzhan Karakaya <oguzhan.karakaya@intel.com>
Reviewed-By: Guinevere Larsen <blarsen@redhat.com>
Approved-By: Tom Tromey <tom@tromey.com>
tromey pushed a commit that referenced this issue Apr 30, 2024
When running test-case gdb.server/connect-with-no-symbol-file.exp on
aarch64-linux (specifically, an opensuse leap 15.5 container on a
fedora asahi 39 system), I run into:
...
(gdb) detach^M
Detaching from program: target:connect-with-no-symbol-file, process 185104^M
Ending remote debugging.^M
terminate called after throwing an instance of 'gdb_exception_error'^M
...

The detailed backtrace of the corefile is:
...
 (gdb) bt
 #0  0x0000ffff75504f54 in raise () from /lib64/libpthread.so.0
 #1  0x00000000007a86b4 in handle_fatal_signal (sig=6)
     at gdb/event-top.c:926
 #2  <signal handler called>
 #3  0x0000ffff74b977b4 in raise () from /lib64/libc.so.6
 #4  0x0000ffff74b98c18 in abort () from /lib64/libc.so.6
 #5  0x0000ffff74ea26f4 in __gnu_cxx::__verbose_terminate_handler() ()
    from /usr/lib64/libstdc++.so.6
 #6  0x0000ffff74ea011c in ?? () from /usr/lib64/libstdc++.so.6
 #7  0x0000ffff74ea0180 in std::terminate() () from /usr/lib64/libstdc++.so.6
 #8  0x0000ffff74ea0464 in __cxa_throw () from /usr/lib64/libstdc++.so.6
 #9  0x0000000001548870 in throw_it (reason=RETURN_ERROR,
     error=TARGET_CLOSE_ERROR, fmt=0x16c7810 "Remote connection closed", ap=...)
     at gdbsupport/common-exceptions.cc:203
 #10 0x0000000001548920 in throw_verror (error=TARGET_CLOSE_ERROR,
     fmt=0x16c7810 "Remote connection closed", ap=...)
     at gdbsupport/common-exceptions.cc:211
 #11 0x0000000001548a00 in throw_error (error=TARGET_CLOSE_ERROR,
     fmt=0x16c7810 "Remote connection closed")
     at gdbsupport/common-exceptions.cc:226
 #12 0x0000000000ac8f2c in remote_target::readchar (this=0x233d3d90, timeout=2)
     at gdb/remote.c:9856
 #13 0x0000000000ac9f04 in remote_target::getpkt (this=0x233d3d90,
     buf=0x233d40a8, forever=false, is_notif=0x0) at gdb/remote.c:10326
 #14 0x0000000000acf3d0 in remote_target::remote_hostio_send_command
     (this=0x233d3d90, command_bytes=13, which_packet=17,
     remote_errno=0xfffff1a3cf38, attachment=0xfffff1a3ce88,
     attachment_len=0xfffff1a3ce90) at gdb/remote.c:12567
 #15 0x0000000000ad03bc in remote_target::fileio_fstat (this=0x233d3d90, fd=3,
     st=0xfffff1a3d020, remote_errno=0xfffff1a3cf38)
     at gdb/remote.c:12979
 #16 0x0000000000c39878 in target_fileio_fstat (fd=0, sb=0xfffff1a3d020,
     target_errno=0xfffff1a3cf38) at gdb/target.c:3315
 #17 0x00000000007eee5c in target_fileio_stream::stat (this=0x233d4400,
     abfd=0x2323fc40, sb=0xfffff1a3d020) at gdb/gdb_bfd.c:467
 #18 0x00000000007f012c in <lambda(bfd*, void*, stat*)>::operator()(bfd *,
     void *, stat *) const (__closure=0x0, abfd=0x2323fc40, stream=0x233d4400,
     sb=0xfffff1a3d020) at gdb/gdb_bfd.c:955
 #19 0x00000000007f015c in <lambda(bfd*, void*, stat*)>::_FUN(bfd *, void *,
     stat *) () at gdb/gdb_bfd.c:956
 #20 0x0000000000f9b838 in opncls_bstat (abfd=0x2323fc40, sb=0xfffff1a3d020)
     at bfd/opncls.c:665
 #21 0x0000000000f90adc in bfd_stat (abfd=0x2323fc40, statbuf=0xfffff1a3d020)
     at bfd/bfdio.c:431
 #22 0x000000000065fe20 in reopen_exec_file () at gdb/corefile.c:52
 #23 0x0000000000c3a3e8 in generic_mourn_inferior ()
     at gdb/target.c:3642
 #24 0x0000000000abf3f0 in remote_unpush_target (target=0x233d3d90)
     at gdb/remote.c:6067
 #25 0x0000000000aca8b0 in remote_target::mourn_inferior (this=0x233d3d90)
     at gdb/remote.c:10587
 #26 0x0000000000c387cc in target_mourn_inferior (
     ptid=<error reading variable: Cannot access memory at address 0x2d310>)
     at gdb/target.c:2738
 #27 0x0000000000abfff0 in remote_target::remote_detach_1 (this=0x233d3d90,
     inf=0x22fce540, from_tty=1) at gdb/remote.c:6421
 #28 0x0000000000ac0094 in remote_target::detach (this=0x233d3d90,
     inf=0x22fce540, from_tty=1) at gdb/remote.c:6436
 #29 0x0000000000c37c3c in target_detach (inf=0x22fce540, from_tty=1)
     at gdb/target.c:2526
 #30 0x0000000000860424 in detach_command (args=0x0, from_tty=1)
    at gdb/infcmd.c:2817
 #31 0x000000000060b594 in do_simple_func (args=0x0, from_tty=1, c=0x231431a0)
     at gdb/cli/cli-decode.c:94
 #32 0x00000000006108c8 in cmd_func (cmd=0x231431a0, args=0x0, from_tty=1)
     at gdb/cli/cli-decode.c:2741
 #33 0x0000000000c65a94 in execute_command (p=0x232e52f6 "", from_tty=1)
     at gdb/top.c:570
 #34 0x00000000007a7d2c in command_handler (command=0x232e52f0 "")
     at gdb/event-top.c:566
 #35 0x00000000007a8290 in command_line_handler (rl=...)
     at gdb/event-top.c:802
 #36 0x0000000000c9092c in tui_command_line_handler (rl=...)
     at gdb/tui/tui-interp.c:103
 #37 0x00000000007a750c in gdb_rl_callback_handler (rl=0x23385330 "detach")
     at gdb/event-top.c:258
 #38 0x0000000000d910f4 in rl_callback_read_char ()
     at readline/readline/callback.c:290
 #39 0x00000000007a7338 in gdb_rl_callback_read_char_wrapper_noexcept ()
     at gdb/event-top.c:194
 #40 0x00000000007a73f0 in gdb_rl_callback_read_char_wrapper
     (client_data=0x22fbf640) at gdb/event-top.c:233
 #41 0x0000000000cbee1c in stdin_event_handler (error=0, client_data=0x22fbf640)
     at gdb/ui.c:154
 #42 0x000000000154ed60 in handle_file_event (file_ptr=0x232be730, ready_mask=1)
     at gdbsupport/event-loop.cc:572
 #43 0x000000000154f21c in gdb_wait_for_event (block=1)
     at gdbsupport/event-loop.cc:693
 #44 0x000000000154dec4 in gdb_do_one_event (mstimeout=-1)
    at gdbsupport/event-loop.cc:263
 #45 0x0000000000910f98 in start_event_loop () at gdb/main.c:400
 #46 0x0000000000911130 in captured_command_loop () at gdb/main.c:464
 #47 0x0000000000912b5c in captured_main (data=0xfffff1a3db58)
     at gdb/main.c:1338
 #48 0x0000000000912bf4 in gdb_main (args=0xfffff1a3db58)
     at gdb/main.c:1357
 #49 0x00000000004170f4 in main (argc=10, argv=0xfffff1a3dcc8)
     at gdb/gdb.c:38
 (gdb)
...

The abort happens because a c++ exception escapes to c code, specifically
opncls_bstat in bfd/opncls.c.  Compiling with -fexceptions works around this.

Fix this by catching the exception just before it escapes, in stat_trampoline
and likewise in few similar spot.

Add a new template catch_exceptions to do so in a consistent way.

Tested on aarch64-linux.

Approved-by: Pedro Alves <pedro@palves.net>

PR remote/31577
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31577
tromey pushed a commit that referenced this issue Sep 13, 2024
The binary provided with bug 32165 [1] has 36139 ELF sections.  GDB
crashes on it with (note that my GDB is build with -D_GLIBCXX_DEBUG=1:

    $ ./gdb  -nx -q --data-directory=data-directory ./vmlinux
    Reading symbols from ./vmlinux...
    (No debugging symbols found in ./vmlinux)
    (gdb) info func
    /usr/include/c++/14.2.1/debug/vector:508:
    In function:
        std::debug::vector<_Tp, _Allocator>::reference std::debug::vector<_Tp,
        _Allocator>::operator[](size_type) [with _Tp = long unsigned int;
        _Allocator = std::allocator<long unsigned int>; reference = long
        unsigned int&; size_type = long unsigned int]

    Error: attempt to subscript container with out-of-bounds index -29445, but
    container only holds 36110 elements.

    Objects involved in the operation:
        sequence "this" @ 0x514000007340 {
          type = std::debug::vector<unsigned long, std::allocator<unsigned long> >;
        }

The crash occurs here:

    #3  0x00007ffff5e334c3 in __GI_abort () at abort.c:79
    #4  0x00007ffff689afc4 in __gnu_debug::_Error_formatter::_M_error (this=<optimized out>) at /usr/src/debug/gcc/gcc/libstdc++-v3/src/c++11/debug.cc:1320
    #5  0x0000555561119a16 in std::__debug::vector<unsigned long, std::allocator<unsigned long> >::operator[] (this=0x514000007340, __n=18446744073709522171)
        at /usr/include/c++/14.2.1/debug/vector:508
    #6  0x0000555562e288e8 in minimal_symbol::value_address (this=0x5190000bb698, objfile=0x514000007240) at /home/smarchi/src/binutils-gdb/gdb/symtab.c:517
    #7  0x0000555562e5a131 in global_symbol_searcher::expand_symtabs (this=0x7ffff0f5c340, objfile=0x514000007240, preg=std::optional [no contained value])
        at /home/smarchi/src/binutils-gdb/gdb/symtab.c:4983
    #8  0x0000555562e5d2ed in global_symbol_searcher::search (this=0x7ffff0f5c340) at /home/smarchi/src/binutils-gdb/gdb/symtab.c:5189
    #9  0x0000555562e5ffa4 in symtab_symbol_info (quiet=false, exclude_minsyms=false, regexp=0x0, kind=FUNCTION_DOMAIN, t_regexp=0x0, from_tty=1)
        at /home/smarchi/src/binutils-gdb/gdb/symtab.c:5361
    #10 0x0000555562e6131b in info_functions_command (args=0x0, from_tty=1) at /home/smarchi/src/binutils-gdb/gdb/symtab.c:5525

That is, at this line of `minimal_symbol::value_address`, where
`objfile->section_offsets` is an `std::vector`:

    return (CORE_ADDR (this->unrelocated_address ())
	    + objfile->section_offsets[this->section_index ()]);

A section index of -29445 is suspicious.  The minimal_symbol at play
here is:

    (top-gdb) p m_name
    $1 = 0x521001de10af "_sinittext"

So I restarted debugging, breaking on:

   (top-gdb) b general_symbol_info::set_section_index if $_streq("_sinittext", m_name)

And I see that weird -29445 value:

    (top-gdb) frame
    #0  general_symbol_info::set_section_index (this=0x525000082390, idx=-29445) at /home/smarchi/src/binutils-gdb/gdb/symtab.h:611
    611       { m_section = idx; }

But going up one frame, the section index is 36091:

    (top-gdb) frame
    #1  0x0000555562426526 in minimal_symbol_reader::record_full (this=0x7ffff0ead560, name="_sinittext", copy_name=false,
        address=-2111475712, ms_type=mst_text, section=36091) at /home/smarchi/src/binutils-gdb/gdb/minsyms.c:1228
    1228      msymbol->set_section_index (section);

It seems like the problem is just that the type used for the section
index (short) is not big enough.  Change from short to int.  If somebody
insists, we could even go long long / int64_t, but I doubt it's
necessary.

With that fixed, I get:

    (gdb) info func
    All defined functions:

    Non-debugging symbols:
    0xffffffff81000000  _stext
    0xffffffff82257000  _sinittext
    0xffffffff822b4ebb  _einittext

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=32165

Change-Id: Icb1c3de9474ff5adef7e0bbbf5e0b67b279dee04
Reviewed-By: Tom de Vries <tdevries@suse.de>
Reviewed-by: Keith Seitz <keiths@redhat.com>
tromey pushed a commit that referenced this issue Oct 19, 2024
When building gdb with gcc 12 and -fsanitize=threads while renabling
background dwarf reading by setting dwarf_synchronous to false, I run into:
...
(gdb) file amd64-watchpoint-downgrade
Reading symbols from amd64-watchpoint-downgrade...
(gdb) watch global_var
==================
WARNING: ThreadSanitizer: data race (pid=20124)
  Read of size 8 at 0x7b80000500d8 by main thread:
    #0 cooked_index_entry::full_name(obstack*, bool) const cooked-index.c:220
    #1 cooked_index::get_main_name(obstack*, language*) const cooked-index.c:735
    #2 cooked_index_worker::wait(cooked_state, bool) cooked-index.c:559
    #3 cooked_index::wait(cooked_state, bool) cooked-index.c:631
    #4 cooked_index_functions::wait(objfile*, bool) cooked-index.h:729
    #5 cooked_index_functions::compute_main_name(objfile*) cooked-index.h:806
    #6 objfile::compute_main_name() symfile-debug.c:461
    #7 find_main_name symtab.c:6503
    #8 main_language() symtab.c:6608
    #9 set_initial_language_callback symfile.c:1634
    #10 get_current_language() language.c:96
    ...

  Previous write of size 8 at 0x7b80000500d8 by thread T1:
    #0 cooked_index_shard::finalize(parent_map_map const*) \
         dwarf2/cooked-index.c:409
    #1 operator() cooked-index.c:663
    ...

  ...

SUMMARY: ThreadSanitizer: data race cooked-index.c:220 in \
  cooked_index_entry::full_name(obstack*, bool) const
==================
Hardware watchpoint 1: global_var
(gdb) PASS: gdb.arch/amd64-watchpoint-downgrade.exp: watch global_var
...

This was also reported in PR31715.

This is due do gcc PR110799 [1], generating wrong code with
-fhoist-adjacent-loads, and causing a false positive for
-fsanitize=threads.

Work around the gcc PR by forcing -fno-hoist-adjacent-loads for gcc <= 13
and -fsanitize=threads.

Tested in that same configuration on x86_64-linux.  Remaining ThreadSanitizer
problems are the ones reported in PR31626 (gdb.rust/dwindex.exp) and
PR32247 (gdb.trace/basic-libipa.exp).

PR gdb/31715
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31715

Tested-By: Bernd Edlinger <bernd.edlinger@hotmail.de>
Approved-By: Tom Tromey <tom@tromey.com>

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110799
tromey pushed a commit that referenced this issue Nov 1, 2024
On Windows gcore is not implemented, and if you try it, you get an
heap-use-after-free error:

(gdb) gcore C:/gdb/build64/gdb-git-python3/gdb/testsuite/outputs/gdb.base/gcore-buffer-overflow/gcore-buffer-overflow.test
warning: cannot close "=================================================================
==10108==ERROR: AddressSanitizer: heap-use-after-free on address 0x1259ea503110 at pc 0x7ff6806e3936 bp 0x0062e01ed990 sp 0x0062e01ed140
READ of size 111 at 0x1259ea503110 thread T0
    #0 0x7ff6806e3935 in strlen C:/gcc/src/gcc-14.2.0/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:391
    #1 0x7ff6807169c4 in __pformat_puts C:/gcc/src/mingw-w64-v12.0.0/mingw-w64-crt/stdio/mingw_pformat.c:558
    #2 0x7ff6807186c1 in __mingw_pformat C:/gcc/src/mingw-w64-v12.0.0/mingw-w64-crt/stdio/mingw_pformat.c:2514
    #3 0x7ff680713614 in __mingw_vsnprintf C:/gcc/src/mingw-w64-v12.0.0/mingw-w64-crt/stdio/mingw_vsnprintf.c:41
    #4 0x7ff67f34419f in vsnprintf(char*, unsigned long long, char const*, char*) C:/msys64/mingw64/x86_64-w64-mingw32/include/stdio.h:484
    #5 0x7ff67f34419f in string_vprintf[abi:cxx11](char const*, char*) C:/gdb/src/gdb.git/gdbsupport/common-utils.cc:106
    #6 0x7ff67b37b739 in cli_ui_out::do_message(ui_file_style const&, char const*, char*) C:/gdb/src/gdb.git/gdb/cli-out.c:227
    #7 0x7ff67ce3d030 in ui_out::call_do_message(ui_file_style const&, char const*, ...) C:/gdb/src/gdb.git/gdb/ui-out.c:571
    #8 0x7ff67ce4255a in ui_out::vmessage(ui_file_style const&, char const*, char*) C:/gdb/src/gdb.git/gdb/ui-out.c:740
    #9 0x7ff67ce2c873 in ui_file::vprintf(char const*, char*) C:/gdb/src/gdb.git/gdb/ui-file.c:73
    #10 0x7ff67ce7f83d in gdb_vprintf(ui_file*, char const*, char*) C:/gdb/src/gdb.git/gdb/utils.c:1881
    #11 0x7ff67ce7f83d in vwarning(char const*, char*) C:/gdb/src/gdb.git/gdb/utils.c:181
    #12 0x7ff67f3530eb in warning(char const*, ...) C:/gdb/src/gdb.git/gdbsupport/errors.cc:33
    #13 0x7ff67baed27f in gdb_bfd_close_warning C:/gdb/src/gdb.git/gdb/gdb_bfd.c:437
    #14 0x7ff67baed27f in gdb_bfd_close_or_warn C:/gdb/src/gdb.git/gdb/gdb_bfd.c:646
    #15 0x7ff67baed27f in gdb_bfd_unref(bfd*) C:/gdb/src/gdb.git/gdb/gdb_bfd.c:739
    #16 0x7ff68094b6f2 in gdb_bfd_ref_policy::decref(bfd*) C:/gdb/src/gdb.git/gdb/gdb_bfd.h:82
    #17 0x7ff68094b6f2 in gdb::ref_ptr<bfd, gdb_bfd_ref_policy>::~ref_ptr() C:/gdb/src/gdb.git/gdbsupport/gdb_ref_ptr.h:91
    #18 0x7ff67badf4d2 in gcore_command C:/gdb/src/gdb.git/gdb/gcore.c:176

0x1259ea503110 is located 16 bytes inside of 4064-byte region [0x1259ea503100,0x1259ea5040e0)
freed by thread T0 here:
    #0 0x7ff6806b1687 in free C:/gcc/src/gcc-14.2.0/libsanitizer/asan/asan_malloc_win.cpp:90
    #1 0x7ff67f2ae807 in objalloc_free C:/gdb/src/gdb.git/libiberty/objalloc.c:187
    #2 0x7ff67d7f56e3 in _bfd_free_cached_info C:/gdb/src/gdb.git/bfd/opncls.c:247
    #3 0x7ff67d7f2782 in _bfd_delete_bfd C:/gdb/src/gdb.git/bfd/opncls.c:180
    #4 0x7ff67d7f5df9 in bfd_close_all_done C:/gdb/src/gdb.git/bfd/opncls.c:960
    #5 0x7ff67d7f62ec in bfd_close C:/gdb/src/gdb.git/bfd/opncls.c:925
    #6 0x7ff67baecd27 in gdb_bfd_close_or_warn C:/gdb/src/gdb.git/gdb/gdb_bfd.c:643
    #7 0x7ff67baecd27 in gdb_bfd_unref(bfd*) C:/gdb/src/gdb.git/gdb/gdb_bfd.c:739
    #8 0x7ff68094b6f2 in gdb_bfd_ref_policy::decref(bfd*) C:/gdb/src/gdb.git/gdb/gdb_bfd.h:82
    #9 0x7ff68094b6f2 in gdb::ref_ptr<bfd, gdb_bfd_ref_policy>::~ref_ptr() C:/gdb/src/gdb.git/gdbsupport/gdb_ref_ptr.h:91
    #10 0x7ff67badf4d2 in gcore_command C:/gdb/src/gdb.git/gdb/gcore.c:176

It happens because gdb_bfd_close_or_warn uses a bfd-internal name for
the failing-close warning, after the close is finished, and the name
already freed:

static int
gdb_bfd_close_or_warn (struct bfd *abfd)
{
  int ret;
  const char *name = bfd_get_filename (abfd);

  for (asection *sect : gdb_bfd_sections (abfd))
    free_one_bfd_section (sect);

  ret = bfd_close (abfd);

  if (!ret)
    gdb_bfd_close_warning (name,
			   bfd_errmsg (bfd_get_error ()));

  return ret;
}

Fixed by making a copy of the name for the warning.

Approved-By: Andrew Burgess <aburgess@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant