Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shift/reduce conflicts in the grammar #1

Open
pipcet opened this issue Mar 5, 2015 · 1 comment
Open

Shift/reduce conflicts in the grammar #1

pipcet opened this issue Mar 5, 2015 · 1 comment
Assignees
Labels

Comments

@pipcet
Copy link
Owner

pipcet commented Mar 5, 2015

The C grammar in c-exp.y needs to be fixed not to have shift/reduce conflicts. The easiest way would be to introduce a second typeof-like keyword to create typeonly values.

@pipcet pipcet added the bug label Mar 5, 2015
@pipcet
Copy link
Owner Author

pipcet commented Mar 5, 2015

The commit that introduced the problem was 17897e4

pipcet pushed a commit that referenced this issue Apr 11, 2015
…gs.exp

Hi,
I see the following two fails in gdb.base/savedregs.exp on aarch64-linux,

info frame 2^M
Stack frame at 0x7ffffffa60:^M
 pc = 0x40085c in thrower (/home/yao/SourceCode/gnu/gdb/git/gdb/testsuite/gdb.base/savedregs.c:49); saved pc = 0x400898^M
 called by frame at 0x7ffffffa70, caller of frame at 0x7fffffe800^M
 source language c.^M
 Arglist at 0x7ffffffa60, args: ^M
 Locals at 0x7ffffffa60, Previous frame's sp is 0x7ffffffa60^M
(gdb) FAIL: gdb.base/savedregs.exp: Get thrower info frame

info frame 2^M
Stack frame at 0x7fffffe800:^M
 pc = 0x400840 in catcher (/home/yao/SourceCode/gnu/gdb/git/gdb/testsuite/gdb.base/savedregs.c:42); saved pc = 0x7fb7ffc350^M
 called by frame at 0x7fffffe800, caller of frame at 0x7fffffe7e0^M
 source language c.^M
 Arglist at 0x7fffffe7f0, args: sig=11^M
 Locals at 0x7fffffe7f0, Previous frame's sp is 0x7fffffe800
(gdb) FAIL: gdb.base/savedregs.exp: Get catcher info frame

looks the test expects to match "Saved registers:" from the output of
"info frame", but no registers are saved on these two frames, because
thrower and catcher are simple and leaf functions.

(gdb) disassemble thrower
Dump of assembler code for function thrower:
   0x0000000000400858 <+0>:	mov	x0, #0x0                   	// #0
   0x000000000040085c <+4>:	strb	wzr, [x0]
   0x0000000000400860 <+8>:	ret
End of assembler dump.
(gdb) disassemble catcher
Dump of assembler code for function catcher:
   0x0000000000400838 <+0>:	sub	sp, sp, #0x10
   0x000000000040083c <+4>:	str	w0, [sp,bminor#12]
   0x0000000000400840 <+8>:	adrp	x0, 0x410000
   0x0000000000400844 <+12>:	add	x0, x0, #0xb9c
   0x0000000000400848 <+16>:	mov	w1, #0x1                   	// #1
   0x000000000040084c <+20>:	str	w1, [x0]
   0x0000000000400850 <+24>:	add	sp, sp, #0x10
   0x0000000000400854 <+28>:	ret

There are two ways to fix these fails, one is to modify functions to
force some registers saved (for example, doing function call in them),
and the other one is to relax the pattern to optionally match
"Saved registers:".  I did both, and feel that the latter is simple,
so here is it.

gdb/testsuite:

2015-03-26  Yao Qi  <yao.qi@linaro.org>

	* gdb.base/savedregs.exp (process_saved_regs): Make
	"Saved registers:" optional in the pattern.
pipcet pushed a commit that referenced this issue Apr 11, 2015
On GNU/Linux, if the target reuses the TID of a thread that GDB still
has in its list marked as THREAD_EXITED, GDB crashes, like:

 (gdb) continue
 Continuing.
 src/gdb/thread.c:789: internal-error: set_running: Assertion `tp->state != THREAD_EXITED' failed.
 A problem internal to GDB has been detected,
 further debugging may prove unreliable.
 Quit this debugging session? (y or n) FAIL: gdb.threads/tid-reuse.exp: continue to breakpoint: after_reuse_time (GDB internal error)

Here:

 (top-gdb) bt
 #0  internal_error (file=0x953dd8 "src/gdb/thread.c", line=789, fmt=0x953da0 "%s: Assertion `%s' failed.")
     at src/gdb/common/errors.c:54
 #1  0x0000000000638514 in set_running (ptid=..., running=1) at src/gdb/thread.c:789
 #2  0x00000000004bda42 in linux_handle_extended_wait (lp=0x16f5760, status=0, stopping=0) at src/gdb/linux-nat.c:2114
 #3  0x00000000004bfa24 in linux_nat_filter_event (lwpid=20570, status=198015) at src/gdb/linux-nat.c:3127
 #4  0x00000000004c070e in linux_nat_wait_1 (ops=0xe193d0, ptid=..., ourstatus=0x7fffffffd2c0, target_options=1) at src/gdb/linux-nat.c:3478
 #5  0x00000000004c1015 in linux_nat_wait (ops=0xe193d0, ptid=..., ourstatus=0x7fffffffd2c0, target_options=1) at src/gdb/linux-nat.c:3722
 #6  0x00000000004c92d2 in thread_db_wait (ops=0xd80b60 <thread_db_ops>, ptid=..., ourstatus=0x7fffffffd2c0, options=1)
     at src/gdb/linux-thread-db.c:1525
 #7  0x000000000066db43 in delegate_wait (self=0xd80b60 <thread_db_ops>, arg1=..., arg2=0x7fffffffd2c0, arg3=1) at src/gdb/target-delegates.c:116
 bminor#8  0x000000000067e54b in target_wait (ptid=..., status=0x7fffffffd2c0, options=1) at src/gdb/target.c:2206
 bminor#9  0x0000000000625111 in fetch_inferior_event (client_data=0x0) at src/gdb/infrun.c:3275
 bminor#10 0x0000000000648a3b in inferior_event_handler (event_type=INF_REG_EVENT, client_data=0x0) at src/gdb/inf-loop.c:56
 bminor#11 0x00000000004c2ecb in handle_target_event (error=0, client_data=0x0) at src/gdb/linux-nat.c:4655

I managed to come up with a test that reliably reproduces this.  It
spawns enough threads for the pid number space to wrap around, so
could potentially take a while.  On my box that's 4 seconds; on
gcc110, a PPC box which has max_pid set to 65536, it's over 10
seconds.  So I made the test compute how long that would take, and cap
the time waited if it would be unreasonably long.

Tested on x86_64 Fedora 20.

gdb/ChangeLog:
2015-04-01  Pedro Alves  <palves@redhat.com>

	* linux-thread-db.c (record_thread): Readd the thread to gdb's
	list if it was marked exited.

gdb/testsuite/ChangeLog:
2015-04-01  Pedro Alves  <palves@redhat.com>

	* gdb.threads/tid-reuse.c: New file.
	* gdb.threads/tid-reuse.exp: New file.
pipcet pushed a commit that referenced this issue Apr 11, 2015
I see these two fails in no-unwaited-for-left.exp in remote testing
for aarch64-linux target.

...
continue
Continuing.
warning: Remote failure reply: E.No unwaited-for children left.

[Thread 1084] #2 stopped.
(gdb) FAIL: gdb.threads/no-unwaited-for-left.exp: continue stops when thread 2 exits

....
continue
Continuing.
warning: Remote failure reply: E.No unwaited-for children left.

[Thread 1081] #1 stopped.
(gdb) FAIL: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits

I checked the gdb.log on buildbot, and find that these two fails also
appear on Debian-i686-native-extended-gdbserver and Fedora-ppc64be-native-gdbserver-m64.
I recall that they are about local/remote parity, and related RSP is missing.
There has been already a PR 14618 about it.  This patch is to kfail them
on remote target.

gdb/testsuite:

2015-04-02  Yao Qi  <yao.qi@linaro.org>

	* gdb.threads/no-unwaited-for-left.exp: Set up kfail if target
	is remote.
pipcet pushed a commit that referenced this issue Apr 11, 2015
…gnals

Both PRs are triggered by the same use case.

PR18214 is about software single-step targets.  On those, the 'resume'
code that detects that we're stepping over a breakpoint and delivering
a signal at the same time:

  /* Currently, our software single-step implementation leads to different
     results than hardware single-stepping in one situation: when stepping
     into delivering a signal which has an associated signal handler,
     hardware single-step will stop at the first instruction of the handler,
     while software single-step will simply skip execution of the handler.
...
     Fortunately, we can at least fix this particular issue.  We detect
     here the case where we are about to deliver a signal while software
     single-stepping with breakpoints removed.  In this situation, we
     revert the decisions to remove all breakpoints and insert single-
     step breakpoints, and instead we install a step-resume breakpoint
     at the current address, deliver the signal without stepping, and
     once we arrive back at the step-resume breakpoint, actually step
     over the breakpoint we originally wanted to step over.  */

doesn't handle the case of _another_ thread also needing to step over
a breakpoint.  Because the other thread is just resumed at the PC
where it had stopped and a breakpoint is still inserted there, the
thread immediately re-traps the same breakpoint.  This test exercises
that.  On software single-step targets, it fails like this:

 KFAIL: gdb.threads/multiple-step-overs.exp: displaced=off: signal thr3: continue to sigusr1_handler
 KFAIL: gdb.threads/multiple-step-overs.exp: displaced=off: signal thr2: continue to sigusr1_handler

gdb.log (simplified):

 (gdb) continue
 Continuing.

 Breakpoint 4, child_function_2 (arg=0x0) at src/gdb/testsuite/gdb.threads/multiple-step-overs.c:66
 66            callme (); /* set breakpoint thread 2 here */
 (gdb) thread 3
 (gdb) queue-signal SIGUSR1
 (gdb) thread 1
 [Switching to thread 1 (Thread 0x7ffff7fc1740 (LWP 24824))]
 #0  main () at src/gdb/testsuite/gdb.threads/multiple-step-overs.c:106
 106       wait_threads (); /* set wait-threads breakpoint here */
 (gdb) break sigusr1_handler
 Breakpoint 5 at 0x400837: file src/gdb/testsuite/gdb.threads/multiple-step-overs.c, line 31.
 (gdb) continue
 Continuing.
 [Switching to Thread 0x7ffff7fc0700 (LWP 24828)]

 Breakpoint 4, child_function_2 (arg=0x0) at src/gdb/testsuite/gdb.threads/multiple-step-overs.c:66
 66            callme (); /* set breakpoint thread 2 here */
 (gdb) KFAIL: gdb.threads/multiple-step-overs.exp: displaced=off: signal thr3: continue to sigusr1_handler


For good measure, I made the test try displaced stepping too.  And
then I found it crashes GDB on x86-64 (a hardware step target), but
only when displaced stepping... :

 KFAIL: gdb.threads/multiple-step-overs.exp: displaced=on: signal thr1: continue to sigusr1_handler (PRMS: gdb/18216)
 KFAIL: gdb.threads/multiple-step-overs.exp: displaced=on: signal thr2: continue to sigusr1_handler (PRMS: gdb/18216)
 KFAIL: gdb.threads/multiple-step-overs.exp: displaced=on: signal thr3: continue to sigusr1_handler (PRMS: gdb/18216)

 Program terminated with signal SIGSEGV, Segmentation fault.
 #0  0x000000000062a83a in process_event_stop_test (ecs=0x7fff847eeee0) at src/gdb/infrun.c:4964
 4964          if (sr_bp->loc->permanent
 Setting up the environment for debugging gdb.
 Breakpoint 1 at 0x79fcfc: file src/gdb/common/errors.c, line 54.
 Breakpoint 2 at 0x50a26c: file src/gdb/cli/cli-cmds.c, line 217.
 (top-gdb) p sr_bp
 $1 = (struct breakpoint *) 0x0
 (top-gdb) bt
 #0  0x000000000062a83a in process_event_stop_test (ecs=0x7fff847eeee0) at src/gdb/infrun.c:4964
 #1  0x000000000062a1af in handle_signal_stop (ecs=0x7fff847eeee0) at src/gdb/infrun.c:4715
 #2  0x0000000000629097 in handle_inferior_event (ecs=0x7fff847eeee0) at src/gdb/infrun.c:4165
 #3  0x0000000000627482 in fetch_inferior_event (client_data=0x0) at src/gdb/infrun.c:3298
 #4  0x000000000064ad7b in inferior_event_handler (event_type=INF_REG_EVENT, client_data=0x0) at src/gdb/inf-loop.c:56
 #5  0x00000000004c375f in handle_target_event (error=0, client_data=0x0) at src/gdb/linux-nat.c:4658
 #6  0x0000000000648c47 in handle_file_event (file_ptr=0x2e0eaa0, ready_mask=1) at src/gdb/event-loop.c:658

The all-stop-non-stop series fixes this, but meanwhile, this augments
the multiple-step-overs.exp test to cover this, KFAILed.

gdb/testsuite/ChangeLog:
2015-04-08  Pedro Alves  <palves@redhat.com>

	PR gdb/18214
	PR gdb/18216
	* gdb.threads/multiple-step-overs.c (sigusr1_handler): New
	function.
	(main): Install it as SIGUSR1 handler.
	* gdb.threads/multiple-step-overs.exp (setup): Remove 'prefix'
	parameter.  Always use "setup" as prefix.  Toggle "set
	displaced-stepping" off/on depending on global.  Don't switch to
	thread 1 here.
	(top level): Add displaced stepping "off/on" test axis.  Update
	"setup" calls.  Wrap each subtest with with_test_prefix.  Test
	continuing with a queued signal in each thread.
pipcet pushed a commit that referenced this issue Apr 11, 2015
…tep targets

TL;DR:

When stepping over a breakpoint with displaced stepping, the core must
be notified of all signals, otherwise the displaced step fixup code
confuses a breakpoint trap in the signal handler for the expected trap
indicating the displaced instruction was single-stepped
normally/successfully.

Detailed version:

Running sigstep.exp with displaced stepping on, against my x86
software single-step branch, I got:

 FAIL: gdb.base/sigstep.exp: step on breakpoint, to handler: performing step
 FAIL: gdb.base/sigstep.exp: next on breakpoint, to handler: performing next
 FAIL: gdb.base/sigstep.exp: continue on breakpoint, to handler: performing continue

Turning on debug logs, we see:

 (gdb) step
 infrun: clear_proceed_status_thread (process 32147)
 infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT)
 infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=1, current thread [process 32147] at 0x400842
 displaced: stepping process 32147 now
 displaced: saved 0x400622: 49 89 d1 5e 48 89 e2 48 83 e4 f0 50 54 49 c7 c0
 displaced: %rip-relative addressing used.
 displaced: using temp reg 2, old value 0x3615eafd37, new value 0x40084c
 displaced: copy 0x400842->0x400622: c7 81 1c 08 20 00 00 00 00 00
 displaced: displaced pc to 0x400622
 displaced: run 0x400622: c7 81 1c 08
 LLR: Preparing to resume process 32147, 0, inferior_ptid process 32147
 LLR: PTRACE_CONT process 32147, 0 (resume event thread)
 linux_nat_wait: [process -1], [TARGET_WNOHANG]
 LLW: enter
 LNW: waitpid(-1, ...) returned 32147, No child processes
 LLW: waitpid 32147 received Alarm clock (stopped)
 LLW: PTRACE_CONT process 32147, Alarm clock (preempt 'handle')
 LNW: waitpid(-1, ...) returned 0, No child processes
 LLW: exit (ignore)
 sigchld
 infrun: target_wait (-1.0.0, status) =
 infrun:   -1.0.0 [process -1],
 infrun:   status->kind = ignore
 infrun: TARGET_WAITKIND_IGNORE
 infrun: prepare_to_wait
 linux_nat_wait: [process -1], [TARGET_WNOHANG]
 LLW: enter
 LNW: waitpid(-1, ...) returned 32147, No child processes
 LLW: waitpid 32147 received Trace/breakpoint trap (stopped)
 CSBB: process 32147 stopped by software breakpoint
 LNW: waitpid(-1, ...) returned 0, No child processes
 LLW: trap ptid is process 32147.
 LLW: exit
 infrun: target_wait (-1.0.0, status) =
 infrun:   32147.32147.0 [process 32147],
 infrun:   status->kind = stopped, signal = GDB_SIGNAL_TRAP
 infrun: TARGET_WAITKIND_STOPPED
 displaced: restored process 32147 0x400622
 displaced: fixup (0x400842, 0x400622), insn = 0xc7 0x81 ...
 displaced: restoring reg 2 to 0x3615eafd37
 displaced: relocated %rip from 0x400717 to 0x400937
 infrun: stop_pc = 0x400937
 infrun: delayed software breakpoint trap, ignoring
 infrun: no line number info
 infrun: stop_waiting
 0x0000000000400937 in __dso_handle ()
 1: x/i $pc
 => 0x400937:    and    %ah,0xa0d64(%rip)        # 0x4a16a1
 (gdb) FAIL: gdb.base/sigstep.exp: displaced=on: step on breakpoint, to handler: performing step


What should have happened is that the breakpoint hit in the signal
handler should have been presented to the user.  But note that
"preempt 'handle'" -- what happened instead is that
displaced_step_fixup confused the breakpoint in the signal handler for
the expected SIGTRAP indicating the displaced instruction was
single-stepped normally/successfully.

This should be affecting all software single-step targets in the same
way.

The fix is to make sure the core sees all signals when displaced
stepping, just like we already must see all signals when doing an
stepping over a breakpoint in-line.  We now get:

 infrun: target_wait (-1.0.0, status) =
 infrun:   570.570.0 [process 570],
 infrun:   status->kind = stopped, signal = GDB_SIGNAL_ALRM
 infrun: TARGET_WAITKIND_STOPPED
 displaced: restored process 570 0x400622
 infrun: stop_pc = 0x400842
 infrun: random signal (GDB_SIGNAL_ALRM)
 infrun: signal arrived while stepping over breakpoint
 infrun: inserting step-resume breakpoint at 0x400842
 infrun: resume (step=0, signal=GDB_SIGNAL_ALRM), trap_expected=0, current thread [process 570] at 0x400842
 LLR: Preparing to resume process 570, Alarm clock, inferior_ptid process 570
 LLR: PTRACE_CONT process 570, Alarm clock (resume event thread)
 infrun: prepare_to_wait
 linux_nat_wait: [process -1], [TARGET_WNOHANG]
 LLW: enter
 LNW: waitpid(-1, ...) returned 0, No child processes
 LLW: exit (ignore)
 infrun: target_wait (-1.0.0, status) =
 infrun:   -1.0.0 [process -1],
 infrun:   status->kind = ignore
 sigchld
 infrun: TARGET_WAITKIND_IGNORE
 infrun: prepare_to_wait
 linux_nat_wait: [process -1], [TARGET_WNOHANG]
 LLW: enter
 LNW: waitpid(-1, ...) returned 570, No child processes
 LLW: waitpid 570 received Trace/breakpoint trap (stopped)
 CSBB: process 570 stopped by software breakpoint
 LNW: waitpid(-1, ...) returned 0, No child processes
 LLW: trap ptid is process 570.
 LLW: exit
 infrun: target_wait (-1.0.0, status) =
 infrun:   570.570.0 [process 570],
 infrun:   status->kind = stopped, signal = GDB_SIGNAL_TRAP
 infrun: TARGET_WAITKIND_STOPPED
 infrun: stop_pc = 0x400717
 infrun: BPSTAT_WHAT_STOP_NOISY
 infrun: stop_waiting

 Breakpoint 3, handler (sig=14) at /home/pedro/gdb/mygit/src/gdb/testsuite/gdb.base/sigstep.c:35
 35        done = 1;

Hardware single-step targets already behave this way, because the
Linux backends (both native and gdbserver) always report signals to
the core if the thread was single-stepping.

As mentioned in the new comment in do_target_resume, we can't fix this
by instead making the displaced_step_fixup phase skip fixing up the PC
if the single step stopped somewhere we didn't expect.  Here's what
the backtrace would look like if we did that:

 Breakpoint 3, handler (sig=14) at /home/pedro/gdb/mygit/src/gdb/testsuite/gdb.base/sigstep.c:35
 35        done = 1;
 1: x/i $pc
 => 0x400717 <handler+7>:        movl   $0x1,0x200943(%rip)        # 0x601064 <done>
 (gdb) bt
 #0  handler (sig=14) at /home/pedro/gdb/mygit/src/gdb/testsuite/gdb.base/sigstep.c:35
 #1  <signal handler called>
 #2  0x0000000000400622 in _start ()
 (gdb) FAIL: gdb.base/sigstep.exp: displaced=on: step on breakpoint, to handler: backtrace

gdb/ChangeLog:
2015-04-10  Pedro Alves  <palves@redhat.com>

	* infrun.c (displaced_step_in_progress): New function.
	(do_target_resume): Advise target to report all signals if
	displaced stepping.

gdb/testsuite/ChangeLog:
2015-04-10  Pedro Alves  <palves@redhat.com>

	* gdb.base/sigstep.exp (breakpoint_to_handler)
	(breakpoint_to_handler_entry): New parameter 'displaced'.  Use it.
	Test "backtrace" in handler.
	(breakpoint_over_handler): New parameter 'displaced'.  Use it.
	(top level): Add new "displaced" test axis to
	breakpoint_to_handler, breakpoint_to_handler_entry and
	breakpoint_over_handler.
pipcet pushed a commit that referenced this issue May 6, 2016
I see the following test fail in arm-linux with -marm and -fomit-frame-pointer,

 step
 callee () at /home/yao/SourceCode/gnu/gdb/git/gdb/testsuite/gdb.reverse/step-reverse.c:27
 27      }                       /* RETURN FROM CALLEE */
 (gdb) step
 main () at /home/yao/SourceCode/gnu/gdb/git/gdb/testsuite/gdb.reverse/step-reverse.c:58
 58         callee();    /* STEP INTO THIS CALL */
 (gdb) FAIL: gdb.reverse/step-precsave.exp: reverse step into fn call

As we can see, the "step" has already stepped into the function callee,
but in the last line.  The second "step" attempts to step to function
body, but it goes out of callee, which isn't expected.

The program is compiled with -marm and -fomit-frame-pointer, the
function callee is prologue-less, because nothing needs to be saved
on stack,

(gdb) disassemble callee
Dump of assembler code for function callee:
   0x00010680 <+0>:	movw	r3, #2364	; 0x93c
   0x00010684 <+4>:	movt	r3, #2
   0x00010688 <+8>:	ldr	r3, [r3]
   0x0001068c <+12>:	add	r2, r3, #1
   0x00010690 <+16>:	movw	r3, #2364	; 0x93c
   0x00010694 <+20>:	movt	r3, #2
   0x00010698 <+24>:	str	r2, [r3]
   0x0001069c <+28>:	mov	r3, #0
   0x000106a0 <+32>:	mov	r0, r3
   0x000106a4 <+36>:	bx	lr

program stops at the 0x106a0 (passed the epilogue) after the first
"step".  When second "step" is executed, the stepping range is
[0x10680-0x106a0], which starts from the first instruction of function
callee (because it doesn't have prologue).

infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=0, current thread [LWP 2461] at 0x1069c^M
infrun: prepare_to_wait^M
infrun: target_wait (-1.0.0, status) =^M
infrun:   2461.2461.0 [LWP 2461],^M
infrun:   status->kind = stopped, signal = GDB_SIGNAL_TRAP^M
infrun: TARGET_WAITKIND_STOPPED^M
infrun: stop_pc = 0x10698^M
infrun: stepping inside range [0x10680-0x106a0]

When program goes out of the range, it stops at the caller of callee,
and test fails.  IOW, if function callee has prologue, the stepping
range won't start from the first instruction of the function, and
program stops at the prologue and test passes.

IMO, GDB does nothing wrong, but test shouldn't expect the program
stops in callee after the second "step".  I decide to fix test rather
than GDB.  In this patch, I change to test to do one "step", and check
the program is still in callee, then, do multiple "step" until program
goes out of the callee.

gdb/testsuite:

2016-04-22  Yao Qi  <yao.qi@linaro.org>

	* gdb.reverse/step-precsave.exp: Do one step and test program
	stops in "callee" and do multiple steps until program goes out
	of "callee".
	* gdb.reverse/step-reverse.exp: Likewise.
pipcet pushed a commit that referenced this issue May 6, 2016
Nowadays, read_memory may throw NOT_AVAILABLE_ERROR (it is done by
patch http://sourceware.org/ml/gdb-patches/2013-08/msg00625.html)
however, read_stack and read_code still throws MEMORY_ERROR only.  This
causes PR 19947, that is prologue unwinder is unable unwind because
code memory isn't available, but MEMORY_ERROR is thrown, while unwinder
catches NOT_AVAILABLE_ERROR.

 #0  memory_error (err=err@entry=TARGET_XFER_E_IO, memaddr=memaddr@entry=140737349781158) at /home/yao/SourceCode/gnu/gdb/git/gdb/corefile.c:217
 #1  0x000000000065f5ba in read_code (memaddr=memaddr@entry=140737349781158, myaddr=myaddr@entry=0x7fffffffd7b0 "\340\023<\001", len=len@entry=1)
     at /home/yao/SourceCode/gnu/gdb/git/gdb/corefile.c:288
 #2  0x000000000065f7b5 in read_code_unsigned_integer (memaddr=memaddr@entry=140737349781158, len=len@entry=1, byte_order=byte_order@entry=BFD_ENDIAN_LITTLE)
     at /home/yao/SourceCode/gnu/gdb/git/gdb/corefile.c:363
 #3  0x00000000004717e0 in amd64_analyze_prologue (gdbarch=gdbarch@entry=0x13c13e0, pc=140737349781158, current_pc=140737349781165, cache=cache@entry=0xda0cb0)
     at /home/yao/SourceCode/gnu/gdb/git/gdb/amd64-tdep.c:2267
 #4  0x0000000000471f6d in amd64_frame_cache_1 (cache=0xda0cb0, this_frame=0xda0bf0) at /home/yao/SourceCode/gnu/gdb/git/gdb/amd64-tdep.c:2437
 #5  amd64_frame_cache (this_frame=0xda0bf0, this_cache=<optimised out>) at /home/yao/SourceCode/gnu/gdb/git/gdb/amd64-tdep.c:2508
 #6  0x000000000047214d in amd64_frame_this_id (this_frame=<optimised out>, this_cache=<optimised out>, this_id=0xda0c50)
     at /home/yao/SourceCode/gnu/gdb/git/gdb/amd64-tdep.c:2541
 #7  0x00000000006b94c4 in compute_frame_id (fi=0xda0bf0) at /home/yao/SourceCode/gnu/gdb/git/gdb/frame.c:481
 bminor#8  get_prev_frame_if_no_cycle (this_frame=this_frame@entry=0xda0b20) at /home/yao/SourceCode/gnu/gdb/git/gdb/frame.c:1809
 bminor#9  0x00000000006bb6c9 in get_prev_frame_always_1 (this_frame=0xda0b20) at /home/yao/SourceCode/gnu/gdb/git/gdb/frame.c:1983
 bminor#10 get_prev_frame_always (this_frame=this_frame@entry=0xda0b20) at /home/yao/SourceCode/gnu/gdb/git/gdb/frame.c:1999
 bminor#11 0x00000000006bbe11 in get_prev_frame (this_frame=this_frame@entry=0xda0b20) at /home/yao/SourceCode/gnu/gdb/git/gdb/frame.c:2241
 bminor#12 0x00000000006bc13c in unwind_to_current_frame (ui_out=<optimised out>, args=args@entry=0xda0b20) at /home/yao/SourceCode/gnu/gdb/git/gdb/frame.c:1485

The fix is to let read_stack and read_code throw NOT_AVAILABLE_ERROR too,
in order to align with read_memory.

gdb:

2016-05-04  Yao Qi  <yao.qi@linaro.org>

	PR gdb/19947
	* corefile.c (read_memory): Rename it to ...
	(read_memory_object): ... it.  Add parameter object.
	(read_memory): Call read_memory_object.
	(read_stack): Likewise.
	(read_code): Likewise.
pipcet pushed a commit that referenced this issue Jun 9, 2016
Nowadays, GDB can't insert breakpoint on the return address of the
exception handler on ARM M-profile, because the address is a magic
one 0xfffffff9,

 (gdb) bt
 #0  CT32B1_IRQHandler () at ../src/timer.c:67
 #1  <signal handler called>
 #2  main () at ../src/timer.c:127

(gdb) info frame
Stack level 0, frame at 0x200ffa8:
 pc = 0x4ec in CT32B1_IRQHandler (../src/timer.c:67); saved pc = 0xfffffff9
 called by frame at 0x200ffc8
 source language c.
 Arglist at 0x200ffa0, args:
 Locals at 0x200ffa0, Previous frame's sp is 0x200ffa8
 Saved registers:
  r7 at 0x200ffa0, lr at 0x200ffa4

(gdb) x/x 0xfffffff9
0xfffffff9:     Cannot access memory at address 0xfffffff9

(gdb) finish
Run till exit from #0  CT32B1_IRQHandler () at ../src/timer.c:67
Ed:15: Target error from Set break/watch: Et:96: Pseudo-address (0xFFFFFFxx) for EXC_RETURN is invalid (GDB error?)

Warning:
Cannot insert hardware breakpoint 0.
Could not insert hardware breakpoints:
You may have requested too many hardware breakpoints/watchpoints.

Command aborted.

even some debug probe can't set hardware breakpoint on the magic
address too,

(gdb) hbreak *0xfffffff9
Hardware assisted breakpoint 2 at 0xfffffff9
(gdb) c
Continuing.
Ed:15: Target error from Set break/watch: Et:96: Pseudo-address (0xFFFFFFxx) for EXC_RETURN is invalid (GDB error?)

Warning:
Cannot insert hardware breakpoint 2.
Could not insert hardware breakpoints:
You may have requested too many hardware breakpoints/watchpoints.

Command aborted.

The problem described above is quite similar to PR 8841, in which GDB
can't set breakpoint on signal trampoline, which is mapped to a read-only
page by kernel.  The rationale of this patch is to skip "unwritable"
frames when looking for caller frames in command "finish", and a new
gdbarch method code_of_frame_writable is added.  This patch fixes
the problem on ARM cortex-m target, but it can be used to fix
PR 8841 too.

gdb:

2016-05-10  Yao Qi  <yao.qi@arm.com>

	* arch-utils.c (default_code_of_frame_writable): New function.
	* arch-utils.h (default_code_of_frame_writable): Declare.
	* arm-tdep.c (arm_code_of_frame_writable): New function.
	(arm_gdbarch_init): Install gdbarch method
	code_of_frame_writable if the target is M-profile.
	* frame.c (skip_unwritable_frames): New function.
	* frame.h (skip_unwritable_frames): Declare.
	* gdbarch.sh (code_of_frame_writable): New.
	* gdbarch.c, gdbarch.h: Re-generated.
	* infcmd.c (finish_command): Call skip_unwritable_frames.
pipcet pushed a commit that referenced this issue Jun 9, 2016
As reported in PR 19998, after type ctrl-c, GDB hang there and does
not send interrupt.  It causes a fail in gdb.base/interrupt.exp.
All targets support remote fileio should be affected.

When we type ctrc-c, SIGINT is handled by remote_fileio_sig_set,
as shown below,

 #0  remote_fileio_sig_set (sigint_func=0x4495d0 <remote_fileio_ctrl_c_signal_handler(int)>) at /home/yao/SourceCode/gnu/gdb/git/gdb/remote-fileio.c:325
 #1  0x00000000004495de in remote_fileio_ctrl_c_signal_handler (signo=<optimised out>) at /home/yao/SourceCode/gnu/gdb/git/gdb/remote-fileio.c:349
 #2  <signal handler called>
 #3  0x00007ffff647ed83 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:81
 #4  0x00000000005530ce in interruptible_select (n=10, readfds=readfds@entry=0x7fffffffd730, writefds=writefds@entry=0x0, exceptfds=exceptfds@entry=0x0,
    timeout=timeout@entry=0x0) at /home/yao/SourceCode/gnu/gdb/git/gdb/event-top.c:1017
 #5  0x000000000061ab20 in stdio_file_read (file=<optimised out>, buf=0x12d02e0 "\n\022-\001", length_buf=16383)
    at /home/yao/SourceCode/gnu/gdb/git/gdb/ui-file.c:577
 #6  0x000000000044a4dc in remote_fileio_func_read (buf=0x12c0360 "") at /home/yao/SourceCode/gnu/gdb/git/gdb/remote-fileio.c:583
 #7  0x0000000000449598 in do_remote_fileio_request (uiout=<optimised out>, buf_arg=buf_arg@entry=0x12c0340)
    at /home/yao/SourceCode/gnu/gdb/git/gdb/remote-fileio.c:1179

we don't set quit_serial_event,

  do
    {
      res = gdb_select (n, readfds, writefds, exceptfds, timeout);
    }
  while (res == -1 && errno == EINTR);

  if (res == 1 && FD_ISSET (fd, readfds))
    {
      errno = EINTR;
      return -1;
    }
  return res;

we can't go out of the loop above, and that is why GDB can't send
interrupt.

Recently, we stop throwing exception from SIGINT handler
(remote_fileio_ctrl_c_signal_handler)
https://sourceware.org/ml/gdb-patches/2016-03/msg00372.html, which
is correct, because gdb_select is interruptible.  However, in the
same patch series, we add interruptible_select later as a wrapper
to gdb_select, https://sourceware.org/ml/gdb-patches/2016-03/msg00375.html
and it is not interruptible (because of the loop in it) unless
select/poll-able file descriptors are marked.

This fix in this patch is to call quit_serial_event_set, so that we can
go out of the loop above, return -1 and set errno to EINTR.

2016-06-01  Yao Qi  <yao.qi@linaro.org>

	PR remote/19998
	* remote-fileio.c (remote_fileio_ctrl_c_signal_handler): Call
	quit_serial_event_set.
pipcet pushed a commit that referenced this issue Jul 30, 2016
This change adds support for specifying a negative repeat count to
all the formats of the 'x' command to examine memory backward.
A new testcase 'examine-backward' is added to cover this new feature.

Here's the example output from the new feature:

<format 'i'>
(gdb) bt
#0  Func1 (n=42, p=0x40432e "hogehoge") at main.cpp:5
#1  0x00000000004041fa in main (argc=1, argv=0x7fffffffdff8) at main.cpp:19
(gdb) x/-4i 0x4041fa
  0x4041e5 <main(int, char**)+11>: mov   %rsi,-0x10(%rbp)
  0x4041e9 <main(int, char**)+15>: lea   0x13e(%rip),%rsi
  0x4041f0 <main(int, char**)+22>: mov   $0x2a,%edi
  0x4041f5 <main(int, char**)+27>: callq 0x404147

<format 'x'>
(gdb) x/-4xw 0x404200
0x4041f0 <main(int, char**)+22>: 0x00002abf 0xff4de800 0x76e8ffff 0xb8ffffff
(gdb) x/-4
0x4041e0 <main(int, char**)+6>:  0x7d8910ec 0x758948fc 0x358d48f0 0x0000013e

gdb/ChangeLog:

	* NEWS: Mention that GDB now supports a negative repeat count in
	the 'x' command.
	* printcmd.c (decode_format): Allow '-' in the parameter
	"string_ptr" to accept a negative repeat count.
	(find_instruction_backward): New function.
	(read_memory_backward): New function.
	(integer_is_zero): New function.
	(find_string_backward): New function.
	(do_examine): Use new functions to examine memory backward.
	(_initialize_printcmd): Mention that 'x' command supports a negative
	repeat count.

gdb/doc/ChangeLog:

	* gdb.texinfo (Examining Memory): Document negative repeat
	count in the 'x' command.

gdb/testsuite/ChangeLog:

	* gdb.base/examine-backward.c: New file.
	* gdb.base/examine-backward.exp: New file.
pipcet pushed a commit that referenced this issue Nov 14, 2016
… out value

With something like:

  struct A { int bitfield:4; } var;

If 'var' ends up wholly-optimized out, printing 'var.bitfield' crashes
gdb here:

 (top-gdb) bt
 #0  0x000000000058b89f in extract_unsigned_integer (addr=0x2 <error: Cannot access memory at address 0x2>, len=2, byte_order=BFD_ENDIAN_LITTLE)
     at /home/pedro/gdb/mygit/src/gdb/findvar.c:109
 #1  0x00000000005a187a in unpack_bits_as_long (field_type=0x16cff70, valaddr=0x0, bitpos=16, bitsize=12) at /home/pedro/gdb/mygit/src/gdb/value.c:3347
 #2  0x00000000005a1b9d in unpack_value_bitfield (dest_val=0x1b5d9d0, bitpos=16, bitsize=12, valaddr=0x0, embedded_offset=0, val=0x1b5d8d0)
     at /home/pedro/gdb/mygit/src/gdb/value.c:3441
 #3  0x00000000005a2a5f in value_fetch_lazy (val=0x1b5d9d0) at /home/pedro/gdb/mygit/src/gdb/value.c:3958
 #4  0x00000000005a10a7 in value_primitive_field (arg1=0x1b5d8d0, offset=0, fieldno=0, arg_type=0x16d04c0) at /home/pedro/gdb/mygit/src/gdb/value.c:3161
 #5  0x00000000005b01e5 in do_search_struct_field (name=0x1727c60 "bitfield", arg1=0x1b5d8d0, offset=0, type=0x16d04c0, looking_for_baseclass=0, result_ptr=0x7fffffffcaf8,
 [...]

unpack_value_bitfield is already optimized-out/unavailable -aware:

   (...) VALADDR points to the contents of VAL.  If the VAL's contents
   required to extract the bitfield from are unavailable/optimized
   out, DEST_VAL is correspondingly marked unavailable/optimized out.

however, it is not considering the case of the value having no
contents buffer at all, as can happen through
allocate_optimized_out_value.

gdb/ChangeLog:
2016-08-09  Pedro Alves  <palves@redhat.com>

	* value.c (unpack_value_bitfield): Skip unpacking if the parent
	has no contents buffer to begin with.

gdb/testsuite/ChangeLog:
2016-08-09  Pedro Alves  <palves@redhat.com>

	* gdb.dwarf2/bitfield-parent-optimized-out.exp: New file.
pipcet pushed a commit that referenced this issue Nov 14, 2016
I build GDB with -fsanitize=address, and see the error in tests,

(gdb) PASS: gdb.linespec/ls-errs.exp: lang=C++: break 3 foo
break -line 3 foo^M
=================================================================^M
==4401==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000047487 at pc 0x819d8e bp 0x7fff4e4e6bb0 sp 0x7fff4e4e6ba8^M
READ of size 1 at 0x603000047487 thread T0^[[1m^[[0m^M
    #0 0x819d8d in explicit_location_lex_one /home/yao/SourceCode/gnu/gdb/git/gdb/location.c:502^M
    #1 0x81a185 in string_to_explicit_location(char const**, language_defn const*, int) /home/yao/SourceCode/gnu/gdb/git/gdb/location.c:556^M
    #2 0x81ac10 in string_to_event_location(char**, language_defn const*) /home/yao/SourceCode/gnu/gdb/git/gdb/location.c:687^

the code in question is:

>         /* Special case: C++ operator,.  */
>         if (language->la_language == language_cplus
>             && strncmp (*inp, "operator", 8)  <--- [1]
>             && (*inp)[9] == ',')
>           (*inp) += 9;
>         ++(*inp);

The error is caused by the access to (*inp)[9] if 9 is out of its bounds.
However [1] looks odd to me, because if strncmp returns true (non-zero),
the following check "(*inp)[9] == ','" makes no sense any more.  I
suspect it was a typo in the code we meant to "strncmp () == 0".  Another
problem in the code above is that if *inp is "operator,", we first
increment *inp by 9, and then increment it by one again, which is wrong
to me.  We should only increment *inp by 8 to skip "operator", and go
back to the loop header to decide where we stop.

gdb:

2016-08-15  Yao Qi  <yao.qi@linaro.org>

	* location.c (explicit_location_lex_one): Compare the return
	value of strncmp with zero.  Don't check (*inp)[9].  Increment
	*inp by 8.
pipcet pushed a commit that referenced this issue Nov 14, 2016
If I build gdb with -fsanitize=address and run tests, I get error,

malformed linespec error: unexpected colon^M
(gdb) PASS: gdb.linespec/ls-errs.exp: lang=C: break     :
break   :=================================================================^M
==3266==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000051451 at pc 0x2b5797a972a8 bp 0x7fffd8e0f3c0 sp 0x7fffd8e0f398^M
READ of size 2 at 0x602000051451 thread T0
    #0 0x2b5797a972a7 in __interceptor_strlen (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x322a7)^M
    #1 0x7bd004 in compare_filenames_for_search(char const*, char const*) /home/yao/SourceCode/gnu/gdb/git/gdb/symtab.c:316^M
    #2 0x7bd310 in iterate_over_some_symtabs(char const*, char const*, int (*)(symtab*, void*), void*, compunit_symtab*, compunit_symtab*) /home/yao/SourceCode/gnu/gdb/git/gdb/symtab.c:411^M
    #3 0x7bd775 in iterate_over_symtabs(char const*, int (*)(symtab*, void*), void*) /home/yao/SourceCode/gnu/gdb/git/gdb/symtab.c:481^M
    #4 0x7bda15 in lookup_symtab(char const*) /home/yao/SourceCode/gnu/gdb/git/gdb/symtab.c:527^M
    #5 0x7d5e2a in make_file_symbol_completion_list_1 /home/yao/SourceCode/gnu/gdb/git/gdb/symtab.c:5635^M
    #6 0x7d61e1 in make_file_symbol_completion_list(char const*, char const*, char const*) /home/yao/SourceCode/gnu/gdb/git/gdb/symtab.c:5684^M
    #7 0x88dc06 in linespec_location_completer /home/yao/SourceCode/gnu/gdb/git/gdb/completer.c:288
....
0x602000051451 is located 0 bytes to the right of 1-byte region [0x602000051450,0x602000051451)^M
mallocated by thread T0 here:
    #0 0x2b5797ab97ef in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x547ef)^M
    #1 0xbbfb8d in xmalloc /home/yao/SourceCode/gnu/gdb/git/gdb/common/common-utils.c:43^M
    #2 0x88dabd in linespec_location_completer /home/yao/SourceCode/gnu/gdb/git/gdb/completer.c:273^M
    #3 0x88e5ef in location_completer(cmd_list_element*, char const*, char const*) /home/yao/SourceCode/gnu/gdb/git/gdb/completer.c:531^M
    #4 0x8902e7 in complete_line_internal /home/yao/SourceCode/gnu/gdb/git/gdb/completer.c:964^

The code in question is here

       file_to_match = (char *) xmalloc (colon - text + 1);
       strncpy (file_to_match, text, colon - text + 1);

it is likely that file_to_match is not null-terminated.  The patch is
to strncpy 'colon - text' bytes and explicitly set '\0'.

gdb:

2016-08-19  Yao Qi  <yao.qi@linaro.org>

	* completer.c (linespec_location_completer): Make file_to_match
	null-terminated.
pipcet pushed a commit that referenced this issue Nov 14, 2016
This patch fixes a problem that problem triggers if you start an
inferior, e.g., with the "start" command, in a UI created with the
new-ui command, and then run a foreground execution command in the
main UI.  Once the program stops for the latter command, typing in the
main UI no longer echoes back to the user.

The problem revolves around this:

- gdb_has_a_terminal computes its result lazily, on first call.

  that is what saves gdb's initial main UI terminal state (the UI
  associated with stdin):

          our_terminal_info.ttystate = serial_get_tty_state (stdin_serial);

  This is the state that target_terminal_ours() restores.

- In this scenario, the gdb_has_a_terminal function happens to be
  first ever called from within the target_terminal_init call in
  startup_inferior:

      (top-gdb) bt
      #0  gdb_has_a_terminal () at src/gdb/inflow.c:157
      #1  0x000000000079db22 in child_terminal_init_with_pgrp () at src/gdb/inflow.c:217
       [...]
      #4  0x000000000065bacb in target_terminal_init () at src/gdb/target.c:456
      #5  0x00000000004676d2 in startup_inferior () at src/gdb/fork-child.c:531
       [...]
      #7  0x000000000046b168 in linux_nat_create_inferior () at src/gdb/linux-nat.c:1112
       [...]
      bminor#9  0x00000000005f20c9 in start_command (args=0x0, from_tty=1) at src/gdb/infcmd.c:657

If the command to start the inferior is issued on the main UI, then
readline will have deprepped the terminal when we reach the above, and
the problem doesn't appear.

If however the command is issued on a non-main UI, then when we reach
that gdb_has_a_terminal call, the main UI's terminal state is still
set to whatever readline has sets it to in rl_prep_terminal, which
happens to have echo disabled.  Later, when the following synchronous
execution command finishes, we'll call target_terminal_ours to restore
gdb's the main UI's terminal settings, and that restores the terminal
state with echo disabled...

Conceptually, the fix is to move the gdb_has_a_terminal call earlier,
to someplace during GDB initialization, before readline/ncurses have
had a chance to change terminal settings.  Turns out that
"set_initial_gdb_ttystate" is exactly such a place.

I say conceptually, because the fix actually inlines the
gdb_has_a_terminal part that saves the terminal state in
set_initial_gdb_ttystate and then simplifies gdb_has_a_terminal, since
there's no point in making gdb_has_a_terminal do lazy computation.

gdb/ChangeLog:
2016-08-23  Pedro Alves  <palves@redhat.com>

	PR gdb/20494
	* inflow.c (our_terminal_info, initial_gdb_ttystate): Update
	comments.
	(enum gdb_has_a_terminal_flag_enum, gdb_has_a_terminal_flag):
	Delete.
	(set_initial_gdb_ttystate): Record our_terminal_info here too,
	instead of ...
	(gdb_has_a_terminal): ... here.  Reimplement in terms of
	initial_gdb_ttystate.  Make static.
	* terminal.h (gdb_has_a_terminal): Delete declaration.
	(set_initial_gdb_ttystate): Add comment.
	* top.c (show_interactive_mode): Use input_interactive_p instead
	of gdb_has_a_terminal.

gdb/testsuite/ChangeLog:
2016-08-23  Pedro Alves  <palves@redhat.com>

	PR gdb/20494
	* gdb.base/new-ui-echo.c: New file.
	* gdb.base/new-ui-echo.exp: New file.
pipcet pushed a commit that referenced this issue Nov 14, 2016
This test case verifies that GDB will not attempt to invoke a python
unwinder recursively.

At the moment, the behavior exhibited by GDB looks like this:

    (gdb) source py-recurse-unwind.py
    Python script imported
    (gdb) b ccc
    Breakpoint 1 at 0x4004bd: file py-recurse-unwind.c, line 23.
    (gdb) run
    Starting program: py-recurse-unwind
    TestUnwinder: Recursion detected - returning early.
    TestUnwinder: Recursion detected - returning early.
    TestUnwinder: Recursion detected - returning early.
    TestUnwinder: Recursion detected - returning early.

    Breakpoint 1, ccc (arg=<unavailable>) at py-recurse-unwind.c:23
    23      }
    (gdb) bt
    #-1 ccc (arg=<unavailable>) at py-recurse-unwind.c:23
    Backtrace stopped: previous frame identical to this frame (corrupt stack?)

[I've shortened pathnames for easier reading.]

The desired / expected behavior looks like this:

    (gdb) source py-recurse-unwind.py
    Python script imported
    (gdb) b ccc
    Breakpoint 1 at 0x4004bd: file py-recurse-unwind.c, line 23.
    (gdb) run
    Starting program: py-recurse-unwind

    Breakpoint 1, ccc (arg=789) at py-recurse-unwind.c:23
    23      }
    (gdb) bt
    #0  ccc (arg=789) at py-recurse-unwind.c:23
    #1  0x00000000004004d5 in bbb (arg=456) at py-recurse-unwind.c:28
    #2  0x00000000004004ed in aaa (arg=123) at py-recurse-unwind.c:34
    #3  0x00000000004004fe in main () at py-recurse-unwind.c:40

Note that GDB's problems go well beyond the fact that it invokes the
unwinder recursively.  In the process it messes up some internal state
(the frame stash) leading to display of (only) the sentinel frame in
the backtrace.

gdb/testsuite/ChangeLog:

	* gdb.python/py-recurse-unwind.c: New file.
	* gdb.python/py-recurse-unwind.py: New file.
	* gdb.python/py-recurse-unwind.exp: New file.
pipcet pushed a commit that referenced this issue Nov 14, 2016
…_eval

This fixes the problem exercised by Kevin's test at:

 https://sourceware.org/ml/gdb-patches/2016-08/msg00216.html

This was originally exposed by the OpenJDK Python-based unwinder.

If an unwinder attempts to call parse_and_eval from within its
sniffing method, GDB's unwinding machinery enters infinite recursion.
However, parse_and_eval is a pretty reasonable thing to call, because
Python/Scheme-based unwinders will often need to read globals out of
inferior memory.  The recursion happens because:

- get_current_frame() is called soon after the target stops.

- current_frame is NULL, and so we unwind it from the sentinel frame
  (which is special and has level == -1).

- We reach get_prev_frame_if_no_cycle, which does cycle detection
  based on frame id, and thus tries to compute the frame id of the new
  frame.

- Frame id computation requires an unwinder, so we go through all
  unwinder sniffers trying to see if one accepts the new frame (the
  current frame).

- the unwinder's sniffer calls parse_and_eval().

- parse_and_eval depends on the selected frame/block, and if not set
  yet, the selected frame is set to the current frame.

- get_current_frame () is called again.  current_frame is still NULL,
  so ...

- recurse forever.


In Kevin's test at:

 https://sourceware.org/ml/gdb-patches/2016-08/msg00216.html

gdb doesn't recurse forever simply because the Python unwinder
contains code to detect and stop the recursion itself.  However, GDB
goes downhill from here, e.g., by showing the sentinel frame as
current frame (note the -1):

    Breakpoint 1, ccc (arg=<unavailable>) at py-recurse-unwind.c:23
    23      }
    (gdb) bt
    #-1 ccc (arg=<unavailable>) at py-recurse-unwind.c:23
    Backtrace stopped: previous frame identical to this frame (corrupt stack?)

That "-1" frame level comes from this:

      if (catch_exceptions (current_uiout, unwind_to_current_frame,
			    sentinel_frame, RETURN_MASK_ERROR) != 0)
	{
	  /* Oops! Fake a current frame?  Is this useful?  It has a PC
             of zero, for instance.  */
	  current_frame = sentinel_frame;
	}

which is bogus.  It's never correct to set the current frame to the
sentinel frame.  The only reason this has survived so long is that
getting here normally indicates something wrong has already happened
before and we fix that.  And this case is no exception -- it doesn't
really matter how precisely we managed to get to that bogus code (it
has to do with the the stash), because anything after recursion
happens is going to be invalid.

So the fix is to avoid the recursion in the first place.

Observations:

 #1 - The recursion happens because we try to do cycle detection from
      within get_prev_frame_if_no_cycle.  That requires computing the
      frame id of the frame being unwound, and that itself requires
      calling into the unwinders.

 #2 - But, the first time we're unwinding from the sentinel frame,
      when we reach get_prev_frame_if_no_cycle, there's no frame chain
      at all yet:

      - current_frame is NULL.
      - the frame stash is empty.

Thus, there's really no need to do cycle detection the first time we
reach get_prev_frame_if_no_cycle, when building the current frame.

So we can break the recursion by making get_current_frame call a
simplified version of get_prev_frame_if_no_cycle that results in
setting the current_frame global _before_ computing the current
frame's id.

But, we can go a little bit further.  As there's really no reason
anymore to compute the current frame's frame id immediately, we can
defer computing it to when some caller of get_current_frame might need
it.  This was actually how the frame id was computed for all frames
before the stash-based cycle detection was added.  So in a way, this
patch reintroduces the lazy frame id computation, but unlike before,
only for the case of the current frame, which turns out to be special.

This lazyness, however, requires adjusting
gdb.python/py-unwind-maint.exp, because that assumes unwinders are
immediately called as side effect of some commands.  I didn't see a
need to preserve the behavior expected by that test (all it would take
is call get_frame_id inside get_current_frame), so I adjusted the
test.

gdb/ChangeLog:
2016-09-05  Pedro Alves  <palves@redhat.com>

	PR backtrace/19927
	* frame.c (get_frame_id): Compute the frame id if not computed
	yet.
	(unwind_to_current_frame): Delete.
	(get_current_frame): Use get_prev_frame_always_1 to get the
	current frame and assert that that always succeeds.
	(get_prev_frame_if_no_cycle): Skip cycle detection if returning
	the current frame.

gdb/testsuite/ChangeLog:
2016-09-05  Pedro Alves  <palves@redhat.com>

	PR backtrace/19927
	* gdb.python/py-unwind-maint.exp: Adjust tests to not expect that
	unwinders are immediately called as side effect of "source" or
	"disable unwinder" commands.
	* gdb.python/py-recurse-unwind.exp: Remove setup_kfail calls.
pipcet pushed a commit that referenced this issue Nov 14, 2016
aarch64_reg_parse_32_64 is currently used to parse address registers,
among other things.  It returns two bits of information about the
register: whether it's W rather than X, and whether it's a zero register.

SVE adds addressing modes in which the base or offset can be a vector
register instead of a scalar, so a choice between W and X is no longer
enough.  It's more convenient to pass the type of register around as
a qualifier instead.

As it happens, two callers of aarch64_reg_parse_32_64 already wanted
the information in the form of a qualifier, so the change feels pretty
natural even without SVE.

Also, the function took two parameters to control whether {W}SP
and (W|X)ZR should be accepted.  We tend to get slightly better
error messages by accepting them regardless and getting the caller
to do the check, rather than potentially treating "xzr", "sp" etc.
as constants.  This is easier to do if the function returns the
reg_entry rather than just the register number.

This does create a corner case where:

	.equ	sp, 1
	ldr	w0, [x0, sp]

was previously an acceptable way of writing "ldr w0, [x0, #1]",
but I don't think it's important to continue supporting that.
We already rejected things like:

	.equ	sp, 1
	add	x0, x1, sp

To ensure these new error messages "win" when matching against
several candidate instruction entries, we need to use the same
address-parsing code for all addresses, including ADDR_SIMPLE
and SIMD_ADDR_SIMPLE.  The next patch also relies on this.

Finally, aarcch64_check_reg_type was written in a pretty
conservative way.  It should always be equivalent to a single
bit test.

gas/
	* config/tc-aarch64.c (REG_TYPE_R_Z, REG_TYPE_R_SP): New register
	types.
	(get_reg_expected_msg): Handle them and REG_TYPE_R64_SP.
	(aarch64_check_reg_type): Simplify.
	(aarch64_reg_parse_32_64): Return the reg_entry instead of the
	register number.  Return the type as a qualifier rather than an
	"isreg32" boolean.  Remove reject_sp, reject_rz and isregzero
	parameters.
	(parse_shifter_operand): Update call to aarch64_parse_32_64_reg.
	Use get_reg_expected_msg.
	(parse_address_main): Likewise.  Use aarch64_check_reg_type.
	(po_int_reg_or_fail): Replace reject_sp and reject_rz parameters
	with a reg_type parameter.  Update call to aarch64_parse_32_64_reg.
	Use aarch64_check_reg_type to test the result.
	(parse_operands): Update after the above changes.  Parse ADDR_SIMPLE
	addresses normally before enforcing the syntax restrictions.
	* testsuite/gas/aarch64/diagnostic.s: Add tests for a post-index
	zero register and for a stack pointer index.
	* testsuite/gas/aarch64/diagnostic.l: Update accordingly.
	Also update existing diagnostic messages after the above changes.
	* testsuite/gas/aarch64/illegal-lse.l: Update the error message
	for 32-bit register bases.
pipcet pushed a commit that referenced this issue Nov 14, 2016
In the review of the original version of this series, Richard didn't
like the use of boolean parameters to parse_address_main.  I think we
can just get rid of them and leave the callers to check the addressing
modes.  As it happens, the handling of ADDR_SIMM9{,_2} already did this
for relocation operators (i.e. it used parse_address_reloc and then
rejected relocations).

The callers are already set up to reject invalid register post-indexed
addressing, so we can simply remove the accept_reg_post_index parameter
without adding any more checks.  This again creates a corner case where:

	.equ	x2, 1
	ldr	w0, [x1], x2

was previously an acceptable way of writing "ldr w0, [x1], #1" but
is now rejected.

Removing the "reloc" parameter means that two cases need to check
explicitly for relocation operators.

ADDR_SIMM9_2 appers to be unused.  I'll send a separate patch
to remove it.

This patch makes parse_address temporarily equivalent to
parse_address_main, but later patches in the series will need
to keep the distinction.

gas/
	* config/tc-aarch64.c (parse_address_main): Remove reloc and
	accept_reg_post_index parameters.  Parse relocations and register
	post indexes unconditionally.
	(parse_address): Remove accept_reg_post_index parameter.
	Update call to parse_address_main.
	(parse_address_reloc): Delete.
	(parse_operands): Call parse_address instead of parse_address_main.
	Update existing callers of parse_address and make them check
	inst.reloc.type where appropriate.
	* testsuite/gas/aarch64/diagnostic.s: Add tests for relocations
	in ADDR_SIMPLE, SIMD_ADDR_SIMPLE, ADDR_SIMM7 and ADDR_SIMM9 addresses.
	Also test for invalid uses of post-index register addressing.
	* testsuite/gas/aarch64/diagnostic.l: Update accordingly.
pipcet pushed a commit that referenced this issue Nov 14, 2016
This patch adds the new SVE integer immediate operands.  There are
three kinds:

- simple signed and unsigned ranges, but with new widths and positions.

- 13-bit logical immediates.  These have the same form as in base AArch64,
  but at a different bit position.

  In the case of the "MOV Zn.<T>, #<limm>" alias of DUPM, the logical
  immediate <limm> is not allowed to be a valid DUP immediate, since DUP
  is preferred over DUPM for constants that both instructions can handle.

- a new 9-bit arithmetic immediate, of the form "<imm8>{, LSL bminor#8}".
  In some contexts the operand is signed and in others it's unsigned.
  As an extension, we allow shifted immediates to be written as a single
  integer, e.g. "#256" is equivalent to "#1, LSL bminor#8".  We also use the
  shiftless form as the preferred disassembly, except for the special
  case of "#0, LSL bminor#8" (a redundant encoding of 0).

include/
	* opcode/aarch64.h (AARCH64_OPND_SIMM5): New aarch64_opnd.
	(AARCH64_OPND_SVE_AIMM, AARCH64_OPND_SVE_ASIMM)
	(AARCH64_OPND_SVE_INV_LIMM, AARCH64_OPND_SVE_LIMM)
	(AARCH64_OPND_SVE_LIMM_MOV, AARCH64_OPND_SVE_SHLIMM_PRED)
	(AARCH64_OPND_SVE_SHLIMM_UNPRED, AARCH64_OPND_SVE_SHRIMM_PRED)
	(AARCH64_OPND_SVE_SHRIMM_UNPRED, AARCH64_OPND_SVE_SIMM5)
	(AARCH64_OPND_SVE_SIMM5B, AARCH64_OPND_SVE_SIMM6)
	(AARCH64_OPND_SVE_SIMM8, AARCH64_OPND_SVE_UIMM3)
	(AARCH64_OPND_SVE_UIMM7, AARCH64_OPND_SVE_UIMM8)
	(AARCH64_OPND_SVE_UIMM8_53): Likewise.
	(aarch64_sve_dupm_mov_immediate_p): Declare.

opcodes/
	* aarch64-tbl.h (AARCH64_OPERANDS): Add entries for the new SVE
	integer immediate operands.
	* aarch64-opc.h (FLD_SVE_immN, FLD_SVE_imm3, FLD_SVE_imm5)
	(FLD_SVE_imm5b, FLD_SVE_imm7, FLD_SVE_imm8, FLD_SVE_imm9)
	(FLD_SVE_immr, FLD_SVE_imms, FLD_SVE_tszh): New aarch64_field_kinds.
	* aarch64-opc.c (fields): Add corresponding entries.
	(operand_general_constraint_met_p): Handle the new SVE integer
	immediate operands.
	(aarch64_print_operand): Likewise.
	(aarch64_sve_dupm_mov_immediate_p): New function.
	* aarch64-opc-2.c: Regenerate.
	* aarch64-asm.h (ins_inv_limm, ins_sve_aimm, ins_sve_asimm)
	(ins_sve_limm_mov, ins_sve_shlimm, ins_sve_shrimm): New inserters.
	* aarch64-asm.c (aarch64_ins_limm_1): New function, split out from...
	(aarch64_ins_limm): ...here.
	(aarch64_ins_inv_limm): New function.
	(aarch64_ins_sve_aimm): Likewise.
	(aarch64_ins_sve_asimm): Likewise.
	(aarch64_ins_sve_limm_mov): Likewise.
	(aarch64_ins_sve_shlimm): Likewise.
	(aarch64_ins_sve_shrimm): Likewise.
	* aarch64-asm-2.c: Regenerate.
	* aarch64-dis.h (ext_inv_limm, ext_sve_aimm, ext_sve_asimm)
	(ext_sve_limm_mov, ext_sve_shlimm, ext_sve_shrimm): New extractors.
	* aarch64-dis.c (decode_limm): New function, split out from...
	(aarch64_ext_limm): ...here.
	(aarch64_ext_inv_limm): New function.
	(decode_sve_aimm): Likewise.
	(aarch64_ext_sve_aimm): Likewise.
	(aarch64_ext_sve_asimm): Likewise.
	(aarch64_ext_sve_limm_mov): Likewise.
	(aarch64_top_bit): Likewise.
	(aarch64_ext_sve_shlimm): Likewise.
	(aarch64_ext_sve_shrimm): Likewise.
	* aarch64-dis-2.c: Regenerate.

gas/
	* config/tc-aarch64.c (parse_operands): Handle the new SVE integer
	immediate operands.
pipcet pushed a commit that referenced this issue Nov 14, 2016
If xmalloc fails allocating memory, usually because something tried a
huge allocation, like xmalloc(-1) or some such, GDB asks the user what
to do:

  .../src/gdb/utils.c:1079: internal-error: virtual memory exhausted.
  A problem internal to GDB has been detected,
  further debugging may prove unreliable.
  Quit this debugging session? (y or n)

If the user says "n", that throws a QUIT exception, which is caught by
one of the multiple CATCH(RETURN_MASK_ALL) blocks somewhere up the
stack.

The default implementations of operator new / operator new[] call
malloc directly, and on memory allocation failure throw
std::bad_alloc.  Currently, if that happens, since nothing catches it,
the exception escapes out of main, and GDB aborts from unhandled
exception.

This patch replaces the default operator new variants with versions
that, just like xmalloc:

 #1 - Raise an internal-error on memory allocation failure.

 #2 - Throw a QUIT gdb_exception, so that the exact same CATCH blocks
      continue handling memory allocation problems.

A minor complication of #2 is that operator new can _only_ throw
std::bad_alloc, or something that extends it:

  void* operator new (std::size_t size) throw (std::bad_alloc);

That means that if we let a gdb QUIT exception escape from within
operator new, the C++ runtime aborts due to unexpected exception
thrown.

So to bridge the gap, this patch adds a new gdb_quit_bad_alloc
exception type that inherits both std::bad_alloc and gdb_exception,
and throws _that_.

If we decide that we should be catching memory allocation errors in
fewer places than all the places we currently catch them (everywhere
we use RETURN_MASK_ALL currently), then we could change operator new
to throw plain std::bad_alloc then.  But I'm considering such a change
as separate matter from this one -- it'd make sense to do the same to
xmalloc at the same time, for instance.

Meanwhile, this allows using new/new[] instead of xmalloc/XNEW/etc.
without losing the "virtual memory exhausted" internal-error
safeguard.

Tested on x86_64 Fedora 23.

gdb/ChangeLog:
2016-09-23  Pedro Alves  <palves@redhat.com>

	* Makefile.in (SFILES): Add common/new-op.c.
	(COMMON_OBS): Add common/new-op.o.
	(new-op.o): New rule.
	* common/common-exceptions.h: Include <new>.
	(struct gdb_quit_bad_alloc): New type.
	* common/new-op.c: New file.

gdb/gdbserver/ChangeLog:
2016-09-23  Pedro Alves  <palves@redhat.com>

	* Makefile.in (SFILES): Add common/new-op.c.
	(OBS): Add common/new-op.o.
	(new-op.o): New rule.
pipcet pushed a commit that referenced this issue Nov 14, 2016
With this patch, when an inferior, thread or frame is explicitly
selected by the user, notifications will appear on all CLI and MI UIs.
When a GDB console is integrated in a front-end, this allows the
front-end to follow a selection made by the user ont he CLI, and it
informs the user about selection changes made behind the scenes by the
front-end.

This patch addresses PR gdb/20487.

In order to communicate frame changes to the front-end, this patch adds
a new field to the =thread-selected event for the selected frame.  The
idea is that since inferior/thread/frame can be seen as a composition,
it makes sense to send them together in the same event.  The vision
would be to eventually send the inferior information as well, if we find
that it's needed, although the "=thread-selected" event would be
ill-named for that job.

Front-ends need to handle this new field if they want to follow the
frame selection changes that originate from the console.  The format of
the frame attribute is the same as what is found in the *stopped events.

Here's a detailed example for each command and the events they generate:

thread
------

1. CLI command:

     thread 1.3

   MI event:

     =thread-selected,id="3",frame={...}

2. MI command:

     -thread-select 3

   CLI event:

     [Switching to thread 1.3 ...]

3. MI command (CLI-in-MI):

     thread 1.3

   MI event/reply:

     &"thread 1.3\n"
     ~"#0  child_sub_function () ...
     =thread-selected,id="3",frame={level="0",...}
     ^done

frame
-----

1. CLI command:

     frame 1

   MI event:

     =thread-selected,id="3",frame={level="1",...}

2. MI command:

     -stack-select-frame 1

   CLI event:

     #1  0x00000000004007f0 in child_function...

3. MI command (CLI-in-MI):

     frame 1

   MI event/reply:

     &"frame 1\n"
     ~"#1  0x00000000004007f9 in ..."
     =thread-selected,id="3",frame={level="1"...}
     ^done

inferior
--------

Inferior selection events only go from the console to MI, since there's
no way to select the inferior in pure MI.

1. CLI command:

     inferior 2

   MI event:

     =thread-selected,id="3"

Note that if the user selects an inferior that is not started or exited,
the MI doesn't receive a notification.  Since there is no threads to
select, the =thread-selected event does not apply...

2. MI command (CLI-in-MI):

     inferior 2

   MI event/reply:

     &"inferior 2\n"
     ~"[Switching to inferior 2 ...]"
     =thread-selected,id="4",frame={level="0"...}
     ^done

Internal implementation detail: this patch makes it possible to suppress
notifications caused by a CLI command, like what is done in mi-interp.c.
This means that it's now possible to use the
add_com_suppress_notification function to register a command with some
event suppressed.  It is used to implement the select-frame command in
this patch.

The function command_notifies_uscc_observer was added to extract
the rather complicated logical expression from the if statement.  It is
also now clearer what that logic does: if the command used by the user
already notifies the user_selected_context_changed observer, there is
not need to notify it again.  It therefore protects again emitting the
event twice.

No regressions, tested on ubuntu 14.04 x86 with target boards unix and
native-extended-gdbserver.

gdb/ChangeLog:

YYYY-MM-DD  Antoine Tremblay  <antoine.tremblay@ericsson.com>
YYYY-MM-DD  Simon Marchi  <simon.marchi@ericsson.com>

	PR gdb/20487
	* NEWS: Mention new frame field of =thread-selected event.
	* cli/cli-decode.c (add_cmd): Initialize c->suppress_notification.
	(add_com_suppress_notification): New function definition.
	(cmd_func): Set and restore the suppress_notification flag.
	* cli/cli-deicode.h (struct cmd_list_element)
	<suppress_notification>: New field.
	* cli/cli-interp.c (cli_suppress_notification): New global variable.
	(cli_on_user_selected_context_changed): New function.
	(_initialize_cli_interp): Attach to user_selected_context_changed
	observer.
	* command.h (struct cli_suppress_notification): New structure.
	(cli_suppress_notification): New global variable declaration.
	(add_com_suppress_notification): New function declaration.
	* defs.h (enum user_selected_what_flag): New enum.
	(user_selected_what): New enum flag type.
	* frame.h (print_stack_frame_to_uiout): New function declaration.
	* gdbthread.h (print_selected_thread_frame): New function declaration.
	* inferior.c (print_selected_inferior): New function definition.
	(inferior_command): Remove printing of inferior/thread/frame switch
	notifications, notify user_selected_context_changed observer.
	* inferior.h (print_selected_inferior): New function declaration.
	* mi/mi-cmds.c (struct mi_cmd): Add user_selected_context
	suppression to stack-select-frame and thread-select commands.
	* mi/mi-interp.c (struct mi_suppress_notification)
	<user_selected_context>: Initialize.
	(mi_user_selected_context_changed): New function definition.
	(_initialize_mi_interp): Attach to user_selected_context_changed.
	* mi/mi-main.c (mi_cmd_thread_select): Print thread selection reply.
	(mi_execute_command): Handle notification suppression.  Notify
	user_selected_context_changed observer on thread change instead of printing
	event directly.  Don't send it if command already sends the notification.
	(command_notifies_uscc_observer): New function.
	(mi_cmd_execute): Don't handle notification suppression.
	* mi/mi-main.h (struct mi_suppress_notification)
	<user_selected_context>: New field.
	* stack.c (print_stack_frame_to_uiout): New function definition.
	(select_frame_command): Notify user_selected_context_changed
	observer.
	(frame_command): Call print_selected_thread_frame if there's no frame
	change or notify user_selected_context_changed observer if there is.
	(up_command): Notify user_selected_context_changed observer.
	(down_command): Likewise.
	(_initialize_stack): Suppress user_selected_context notification for
	command select-frame.
	* thread.c (thread_command): Notify
	user_selected_context_changed if the thread has changed, print
	thread info directly if it hasn't.
	(do_captured_thread_select): Do not print thread switch event.
	(print_selected_thread_frame): New function definition.
	* tui/tui-interp.c (tui_on_user_selected_context_changed):
	New function definition.
	(_initialize_tui_interp): Attach to user_selected_context_changed
	observer.

gdb/doc/ChangeLog:

	PR gdb/20487
	* gdb.texinfo (Context management): Update mention of frame
	change notifications.
	(gdb/mi Async Records): Document frame field in
	=thread-select event.
	* observer.texi (GDB Observers): New user_selected_context_changed
	observer.

gdb/testsuite/ChangeLog:

	PR gdb/20487
	* gdb.mi/mi-pthreads.exp (check_mi_thread_command_set): Adapt
	=thread-select-event check.
pipcet pushed a commit that referenced this issue Nov 14, 2016
Even though this was supposedly in the gdb 7.2 timeframe, the testcase
in PR11094 crashes current GDB with a segfault:

  Program received signal SIGSEGV, Segmentation fault.
  0x00000000005ee894 in event_location_to_string (location=0x0) at
  src/gdb/location.c:412
  412       if (EL_STRING (location) == NULL)
  (top-gdb) bt
  #0  0x00000000005ee894 in event_location_to_string (location=0x0) at
  src/gdb/location.c:412
  #1  0x000000000057411a in print_breakpoint_location (b=0x18288e0, loc=0x0) at
  src/gdb/breakpoint.c:6201
  #2  0x000000000057483f in print_one_breakpoint_location (b=0x18288e0,
  loc=0x182cf10, loc_number=0, last_loc=0x7fffffffd258, allflag=1)
      at src/gdb/breakpoint.c:6473
  #3  0x00000000005751e1 in print_one_breakpoint (b=0x18288e0,
  last_loc=0x7fffffffd258, allflag=1) at
  src/gdb/breakpoint.c:6707
  #4  0x000000000057589c in breakpoint_1 (args=0x0, allflag=1, filter=0x0) at
  src/gdb/breakpoint.c:6947
  #5  0x0000000000575aa8 in maintenance_info_breakpoints (args=0x0, from_tty=0)
  at src/gdb/breakpoint.c:7026
  [...]

This is GDB trying to print the location spec of the JIT event
breakpoint, but that's an internal breakpoint without one.

If I add a NULL check, then we see that the JIT breakpoint is now
pending (because its location has shlib_disabled set):

  (gdb) maint info breakpoints
  Num     Type           Disp Enb Address            What
  [...]
  -8      jit events     keep y   <PENDING>           inf 1
  [...]

But that's incorrect.  GDB should have managed to recreate the JIT
breakpoint's location for the second run.  So the problem is
elsewhere.

The problem is that if the JIT loads at the same address on the second
run, we never recreate the JIT breakpoint, because we hit this early
return:

  static int
  jit_breakpoint_re_set_internal (struct gdbarch *gdbarch,
				  struct jit_program_space_data *ps_data)
  {
    [...]
    if (ps_data->cached_code_address == addr)
      return 0;

    [...]
      delete_breakpoint (ps_data->jit_breakpoint);
    [...]
    ps_data->jit_breakpoint = create_jit_event_breakpoint (gdbarch, addr);

Fix this by deleting the breakpoint and discarding the cached code
address when the objfile where the previous JIT breakpoint was found
is deleted/unloaded in the first place.

The test that was originally added for PR11094 doesn't trip on this
because:

  #1 - It doesn't test the case of the JIT descriptor's address _not_
       changing between reruns.

  #2 - And then it doesn't do "maint info breakpoints", or really
       anything with the JIT at all.

  #3 - and even then, to trigger the problem the JIT descriptor needs
       to be in a separate library, while the current test puts it in
       the main program.

The patch extends the test to cover all combinations of these
scenarios.

gdb/ChangeLog:
2016-10-06  Pedro Alves  <palves@redhat.com>

	* jit.c (free_objfile_data): Delete the JIT breakpoint and clear
	the cached code address.

gdb/testsuite/ChangeLog:
2016-10-06  Pedro Alves  <palves@redhat.com>

	* gdb.base/jit-simple-dl.c: New file.
	* gdb.base/jit-simple-jit.c: New file, factored out from ...
	* gdb.base/jit-simple.c: ... this.
	* gdb.base/jit-simple.exp (jit_run): Delete.
	(build_jit): New proc.
	(jit_test_reread): Recompile either the main program or the shared
	library, depending on what is being tested.  Skip changing address
	if caller wants to.  Compare before/after addresses.  If testing
	standalone, explicitly load the binary.  Test "maint info
	breakpoints".
	(top level): Add "standalone vs shared lib" and "change address"
	vs "same address" axes.
pipcet pushed a commit that referenced this issue Nov 14, 2016
Nowadays, if we build GDB with -fsanitize=address, we can get the asan
error below,

(gdb) quit
=================================================================
==9723==ERROR: AddressSanitizer: alloc-dealloc-mismatch (malloc vs operator delete) on 0x60200003bf70
    #0 0x7f88f3837527 in operator delete(void*) (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x55527)
    #1 0xac8e13 in __gnu_cxx::new_allocator<void (*)()>::deallocate(void (**)(), unsigned long) /usr/include/c++/4.9/ext/new_allocator.h:110
    #2 0xac8cc2 in __gnu_cxx::__alloc_traits<std::allocator<void (*)()> >::deallocate(std::allocator<void (*)()>&, void (**)(), unsigned long) /usr/include/c++/4.9/ext/alloc_traits.h:185
....
0x60200003bf70 is located 0 bytes inside of 8-byte region [0x60200003bf70,0x60200003bf78)
allocated by thread T0 here:
    #0 0x7f88f38367ef in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x547ef)
    #1 0xbd2762 in operator new(unsigned long) /home/yao/SourceCode/gnu/gdb/git/gdb/common/new-op.c:42
    #2 0xac8edc in __gnu_cxx::new_allocator<void (*)()>::allocate(unsigned long, void const*) /usr/include/c++/4.9/ext/new_allocator.h:104
    #3 0xac8d81 in __gnu_cxx::__alloc_traits<std::allocator<void (*)()> >::allocate(std::allocator<void (*)()>&, unsigned long) /usr/include/c++/4.9/ext/alloc_traits.h:182

The reason for this is that we override operator new but don't override
operator delete.  This patch does the override if the code is NOT
compiled with asan.

gdb:

2016-10-25  Yao Qi  <yao.qi@linaro.org>

	PR gdb/20716
	* common/new-op.c (__has_feature): New macro.
	Don't override operator new if asan is used.
pipcet pushed a commit that referenced this issue Nov 14, 2016
Currently GDB never sends more than one action per vCont packet, when
connected in non-stop mode.  A follow up patch will change that, and
it exposed a gdbserver problem with the vCont handling.

For example, this in non-stop mode:

  => vCont;s:p1.1;c
  <= OK

Should be equivalent to:

  => vCont;s:p1.1
  <= OK
  => vCont;c
  <= OK

But gdbserver currently doesn't handle this.  In the latter case,
"vCont;c" makes gdbserver clobber the previous step request.  This
patch fixes that.

Note the server side must ignore resume actions for the thread that
has a pending %Stopped notification (and any other threads with events
pending), until GDB acks the notification with vStopped.  Otherwise,
e.g., the following case is mishandled:

 #1 => g  (or any other packet)
 #2 <= [registers]
 #3 <= %Stopped T05 thread:p1.2
 #4 => vCont s:p1.1;c
 #5 <= OK

Above, the server must not resume thread p1.2 when it processes the
vCont.  GDB can't know that p1.2 stopped until it acks the %Stopped
notification.  (Otherwise it wouldn't send a default "c" action.)

(The vCont documentation already specifies this.)

Finally, special care must also be given to handling fork/vfork
events.  A (v)fork event actually tells us that two processes stopped
-- the parent and the child.  Until we follow the fork, we must not
resume the child.  Therefore, if we have a pending fork follow, we
must not send a global wildcard resume action (vCont;c).  We can still
send process-wide wildcards though.

(The comments above will be added as code comments to gdb in a follow
up patch.)

gdb/gdbserver/ChangeLog:
2016-10-26  Pedro Alves  <palves@redhat.com>

	* linux-low.c (handle_extended_wait): Link parent/child fork
	threads.
	(linux_wait_1): Unlink them.
	(linux_set_resume_request): Ignore resume requests for
	already-resumed and unhandled fork child threads.
	* linux-low.h (struct lwp_info) <fork_relative>: New field.
	* server.c (in_queued_stop_replies_ptid, in_queued_stop_replies):
	New functions.
	(handle_v_requests) <vCont>: Don't call require_running.
	* server.h (in_queued_stop_replies): New declaration.
pipcet pushed a commit that referenced this issue Nov 14, 2016
Most of the time, the trace should be in one piece.  This case is handled fine
by GDB.  In some cases, however, there may be gaps in the trace.  They result
from trace decode errors or from overflows.

A gap in the trace means we lost an unknown amount of trace.  Gaps can be very
small, such as a few instructions in the same function, or they can be rather
big.  We may, for example, lose a few function calls or returns.  The trace may
continue in a different function and we likely don't know how we got there.

Even though we can't say how the program executed across a gap, higher levels
may not be impacted too much by it.  Let's assume we have functions a-e and a
trace that looks roughly like this:

  a
   \
    b                    b
     \                  /
      c   <gap>        c
                      /
                 d   d
                  \ /
                   e

Even though we can't say for sure, it is likely that b and c are the same
function instance before and after the gap.  This patch is trying to connect
the c and b function segments across the gap.

This will add a to the back trace of b on the right hand side.  The changes are
reflected in GDB's internal representation of the trace and will improve:

  - the output of "record function-call-history /c"
  - the output of "backtrace" in replay mode
  - source stepping in replay mode
    will be improved indirectly via the improved back trace

I don't have an automated test for this patch; decode errors will be fixed and
overflows occur sporadically and are quite rare.  I tested it by hacking GDB to
provoke a decode error and on the expected gap in the gdb.btrace/dlopen.exp
test.

The issue is that we can't predict where we will be able to re-sync in case of
errors.  For the expected decode error in gdb.btrace/dlopen.exp, for example, we
may be able to re-sync somewhere in dlclose, in test, in main, or not at all.

Here's one example run of gdb.btrace/dlopen.exp with and without this patch.

    (gdb) info record
    Active record target: record-btrace
    Recording format: Intel Processor Trace.
    Buffer size: 16kB.
    warning: Non-contiguous trace at instruction 66608 (offset = 0xa83, pc = 0xb7fdcc31).
    warning: Non-contiguous trace at instruction 66652 (offset = 0xa9b, pc = 0xb7fdcc31).
    warning: Non-contiguous trace at instruction 66770 (offset = 0xacb, pc = 0xb7fdcc31).
    warning: Non-contiguous trace at instruction 66966 (offset = 0xb60, pc = 0xb7ff5ee4).
    warning: Non-contiguous trace at instruction 66994 (offset = 0xb74, pc = 0xb7ff5f24).
    warning: Non-contiguous trace at instruction 67334 (offset = 0xbac, pc = 0xb7ff5e6d).
    warning: Non-contiguous trace at instruction 69022 (offset = 0xc04, pc = 0xb7ff60b3).
    warning: Non-contiguous trace at instruction 69116 (offset = 0xc1c, pc = 0xb7ff60b3).
    warning: Non-contiguous trace at instruction 69504 (offset = 0xc74, pc = 0xb7ff605d).
    warning: Non-contiguous trace at instruction 83648 (offset = 0xecc, pc = 0xb7ff6134).
    warning: Decode error (-13) at instruction 83876 (offset = 0xf48, pc = 0xb7fd6380): no memory mapped at this address.
    warning: Non-contiguous trace at instruction 83876 (offset = 0x11b7, pc = 0xb7ff1c70).
    Recorded 83948 instructions in 912 functions (12 gaps) for thread 1 (process 12996).
    (gdb) record instruction-history 83876, +2
    83876   => 0xb7fec46f <call_init.part.0+95>:    call   *%eax
    [decode error (-13): no memory mapped at this address]
    [disabled]
    83877      0xb7ff1c70 <_dl_close_worker.part.0+1584>:   nop

Without the patch, the trace is disconnected and the backtrace is short:

    (gdb) record goto 83876
    #0  0xb7fec46f in call_init.part () from /lib/ld-linux.so.2
    (gdb) backtrace
    #0  0xb7fec46f in call_init.part () from /lib/ld-linux.so.2
    #1  0xb7fec5d0 in _dl_init () from /lib/ld-linux.so.2
    #2  0xb7ff0fe3 in dl_open_worker () from /lib/ld-linux.so.2
    Backtrace stopped: not enough registers or memory available to unwind further
    (gdb) record goto 83877
    #0  0xb7ff1c70 in _dl_close_worker.part.0 () from /lib/ld-linux.so.2
    (gdb) backtrace
    #0  0xb7ff1c70 in _dl_close_worker.part.0 () from /lib/ld-linux.so.2
    #1  0xb7ff287a in _dl_close () from /lib/ld-linux.so.2
    #2  0xb7fc3d5d in dlclose_doit () from /lib/libdl.so.2
    #3  0xb7fec354 in _dl_catch_error () from /lib/ld-linux.so.2
    #4  0xb7fc43dd in _dlerror_run () from /lib/libdl.so.2
    #5  0xb7fc3d98 in dlclose () from /lib/libdl.so.2
    #6  0x0804860a in test ()
    #7  0x08048628 in main ()

With the patch, GDB is able to connect the trace pieces and we get a full
backtrace.

    (gdb) record goto 83876
    #0  0xb7fec46f in call_init.part () from /lib/ld-linux.so.2
    (gdb) backtrace
    #0  0xb7fec46f in call_init.part () from /lib/ld-linux.so.2
    #1  0xb7fec5d0 in _dl_init () from /lib/ld-linux.so.2
    #2  0xb7ff0fe3 in dl_open_worker () from /lib/ld-linux.so.2
    #3  0xb7fec354 in _dl_catch_error () from /lib/ld-linux.so.2
    #4  0xb7ff02e2 in _dl_open () from /lib/ld-linux.so.2
    #5  0xb7fc3c65 in dlopen_doit () from /lib/libdl.so.2
    #6  0xb7fec354 in _dl_catch_error () from /lib/ld-linux.so.2
    #7  0xb7fc43dd in _dlerror_run () from /lib/libdl.so.2
    bminor#8  0xb7fc3d0e in dlopen@@GLIBC_2.1 () from /lib/libdl.so.2
    bminor#9  0xb7ff28ee in _dl_runtime_resolve () from /lib/ld-linux.so.2
    bminor#10 0x0804841c in ?? ()
    bminor#11 0x08048470 in dlopen@plt ()
    bminor#12 0x080485a3 in test ()
    #13 0x08048628 in main ()
    (gdb) record goto 83877
    #0  0xb7ff1c70 in _dl_close_worker.part.0 () from /lib/ld-linux.so.2
    (gdb) backtrace
    #0  0xb7ff1c70 in _dl_close_worker.part.0 () from /lib/ld-linux.so.2
    #1  0xb7ff287a in _dl_close () from /lib/ld-linux.so.2
    #2  0xb7fc3d5d in dlclose_doit () from /lib/libdl.so.2
    #3  0xb7fec354 in _dl_catch_error () from /lib/ld-linux.so.2
    #4  0xb7fc43dd in _dlerror_run () from /lib/libdl.so.2
    #5  0xb7fc3d98 in dlclose () from /lib/libdl.so.2
    #6  0x0804860a in test ()
    #7  0x08048628 in main ()

It worked nicely in this case but it may, of course, also lead to weird
connections; it is a heuristic, after all.

It works best when the gap is small and the trace pieces are long.

gdb/
	* btrace.c (bfun_s): New typedef.
	(ftrace_update_caller): Print caller in debug dump.
	(ftrace_get_caller, ftrace_match_backtrace, ftrace_fixup_level)
	(ftrace_compute_global_level_offset, ftrace_connect_bfun)
	(ftrace_connect_backtrace, ftrace_bridge_gap, btrace_bridge_gaps): New.
	(btrace_compute_ftrace_bts): Pass vector of gaps.  Collect gaps.
	(btrace_compute_ftrace_pt): Likewise.
	(btrace_compute_ftrace): Split into this, ...
	(btrace_compute_ftrace_1): ... this, and ...
	(btrace_finalize_ftrace): ... this.  Call btrace_bridge_gaps.
pipcet pushed a commit that referenced this issue Dec 1, 2016
… frame

This patch ensures that the frame id for the current frame is stashed
before that of the previous frame (to the current frame).

First, it should be noted that the frame id for the current frame is
not stashed by get_current_frame().  The current frame's frame id is
lazily computed and stashed via calls to get_frame_id().  However,
it's possible for get_prev_frame() to be called without first stashing
the current frame.

The frame stash is used not only to speed up frame lookups, but
also to detect cycles.  When attempting to compute the frame id
for a "previous" frame (in get_prev_frame_if_no_cycle), a cycle
is detected if the computed frame id is already in the stash.

If it should happen that a previous frame id is stashed which should
represent a cycle for the current frame, then an assertion failure
will trigger should get_frame_id() be later called to determine
the frame id for the current frame.

As of late 2016, with the "Tweak meaning of VALUE_FRAME_ID" patch in
place, this actually occurs when running the
gdb.dwarf2/dw2-dup-frame.exp test.  While attempting to generate a
backtrace, the python frame filter code is invoked, leading to
frame_info_to_frame_object() (in python/py-frame.c) being called.
That function will potentially call get_prev_frame() before
get_frame_id() is called.  The call to get_prev_frame() can eventually
end up in get_prev_frame_if_no_cycle() which, in turn, calls
compute_frame_id(), after which the frame id is stashed for the
previous frame.

If the frame id for the current frame is stashed, the cycle detection
code (which relies on the frame stash) in get_prev_frame_if_no_cycle()
will be triggered for a cycle starting with the current frame.  If the
current frame's id is not stashed, the cycle detecting code can't
operate as designed.  Instead, when get_frame_id() is called on the
current frame at some later point, the current frame's id will found
to be already in the stash, triggering an assertion failure.

Below is an in depth examination of the failure which lead to this change.
I've shortened pathnames for brevity and readability.

Here's the portion of the log file showing the failure/internal error:

(gdb) break stop_frame
Breakpoint 1 at 0x40059a: file dw2-dup-frame.c, line 22.
(gdb) run
Starting program: testsuite/outputs/gdb.dwarf2/dw2-dup-frame/dw2-dup-frame

Breakpoint 1, stop_frame () at dw2-dup-frame.c:22
22	}
(gdb) bt
gdb/frame.c:544: internal-error: frame_id get_frame_id(frame_info*): Assertion `stashed' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n)
FAIL: gdb.dwarf2/dw2-dup-frame.exp: backtrace from stop_frame (GDB internal error)

Here's a partial backtrace from the internal error, showing the frames
which I think are relevant, plus several extra to provide context:

    #0  internal_error (
	file=0x932b98 "gdb/frame.c", line=544,
	fmt=0x932b20 "%s: Assertion `%s' failed.")
	at gdb/common/errors.c:54
    #1  0x000000000072207e in get_frame_id (fi=0xe5a760)
	at gdb/frame.c:544
    #2  0x00000000004eb50d in frame_info_to_frame_object (frame=0xe5a760)
	at gdb/python/py-frame.c:390
    #3  0x00000000004ef5be in bootstrap_python_frame_filters (frame=0xe5a760,
	frame_low=0, frame_high=-1)
	at gdb/python/py-framefilter.c:1453
    #4  0x00000000004ef7a9 in gdbpy_apply_frame_filter (
	extlang=0x8857e0 <extension_language_python>, frame=0xe5a760, flags=7,
	args_type=CLI_SCALAR_VALUES, out=0xf6def0, frame_low=0, frame_high=-1)
	at gdb/python/py-framefilter.c:1548
    #5  0x00000000005f2c5a in apply_ext_lang_frame_filter (frame=0xe5a760,
	flags=7, args_type=CLI_SCALAR_VALUES, out=0xf6def0, frame_low=0,
	frame_high=-1)
	at gdb/extension.c:572
    #6  0x00000000005ea896 in backtrace_command_1 (count_exp=0x0, show_locals=0,
	no_filters=0, from_tty=1)
	at gdb/stack.c:1834

Examination of the code in frame_info_to_frame_object(), which is in
python/py-frame.c, is key to understanding this problem:

      if (get_prev_frame (frame) == NULL
	  && get_frame_unwind_stop_reason (frame) != UNWIND_NO_REASON
	  && get_next_frame (frame) != NULL)
	{
	  frame_obj->frame_id = get_frame_id (get_next_frame (frame));
	  frame_obj->frame_id_is_next = 1;
	}
      else
	{
	  frame_obj->frame_id = get_frame_id (frame);
	  frame_obj->frame_id_is_next = 0;
	}

I will first note that the frame id for frame has not been computed yet.  (This
was verified by placing a breakpoint on compute_frame_id().)

The call to get_prev_frame() causes the the frame id to (eventually) be
computed for the previous frame.  Here's a backtrace showing how we
get there:

    #0  compute_frame_id (fi=0x10e2810)
	at gdb/frame.c:496
    #1  0x0000000000724a67 in get_prev_frame_if_no_cycle (this_frame=0xe5a760)
	at gdb/frame.c:1871
    #2  0x0000000000725136 in get_prev_frame_always_1 (this_frame=0xe5a760)
	at gdb/frame.c:2045
    #3  0x000000000072516b in get_prev_frame_always (this_frame=0xe5a760)
	at gdb/frame.c:2061
    #4  0x000000000072570f in get_prev_frame (this_frame=0xe5a760)
	at gdb/frame.c:2303
    #5  0x00000000004eb471 in frame_info_to_frame_object (frame=0xe5a760)
	at gdb/python/py-frame.c:381

For this particular case, we end up in the else clause of the code above
which calls get_frame_id (frame).  It's at this point that the frame id
for frame is computed.  Again, here's a backtrace:

    #0  compute_frame_id (fi=0xe5a760)
	at gdb/frame.c:496
    #1  0x000000000072203d in get_frame_id (fi=0xe5a760)
	at gdb/frame.c:539
    #2  0x00000000004eb50d in frame_info_to_frame_object (frame=0xe5a760)
	at gdb/python/py-frame.c:390

The test in question, dw2-dup-frame.exp, deliberately creates a broken
(cyclic) stack.  So, in this instance, the frame id for the prev
`frame' will be the same as that for `frame'.  But that particular
frame id ended up in the stash during the previous frame operation.
When, just a few lines later, we compute the frame id for `frame', the
id in question is already in the stash, thus triggering the assertion
failure.

I considered two other solutions to solving this problem:

We could prevent get_prev_frame() from being called before
get_frame_id() in frame_info_to_frame_object().  (See above for the
snippet of code where this happens.) A call to get_frame_id (frame)
could be placed ahead of that code snippet above.  I have tested this
approach and, while it does work, I can't be certain that
get_prev_frame() isn't called ahead of stashing the current frame
somewhere else in GDB, but in a less obvious way.

Another approach is to stash the current frame's id by calling
get_frame_id() in get_current_frame().  This approach is conceptually
simpler, but when importing a python unwinder, has the unwelcome side
effect of causing the unwinder to be called during import.

A cleaner looking fix would be to place this code after code
corresponding to the "Don't compute the frame id of the current frame
yet..." comment in get_prev_frame_if_no_cycle().  Sadly, this does not
work though; by the time we get to this point, the frame state for the
prev frame has been modified just enough to cause an internal error to
occur when attempting to compute the (current) frame id for inline
frames.  (The unexpected failure count increases by roughly 130
failures.)  Therefore, I decided to place it as early as possible
in get_prev_frame().

gdb/ChangeLog:

	* frame.c (get_prev_frame): Stash frame id for current frame
	prior to computing frame id for previous frame.
pipcet pushed a commit that referenced this issue Dec 1, 2016
This patch fixes a few problems with GDB's time handling.

#1 - It avoids problems with gnulib's C++ namespace support

On MinGW, the struct timeval that should be passed to gnulib's
gettimeofday replacement is incompatible with libiberty's
timeval_sub/timeval_add.  That's because gnulib also replaces "struct
timeval" with its own definition, while libiberty expects the
system's.

E.g., in code like this:

  gettimeofday (&prompt_ended, NULL);
  timeval_sub (&prompt_delta, &prompt_ended, &prompt_started);
  timeval_add (&prompt_for_continue_wait_time,
               &prompt_for_continue_wait_time, &prompt_delta);

That's currently handled in gdb by not using gnulib's gettimeofday at
all (see common/gdb_sys_time.h), but that #undef hack won't work with
if/when we enable gnulib's C++ namespace support, because that mode
adds compile time warnings for uses of ::gettimeofday, which are hard
errors with -Werror.

#2 - But there's an elephant in the room: gettimeofday is not monotonic...

We're using it to:

  a) check how long functions take, for performance analysis
  b) compute when in the future to fire events in the event-loop
  c) print debug timestamps

But that's exactly what gettimeofday is NOT meant for.  Straight from
the man page:

~~~
       The time returned by gettimeofday() is affected by
       discontinuous jumps in the system time (e.g., if the system
       administrator manually changes the system time).  If you need a
       monotonically increasing clock, see clock_gettime(2).
~~~

std::chrono (part of the C++11 standard library) has a monotonic clock
exactly for such purposes (std::chrono::steady_clock).  This commit
switches to use that instead of gettimeofday, fixing all the issues
mentioned above.

gdb/ChangeLog:
2016-11-23  Pedro Alves  <palves@redhat.com>

	* Makefile.in (SFILES): Add common/run-time-clock.c.
	(HFILES_NO_SRCDIR): Add common/run-time-clock.h.
	(COMMON_OBS): Add run-time-clock.o.
	* common/run-time-clock.c, common/run-time-clock.h: New files.
	* defs.h (struct timeval, print_transfer_performance): Delete
	declarations.
	* event-loop.c (struct gdb_timer) <when>: Now a
	std::chrono::steady_clock::time_point.
	(create_timer): use std::chrono::steady_clock instead of
	gettimeofday.  Use new instead of malloc.
	(delete_timer): Use delete instead of xfree.
	(duration_cast_timeval): New.
	(update_wait_timeout): Use std::chrono::steady_clock instead of
	gettimeofday.
	* maint.c: Include <chrono> instead of "gdb_sys_time.h", <time.h>
	and "timeval-utils.h".
	(scoped_command_stats::~scoped_command_stats)
	(scoped_command_stats::scoped_command_stats): Use
	std::chrono::steady_clock instead of gettimeofday.  Use
	user_cpu_time_clock instead of get_run_time.
	* maint.h: Include "run-time-clock.h" and <chrono>.
	(scoped_command_stats): <m_start_cpu_time>: Now a
	user_cpu_time_clock::time_point.
	<m_start_wall_time>: Now a std::chrono::steady_clock::time_point.
	* mi/mi-main.c: Include "run-time-clock.h" and <chrono> instead of
	"gdb_sys_time.h" and <sys/resource.h>.
	(rusage): Delete.
	(mi_execute_command): Use new instead of XNEW.
	(mi_load_progress): Use std::chrono::steady_clock instead of
	gettimeofday.
	(timestamp): Rewrite in terms of std::chrono::steady_clock,
	user_cpu_time_clock and system_cpu_time_clock.
	(timeval_diff): Delete.
	(print_diff): Adjust to use std::chrono::steady_clock,
	user_cpu_time_clock and system_cpu_time_clock.
	* mi/mi-parse.h: Include "run-time-clock.h" and <chrono> instead
	of "gdb_sys_time.h".
	(struct mi_timestamp): Change fields types to
	std::chrono::steady_clock::time_point, user_cpu_time_clock::time
	and system_cpu_time_clock::time_point, instead of struct timeval.
	* symfile.c: Include <chrono> instead of <time.h> and
	"gdb_sys_time.h".
	(struct time_range): New.
	(generic_load): Use std::chrono::steady_clock instead of
	gettimeofday.
	(print_transfer_performance): Replace timeval parameters with a
	std::chrono::steady_clock::duration parameter.  Adjust.
	* utils.c: Include <chrono> instead of "timeval-utils.h",
	"gdb_sys_time.h", and <time.h>.
	(prompt_for_continue_wait_time): Now a
	std::chrono::steady_clock::duration.
	(defaulted_query, prompt_for_continue): Use
	std::chrono::steady_clock instead of
	gettimeofday/timeval_sub/timeval_add.
	(reset_prompt_for_continue_wait_time): Use
	std::chrono::steady_clock::duration instead of struct timeval.
	(get_prompt_for_continue_wait_time): Return a
	std::chrono::steady_clock::duration instead of struct timeval.
	(vfprintf_unfiltered): Use std::chrono::steady_clock instead of
	gettimeofday.  Use std::string.  Use '.' instead of ':'.
	* utils.h: Include <chrono>.
	(get_prompt_for_continue_wait_time): Return a
	std::chrono::steady_clock::duration instead of struct timeval.

gdb/gdbserver/ChangeLog:
2016-11-23  Pedro Alves  <palves@redhat.com>

	* debug.c: Include <chrono> instead of "gdb_sys_time.h".
	(debug_vprintf): Use std::chrono::steady_clock instead of
	gettimeofday.  Use '.' instead of ':'.
	* tracepoint.c: Include <chrono> instead of "gdb_sys_time.h".
	(get_timestamp): Use std::chrono::steady_clock instead of
	gettimeofday.
pipcet pushed a commit that referenced this issue Feb 17, 2017
…binations

This adds a test that exposes several problems fixed by earlier
patches:

#1 - Buffer overrun when host/target formats match, but sizes don't.
     https://sourceware.org/ml/gdb-patches/2016-03/msg00125.html

#2 - Missing handling for FR-V FR300.
     https://sourceware.org/ml/gdb-patches/2016-03/msg00117.html

#3 - BFD architectures with spaces in their names (v850).
     https://sourceware.org/ml/binutils/2016-03/msg00108.html

#4 - The OS ABI names with spaces issue.
     https://sourceware.org/ml/gdb-patches/2016-03/msg00116.html

#5 - Bogus HP/PA long double format.
     https://sourceware.org/ml/gdb-patches/2016-03/msg00122.html

#6 - Cris big endian internal error.
     https://sourceware.org/ml/gdb-patches/2016-03/msg00126.html

#7 - Several PowerPC bfd archs/machines not handled by gdb.
     https://sourceware.org/bugzilla/show_bug.cgi?id=19797

And hopefully helps catch others in the future.

This started out as a test that simply did,

 gdb -ex "print 1.0L"

to exercise #1 above.

Then to cover both 32-bit target / 64-bit host and the converse, I
thought of having the testcase print the floats twice, once with the
architecture set to "i386" and then to "i386:x86-64".  This way it
wouldn't matter whether gdb was built as 32-bit or a 64-bit program.

Then I thought that other archs might have similar host/target
floatformat conversion issues as well.  Instead of hardcoding some
architectures in the test file, I thought we could just iterate over
all bfd architectures and OS ABIs supported by the gdb build being
tested.  This is what then exposed all the other problems listed
above...

With an --enable-targets=all, this exercises over 14 thousand
combinations.  If left in a single test file, it all consistenly runs
in under a minute on my machine (An Intel i7-4810MQ @ 2.8 MHZ running
Fedora 23).  Split in 8 chunks, as in this commit, it runs in around
25 seconds, with make -j8.

To avoid flooding the gdb.sum file, it avoids calling "pass" on each
tested combination/iteration.  I'm explicitly not implementing that by
passing an empty message to gdb_test / gdb_test_multiple, because I
still want a FAIL to be logged in gdb.sum.  So instead this puts the
internal passes in the gdb.log file, only, prefixed "IPASS:", for
internal pass.  TBC, if some iteration fails, it'll still show up as
FAIL in gdb.sum.  If this is an approach that takes on, I can see us
extending the common bits to support it for all testcases.

gdb/testsuite/ChangeLog:
2016-12-09  Pedro Alves  <palves@redhat.com>

	* gdb.base/all-architectures-0.exp: New file.
	* gdb.base/all-architectures-1.exp: New file.
	* gdb.base/all-architectures-2.exp: New file.
	* gdb.base/all-architectures-3.exp: New file.
	* gdb.base/all-architectures-4.exp: New file.
	* gdb.base/all-architectures-5.exp: New file.
	* gdb.base/all-architectures-6.exp: New file.
	* gdb.base/all-architectures-7.exp: New file.
	* gdb.base/all-architectures.exp.in: New file.
pipcet pushed a commit that referenced this issue Feb 3, 2021
With "target extended-remote" + "maint set target-non-stop", attaching
hangs like so:

 (gdb) attach 1244450
 Attaching to process 1244450
 [New Thread 1244450.1244450]
 [New Thread 1244450.1244453]
 [New Thread 1244450.1244454]
 [New Thread 1244450.1244455]
 [New Thread 1244450.1244456]
 [New Thread 1244450.1244457]
 [New Thread 1244450.1244458]
 [New Thread 1244450.1244459]
 [New Thread 1244450.1244461]
 [New Thread 1244450.1244462]
 [New Thread 1244450.1244463]
 * hang *

Attaching to the hung GDB shows that GDB is busy in an infinite loop
in stop_all_threads:

 (top-gdb) bt
 #0  stop_all_threads () at /home/pedro/gdb/binutils-gdb/src/gdb/infrun.c:4755
 #1  0x000055555597b424 in stop_waiting (ecs=0x7fffffffd930) at /home/pedro/gdb/binutils-gdb/src/gdb/infrun.c:7738
 #2  0x0000555555976fba in handle_signal_stop (ecs=0x7fffffffd930) at /home/pedro/gdb/binutils-gdb/src/gdb/infrun.c:5868
 #3  0x0000555555975f6a in handle_inferior_event (ecs=0x7fffffffd930) at /home/pedro/gdb/binutils-gdb/src/gdb/infrun.c:5527
 #4  0x0000555555971da4 in fetch_inferior_event () at /home/pedro/gdb/binutils-gdb/src/gdb/infrun.c:3910
 #5  0x00005555559540b2 in inferior_event_handler (event_type=INF_REG_EVENT) at /home/pedro/gdb/binutils-gdb/src/gdb/inf-loop.c:42
 #6  0x000055555597e825 in infrun_async_inferior_event_handler (data=0x0) at /home/pedro/gdb/binutils-gdb/src/gdb/infrun.c:9162
 #7  0x0000555555687d1d in check_async_event_handlers () at /home/pedro/gdb/binutils-gdb/src/gdb/async-event.c:328
 bminor#8  0x0000555555e48284 in gdb_do_one_event () at /home/pedro/gdb/binutils-gdb/src/gdbsupport/event-loop.cc:216
 bminor#9  0x00005555559e7512 in start_event_loop () at /home/pedro/gdb/binutils-gdb/src/gdb/main.c:347
 bminor#10 0x00005555559e765d in captured_command_loop () at /home/pedro/gdb/binutils-gdb/src/gdb/main.c:407
 bminor#11 0x00005555559e8f80 in captured_main (data=0x7fffffffdb70) at /home/pedro/gdb/binutils-gdb/src/gdb/main.c:1239
 bminor#12 0x00005555559e8ff2 in gdb_main (args=0x7fffffffdb70) at /home/pedro/gdb/binutils-gdb/src/gdb/main.c:1254
 #13 0x0000555555627c86 in main (argc=12, argv=0x7fffffffdc88) at /home/pedro/gdb/binutils-gdb/src/gdb/gdb.c:32

The problem is that the remote sends stops for all the threads:

 Packet received: l/home/pedro/gdb/binutils-gdb/build/gdb/testsuite/outputs/gdb.threads/attach-non-stop/attach-non-stop
 Sending packet: $vStopped#55...Packet received: T0006:f06e25edec7f0000;07:f06e25edec7f0000;10:f14190ccf4550000;thread:p12fd22.12fd2f;core:15;
 Sending packet: $vStopped#55...Packet received: T0006:f0dea5f0ec7f0000;07:f0dea5f0ec7f0000;10:e84190ccf4550000;thread:p12fd22.12fd27;core:4;
 Sending packet: $vStopped#55...Packet received: T0006:f0ee25f1ec7f0000;07:f0ee25f1ec7f0000;10:f14190ccf4550000;thread:p12fd22.12fd26;core:5;
 Sending packet: $vStopped#55...Packet received: T0006:f0bea5efec7f0000;07:f0bea5efec7f0000;10:f14190ccf4550000;thread:p12fd22.12fd29;core:1;
 Sending packet: $vStopped#55...Packet received: T0006:f0ce25f0ec7f0000;07:f0ce25f0ec7f0000;10:e84190ccf4550000;thread:p12fd22.12fd28;core:a;
 Sending packet: $vStopped#55...Packet received: T0006:f07ea5edec7f0000;07:f07ea5edec7f0000;10:e84190ccf4550000;thread:p12fd22.12fd2e;core:f;
 Sending packet: $vStopped#55...Packet received: T0006:f0ae25efec7f0000;07:f0ae25efec7f0000;10:df4190ccf4550000;thread:p12fd22.12fd2a;core:6;
 Sending packet: $vStopped#55...Packet received: T0006:0000000000000000;07:c0e8a381fe7f0000;10:bf43b4f1ec7f0000;thread:p12fd22.12fd22;core:2;
 Sending packet: $vStopped#55...Packet received: T0006:f0fea5f1ec7f0000;07:f0fea5f1ec7f0000;10:df4190ccf4550000;thread:p12fd22.12fd25;core:8;
 Sending packet: $vStopped#55...Packet received: T0006:f09ea5eeec7f0000;07:f09ea5eeec7f0000;10:e84190ccf4550000;thread:p12fd22.12fd2b;core:b;
 Sending packet: $vStopped#55...Packet received: OK

But then wait_one never consumes them, always hitting this path:

 4473          if (nfds == 0)
 4474            {
 4475              /* No waitable targets left.  All must be stopped.  */
 4476              return {NULL, minus_one_ptid, {TARGET_WAITKIND_NO_RESUMED}};
 4477            }

Resulting in GDB constanly calling target_stop to stop threads, but
the remote target never reporting back the stops to infrun.

That TARGET_WAITKIND_NO_RESUMED path shown above is always taken
because here, in wait_one too, just above:

 4428          for (inferior *inf : all_inferiors ())
 4429            {
 4430              process_stratum_target *target = inf->process_target ();
 4431              if (target == NULL
 4432                  || !target->is_async_p ()
                           ^^^^^^^^^^^^^^^^^^^^^
 4433                  || !target->threads_executing)
 4434                continue;

... the remote target is not async.

And in turn that happened because extended_remote_target::attach
misses enabling async in the target-non-stop path.

A testcase exercising this will be added in a following patch.

gdb/ChangeLog:

	* remote.c (extended_remote_target::attach): Set target async in
	the target-non-stop path too.
pipcet pushed a commit that referenced this issue Feb 3, 2021
A following patch will add a new testcase that has two processes, each
with a number of threads constantly tripping a breakpoint and stepping
over it, because the breakpoint has a condition that evals false.
Then GDB detaches from one of the processes, while both processes are
running.  And then the testcase sends a SIGUSR1 to the other process.

When run against gdbserver, that would occasionaly fail like this:

 (gdb) PASS: gdb.threads/detach-step-over.exp: iter 1: detach
 Executing on target: kill -SIGUSR1 208303    (timeout = 300)
 spawn -ignore SIGHUP kill -SIGUSR1 208303

 Thread 2.5 "detach-step-ove" received signal SIGTRAP, Trace/breakpoint trap.
 [Switching to Thread 208303.208305]
 0x000055555555522a in thread_func (arg=0x0) at /home/pedro/gdb/binutils-gdb/src/gdb/testsuite/gdb.threads/detach-step-over.c:54
 54            counter++; /* Set breakpoint here.  */

What happened was that GDBserver is doing a step-over for process A
when a detach request for process B arrives.  And that generates a
spurious SIGTRAP report for process A, as seen above.

The GDBserver logs reveal what happened:

 - GDB manages to detach while a step over is in progress.  That reaches
   linux_process_target::complete_ongoing_step_over(), which does:

      /* Passing NULL_PTID as filter indicates we want all events to
	 be left pending.  Eventually this returns when there are no
	 unwaited-for children left.  */
      ret = wait_for_event_filtered (minus_one_ptid, null_ptid, &wstat,
				     __WALL);

   As the comment say, this leaves all events pending, _including_ the
   just finished step SIGTRAP.  We never discard that SIGTRAP.  So
   GDBserver reports the SIGTRAP to GDB.  GDB can't explain the
   SIGTRAP, so it reports it to the user.

The GDBserver log looks like this.  The LWP of interest is 208305:

 Need step over [LWP 208305]? yes, found breakpoint at 0x555555555227
 proceed_all_lwps: found thread 208305 needing a step-over
 Starting step-over on LWP 208305.  Stopping all threads

208305 starts a step-over.

 >>>> entering void linux_process_target::stop_all_lwps(int, lwp_info*)
 stop_all_lwps (stop-and-suspend, except=LWP 208303.208305)
 Sending sigstop to lwp 208303
 Sending sigstop to lwp 207755
 wait_for_sigstop: pulling events
 LWFE: waitpid(-1, ...) returned 207755, ERRNO-OK
 LLW: waitpid 207755 received Stopped (signal) (stopped)
 pc is 0x7f7e045593bf
 Expected stop.
 LLW: SIGSTOP caught for LWP 207755.207755 while stopping threads.
 LWFE: waitpid(-1, ...) returned 208303, ERRNO-OK
 LLW: waitpid 208303 received Stopped (signal) (stopped)
 pc is 0x7ffff7e743bf
 Expected stop.
 LLW: SIGSTOP caught for LWP 208303.208303 while stopping threads.
 LWFE: waitpid(-1, ...) returned 0, ERRNO-OK
 leader_pid=208303, leader_lp!=NULL=1, num_lwps=11, zombie=0
 leader_pid=207755, leader_lp!=NULL=1, num_lwps=11, zombie=0
 LLW: exit (no unwaited-for LWP)
 stop_all_lwps done, setting stopping_threads back to !stopping
 <<<< exiting void linux_process_target::stop_all_lwps(int, lwp_info*)
 Done stopping all threads for step-over.
 pc is 0x555555555227
 Writing 8b to 0x555555555227 in process 208305
 Could not findsigchld_handler
  fast tracepoint jump at 0x555555555227 in list (uninserting).
   pending reinsert at 0x555555555227
   step from pc 0x555555555227
 Resuming lwp 208305 (step, signal 0, stop expected)
 <<<< exiting ptid_t linux_process_target::wait_1(ptid_t, target_waitstatus*, target_wait_flags)
 handling possible serial event
 getpkt ("D;32b8b");  [no ack sent]

The detach request arrives.

 sigchld_handler
 Tracing is already off, ignoring
 detach: step over in progress, finish it first

GDBserver realizes a step over for 208305 was in progress, let's it
finish.

 LWFE: waitpid(-1, ...) returned 208305, ERRNO-OK
 LLW: waitpid 208305 received Stopped (signal) (stopped)
 pc is 0x555555555227
 Expected stop.
 LLW: step LWP 208303.208305, 0, 0 (discard delayed SIGSTOP)
   pending reinsert at 0x555555555227
   step from pc 0x555555555227
 Resuming lwp 208305 (step, signal 0, stop not expected)
 LWFE: waitpid(-1, ...) returned 0, ERRNO-OK
 leader_pid=208303, leader_lp!=NULL=1, num_lwps=11, zombie=0
 leader_pid=207755, leader_lp!=NULL=1, num_lwps=11, zombie=0
 sigsuspend'ing
 LWFE: waitpid(-1, ...) returned 208305, ERRNO-OK
 LLW: waitpid 208305 received Trace/breakpoint trap (stopped)
 pc is 0x55555555522a
 CSBB: LWP 208303.208305 stopped by trace
 LWFE: waitpid(-1, ...) returned 0, ERRNO-OK
 leader_pid=208303, leader_lp!=NULL=1, num_lwps=11, zombie=0
 leader_pid=207755, leader_lp!=NULL=1, num_lwps=11, zombie=0
 LLW: exit (no unwaited-for LWP)
 Finished step over.

The step-over for 208305 finishes.

 Writing cc to 0x555555555227 in process 208305
 Could not find fast tracepoint jump at 0x555555555227 in list (reinserting).
 >>>> entering void linux_process_target::stop_all_lwps(int, lwp_info*)
 stop_all_lwps (stop, except=none)
 wait_for_sigstop: pulling events

The detach proceeds (snipped).

...

 proceed_one_lwp: lwp 208305
    LWP 208305 has pending status, leaving stopped

Later on, 208305 has a pending status (the step SIGTRAP from the
step-over), so GDBserver starts the process of reporting it.

...

 wait_1 ret = LWP 208303.208305, 1, 5
 <<<< exiting ptid_t linux_process_target::wait_1(ptid_t, target_waitstatus*, target_wait_flags)

...

and eventually GDB receives the stop notification (T05 == SIGTRAP):

 getpkt ("vStopped");  [no ack sent]
 sigchld_handler
 vStopped: acking 3
 Writing resume reply for LWP 208303.208305:1
 putpkt ("$T0506:f0ee58f7ff7f0* ;07:f0ee58f7ff7f0* ;10:2a525*"550* ;thread:p32daf.32db1;core:c;#37"); [noack mode]

From the GDB side, we see:

 [infrun] fetch_inferior_event: enter
   [infrun] fetch_inferior_event: fetch_inferior_event enter
   [infrun] do_target_wait: Found 2 inferiors, starting at #1
   [infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) =
   [infrun] print_target_wait_results:   208303.208305.0 [Thread 208303.208305],
   [infrun] print_target_wait_results:   status->kind = stopped, signal = GDB_SIGNAL_TRAP
   [infrun] handle_inferior_event: status->kind = stopped, signal = GDB_SIGNAL_TRAP
   [infrun] start_step_over: enter
     [infrun] start_step_over: stealing global queue of threads to step, length = 6
     [infrun] operator(): putting back 6 threads to step in global queue
   [infrun] start_step_over: exit
   [infrun] handle_signal_stop: context switch
   [infrun] context_switch: Switching context from process 0 to Thread 208303.208305
   [infrun] handle_signal_stop: stop_pc=0x55555555522a
   [infrun] handle_signal_stop: random signal (GDB_SIGNAL_TRAP)
   [infrun] stop_waiting: stop_waiting
   [infrun] stop_all_threads: starting

The fix is to discard the step SIGTRAP, unless GDB wanted the thread
to step.

gdbserver/ChangeLog:

	* linux-low.cc (linux_process_target::complete_ongoing_step_over):
	Discard step SIGTRAP, unless GDB wanted the thread to step.
pipcet pushed a commit that referenced this issue Mar 28, 2021
…PR gdb/27147)

PR 27147 shows that on sparc64, GDB is unable to properly unwind:

Expected result (from GDB 9.2):

    #0  0x0000000000108de4 in puts ()
    #1  0x0000000000100950 in hello () at gdb-test.c:4
    #2  0x0000000000100968 in main () at gdb-test.c:8

Actual result (from GDB latest git):

    #0  0x0000000000108de4 in puts ()
    #1  0x0000000000100950 in hello () at gdb-test.c:4
    Backtrace stopped: previous frame inner to this frame (corrupt stack?)

The first failing commit is 5b6d1e4 ("Multi-target support").  The cause
of the change in behavior is due to (thanks for Andrew Burgess for finding
this):

 - inferior_ptid is no longer set on entry of target_ops::wait, whereas
   it was set to something valid previously
 - deep down in linux_nat_target::wait (see stack trace below), we fetch
   the registers of the event thread
 - on sparc64, fetching registers involves reading memory (in
   sparc_supply_rwindow, see stack trace below)
 - reading memory (target_ops::xfer_partial) relies on inferior_ptid
   being set to the thread from which we want to read memory

This is where things go wrong:

    #0  linux_nat_target::xfer_partial (this=0x10000fa2c40 <the_sparc64_linux_nat_target>, object=TARGET_OBJECT_MEMORY, annex=0x0, readbuf=0x7feffe3b000 "", writebuf=0x0, offset=8791798050744, len=8, xfered_len=0x7feffe3ae88) at /home/simark/src/binutils-gdb/gdb/linux-nat.c:3697
    #1  0x00000100007f5b10 in raw_memory_xfer_partial (ops=0x10000fa2c40 <the_sparc64_linux_nat_target>, readbuf=0x7feffe3b000 "", writebuf=0x0, memaddr=8791798050744, len=8, xfered_len=0x7feffe3ae88) at /home/simark/src/binutils-gdb/gdb/target.c:912
    #2  0x00000100007f60e8 in memory_xfer_partial_1 (ops=0x10000fa2c40 <the_sparc64_linux_nat_target>, object=TARGET_OBJECT_MEMORY, readbuf=0x7feffe3b000 "", writebuf=0x0, memaddr=8791798050744, len=8, xfered_len=0x7feffe3ae88) at /home/simark/src/binutils-gdb/gdb/target.c:1043
    #3  0x00000100007f61b4 in memory_xfer_partial (ops=0x10000fa2c40 <the_sparc64_linux_nat_target>, object=TARGET_OBJECT_MEMORY, readbuf=0x7feffe3b000 "", writebuf=0x0, memaddr=8791798050744, len=8, xfered_len=0x7feffe3ae88) at /home/simark/src/binutils-gdb/gdb/target.c:1072
    #4  0x00000100007f6538 in target_xfer_partial (ops=0x10000fa2c40 <the_sparc64_linux_nat_target>, object=TARGET_OBJECT_MEMORY, annex=0x0, readbuf=0x7feffe3b000 "", writebuf=0x0, offset=8791798050744, len=8, xfered_len=0x7feffe3ae88) at /home/simark/src/binutils-gdb/gdb/target.c:1129
    #5  0x00000100007f7094 in target_read_partial (ops=0x10000fa2c40 <the_sparc64_linux_nat_target>, object=TARGET_OBJECT_MEMORY, annex=0x0, buf=0x7feffe3b000 "", offset=8791798050744, len=8, xfered_len=0x7feffe3ae88) at /home/simark/src/binutils-gdb/gdb/target.c:1375
    #6  0x00000100007f721c in target_read (ops=0x10000fa2c40 <the_sparc64_linux_nat_target>, object=TARGET_OBJECT_MEMORY, annex=0x0, buf=0x7feffe3b000 "", offset=8791798050744, len=8) at /home/simark/src/binutils-gdb/gdb/target.c:1415
    #7  0x00000100007f69d4 in target_read_memory (memaddr=8791798050744, myaddr=0x7feffe3b000 "", len=8) at /home/simark/src/binutils-gdb/gdb/target.c:1218
    bminor#8  0x0000010000758520 in sparc_supply_rwindow (regcache=0x10000fea4f0, sp=8791798050736, regnum=-1) at /home/simark/src/binutils-gdb/gdb/sparc-tdep.c:1960
    bminor#9  0x000001000076208c in sparc64_supply_gregset (gregmap=0x10000be3190 <sparc64_linux_ptrace_gregmap>, regcache=0x10000fea4f0, regnum=-1, gregs=0x7feffe3b230) at /home/simark/src/binutils-gdb/gdb/sparc64-tdep.c:1974
    bminor#10 0x0000010000751b64 in sparc_fetch_inferior_registers (regcache=0x10000fea4f0, regnum=80) at /home/simark/src/binutils-gdb/gdb/sparc-nat.c:170
    bminor#11 0x0000010000759d68 in sparc64_linux_nat_target::fetch_registers (this=0x10000fa2c40 <the_sparc64_linux_nat_target>, regcache=0x10000fea4f0, regnum=80) at /home/simark/src/binutils-gdb/gdb/sparc64-linux-nat.c:38
    bminor#12 0x00000100008146ec in target_fetch_registers (regcache=0x10000fea4f0, regno=80) at /home/simark/src/binutils-gdb/gdb/target.c:3287
    #13 0x00000100006a8c5c in regcache::raw_update (this=0x10000fea4f0, regnum=80) at /home/simark/src/binutils-gdb/gdb/regcache.c:584
    #14 0x00000100006a8d94 in readable_regcache::raw_read (this=0x10000fea4f0, regnum=80, buf=0x7feffe3b7c0 "") at /home/simark/src/binutils-gdb/gdb/regcache.c:598
    #15 0x00000100006a93b8 in readable_regcache::cooked_read (this=0x10000fea4f0, regnum=80, buf=0x7feffe3b7c0 "") at /home/simark/src/binutils-gdb/gdb/regcache.c:690
    #16 0x00000100006b288c in readable_regcache::cooked_read<unsigned long, void> (this=0x10000fea4f0, regnum=80, val=0x7feffe3b948) at /home/simark/src/binutils-gdb/gdb/regcache.c:777
    #17 0x00000100006a9b44 in regcache_cooked_read_unsigned (regcache=0x10000fea4f0, regnum=80, val=0x7feffe3b948) at /home/simark/src/binutils-gdb/gdb/regcache.c:791
    #18 0x00000100006abf3c in regcache_read_pc (regcache=0x10000fea4f0) at /home/simark/src/binutils-gdb/gdb/regcache.c:1295
    #19 0x0000010000507920 in save_stop_reason (lp=0x10000fc5b10) at /home/simark/src/binutils-gdb/gdb/linux-nat.c:2612
    #20 0x00000100005095a4 in linux_nat_filter_event (lwpid=520983, status=1407) at /home/simark/src/binutils-gdb/gdb/linux-nat.c:3050
    #21 0x0000010000509f9c in linux_nat_wait_1 (ptid=..., ourstatus=0x7feffe3c8f0, target_options=...) at /home/simark/src/binutils-gdb/gdb/linux-nat.c:3194
    #22 0x000001000050b1d0 in linux_nat_target::wait (this=0x10000fa2c40 <the_sparc64_linux_nat_target>, ptid=..., ourstatus=0x7feffe3c8f0, target_options=...) at /home/simark/src/binutils-gdb/gdb/linux-nat.c:3432
    #23 0x00000100007f8ac0 in target_wait (ptid=..., status=0x7feffe3c8f0, options=...) at /home/simark/src/binutils-gdb/gdb/target.c:2000
    #24 0x00000100004ac17c in do_target_wait_1 (inf=0x1000116d280, ptid=..., status=0x7feffe3c8f0, options=...) at /home/simark/src/binutils-gdb/gdb/infrun.c:3464
    #25 0x00000100004ac3b8 in operator() (__closure=0x7feffe3c678, inf=0x1000116d280) at /home/simark/src/binutils-gdb/gdb/infrun.c:3527
    #26 0x00000100004ac7cc in do_target_wait (wait_ptid=..., ecs=0x7feffe3c8c8, options=...) at /home/simark/src/binutils-gdb/gdb/infrun.c:3540
    #27 0x00000100004ad8c4 in fetch_inferior_event () at /home/simark/src/binutils-gdb/gdb/infrun.c:3880
    #28 0x0000010000485568 in inferior_event_handler (event_type=INF_REG_EVENT) at /home/simark/src/binutils-gdb/gdb/inf-loop.c:42
    #29 0x000001000050d394 in handle_target_event (error=0, client_data=0x0) at /home/simark/src/binutils-gdb/gdb/linux-nat.c:4060
    #30 0x0000010000ab5c8c in handle_file_event (file_ptr=0x10001207270, ready_mask=1) at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:575
    #31 0x0000010000ab6334 in gdb_wait_for_event (block=0) at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:701
    #32 0x0000010000ab487c in gdb_do_one_event () at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:212
    #33 0x0000010000542668 in start_event_loop () at /home/simark/src/binutils-gdb/gdb/main.c:348
    #34 0x000001000054287c in captured_command_loop () at /home/simark/src/binutils-gdb/gdb/main.c:408
    #35 0x0000010000544e84 in captured_main (data=0x7feffe3d188) at /home/simark/src/binutils-gdb/gdb/main.c:1242
    #36 0x0000010000544f2c in gdb_main (args=0x7feffe3d188) at /home/simark/src/binutils-gdb/gdb/main.c:1257
    #37 0x00000100000c1f14 in main (argc=4, argv=0x7feffe3d548) at /home/simark/src/binutils-gdb/gdb/gdb.c:32

There is a target_read_memory call in sparc_supply_rwindow, whose return
value is not checked.  That call fails, because inferior_ptid does not
contain a valid ptid, and uninitialized buffer contents is used.
Ultimately it results in a corrupt stop_pc.

target_ops::fetch_registers can be (and should remain, in my opinion)
independent of inferior_ptid, because the ptid of the thread from which
to fetch registers can be obtained from the regcache.  In other words,
implementations of target_ops::fetch_registers should not rely on
inferior_ptid having a sensible value on entry.

The sparc64_linux_nat_target::fetch_registers case is special, because it calls
a target method that is dependent on the inferior_ptid value
(target_read_inferior, and ultimately target_ops::xfer_partial).  So I would
say it's the responsibility of sparc64_linux_nat_target::fetch_registers to set
up inferior_ptid correctly prior to calling target_read_inferior.

This patch makes sparc64_linux_nat_target::fetch_registers (and
store_registers, since it works the same) temporarily set inferior_ptid.  If we
ever make target_ops::xfer_partial independent of inferior_ptid, setting
inferior_ptid won't be necessary, we'll simply pass down the ptid as a
parameter in some way.

I chose to set/restore inferior_ptid in sparc_fetch_inferior_registers, because
I am not convinced that doing so in an inner location (in sparc_supply_rwindow
for instance) would always be correct.  We have access to the ptid in
sparc_supply_rwindow (from the regcache), so we _could_ set inferior_ptid
there.  However, I don't want to just set inferior_ptid, as that would make it
not desync'ed with `current_thread ()` and `current_inferior ()`.  It's
preferable to use switch_to_thread instead, as that switches all the global
"current" stuff in a coherent way.  But doing so requires a `thread_info *`,
and getting a `thread_info *` from a ptid requires a `process_stratum_target
*`.  We could use `current_inferior()->process_target()` in
sparc_supply_rwindow for this (using target_read_memory uses the current
inferior's target stack anyway).  However, sparc_supply_rwindow is also used in
the context of BSD uthreads, where a thread stratum target defines threads.  I
presume the ptid in the regcache would be the ptid of the uthread, defined by
the thread stratum target (bsd_uthread_target).  Using
`current_inferior()->process_target()` would look up a ptid defined by the
thread stratum target using the process stratum target.  I don't think it would
give good results.  So I prefer playing it safe and looking up the thread
earlier, in sparc_fetch_inferior_registers.

I added some assertions (in sparc_supply_rwindow and others) to verify
that the regcache's ptid matches inferior_ptid.  That verifies that the
caller has properly set the correct global context.  This would have
caught (though a failed assertion) the current problem.

gdb/ChangeLog:

	PR gdb/27147
	* sparc-nat.h (sparc_fetch_inferior_registers): Add
	process_stratum_target parameter,
	sparc_store_inferior_registers): update callers.
	* sparc-nat.c (sparc_fetch_inferior_registers,
	sparc_store_inferior_registers): Add process_stratum_target
	parameter.  Switch current thread before calling
	sparc_supply_gregset / sparc_collect_rwindow.
	(sparc_store_inferior_registers): Likewise.
	* sparc-obsd-tdep.c (sparc32obsd_supply_uthread): Add assertion.
	(sparc32obsd_collect_uthread): Likewise.
	* sparc-tdep.c (sparc_supply_rwindow, sparc_collect_rwindow):
	Add assertion.
	* sparc64-obsd-tdep.c (sparc64obsd_collect_uthread,
	sparc64obsd_supply_uthread): Add assertion.

Change-Id: I16c658cd70896cea604516714f7e2428fbaf4301
pipcet pushed a commit that referenced this issue Mar 28, 2021
When testing with "maint set target-non-stop on",
gdb.server/bkpt-other-inferior.exp sometimes fails like so:

 (gdb) inferior 2
 [Switching to inferior 2 [process 368191] (<noexec>)]
 [Switching to thread 2.1 (Thread 368191.368191)]
 [remote] Sending packet: $m7ffff7fd0100,1#5b
 [remote] Packet received: 48
 [remote] Sending packet: $m7ffff7fd0100,1#5b
 [remote] Packet received: 48
 [remote] Sending packet: $m7ffff7fd0100,9#63
 [remote] Packet received: 4889e7e8e80c000049
 #0  0x00007ffff7fd0100 in ?? ()
 (gdb) PASS: gdb.server/bkpt-other-inferior.exp: inf 2: switch to inferior
 break -q main
 Breakpoint 2 at 0x1138: file /home/pedro/gdb/binutils-gdb/src/gdb/testsuite/gdb.server/server.c, line 21.
 (gdb) PASS: gdb.server/bkpt-other-inferior.exp: inf 2: set breakpoint
 delete breakpoints
 Delete all breakpoints? (y or n) y
 (gdb) [remote] wait: enter
 [remote] wait: exit
 FAIL: gdb.server/bkpt-other-inferior.exp: inf 2: delete all breakpoints in delete_breakpoints (timeout)
 ERROR: breakpoints not deleted
 Remote debugging from host ::1, port 55876
 monitor exit

The problem is here:

 (gdb) [remote] wait: enter

The testcase isn't expecting any output after the prompt.

Why is that "[remote] wait" output?  What happens is that "delete
breakpoints" queries the user, and `query` disables/reenables target
async, which results in the remote target's async event handler ending
up marked:

 (top-gdb) bt
 #0  mark_async_event_handler (async_handler_ptr=0x556bffffffff) at ../../src/gdb/async-event.c:295
 #1  0x0000556bf71b711f in infrun_async (enable=1) at ../../src/gdb/infrun.c:119
 #2  0x0000556bf7471387 in target_async (enable=1) at ../../src/gdb/target.c:3684
 #3  0x0000556bf748a0bd in gdb_readline_wrapper_cleanup::~gdb_readline_wrapper_cleanup (this=0x7ffe3cf30eb0, __in_chrg=<optimized out>) at ../../src/gdb/top.c:1074
 #4  0x0000556bf74874e2 in gdb_readline_wrapper (prompt=0x556bfa17da60 "Delete all breakpoints? (y or n) ") at ../../src/gdb/top.c:1096
 #5  0x0000556bf75111c5 in defaulted_query(const char *, char, typedef __va_list_tag __va_list_tag *) (ctlstr=0x556bf7717f34 "Delete all breakpoints? ", defchar=0 '\000', args=0x7ffe3cf31020) at ../../src/gdb/utils.c:893
 #6  0x0000556bf751166f in query (ctlstr=0x556bf7717f34 "Delete all breakpoints? ") at ../../src/gdb/utils.c:985
 #7  0x0000556bf6f11404 in delete_command (arg=0x0, from_tty=1) at ../../src/gdb/breakpoint.c:13500
 ...

... which then later results in a target_wait call:

 (top-gdb) bt
 #0  remote_target::wait_ns (this=0x7ffe3cf30f80, ptid=..., status=0xde530314f0802800, options=...) at ../../src/gdb/remote.c:7937
 #1  0x0000556bf7369dcb in remote_target::wait (this=0x556bfa0b2180, ptid=..., status=0x7ffe3cf31568, options=...) at ../../src/gdb/remote.c:8173
 #2  0x0000556bf745e527 in target_wait (ptid=..., status=0x7ffe3cf31568, options=...) at ../../src/gdb/target.c:2000
 #3  0x0000556bf71be686 in do_target_wait_1 (inf=0x556bfa1573d0, ptid=..., status=0x7ffe3cf31568, options=...) at ../../src/gdb/infrun.c:3463
 #4  0x0000556bf71be88b in <lambda(inferior*)>::operator()(inferior *) const (__closure=0x7ffe3cf31320, inf=0x556bfa1573d0) at ../../src/gdb/infrun.c:3526
 #5  0x0000556bf71bebcd in do_target_wait (wait_ptid=..., ecs=0x7ffe3cf31540, options=...) at ../../src/gdb/infrun.c:3539
 #6  0x0000556bf71bf97b in fetch_inferior_event () at ../../src/gdb/infrun.c:3879
 #7  0x0000556bf71a27f8 in inferior_event_handler (event_type=INF_REG_EVENT) at ../../src/gdb/inf-loop.c:42
 bminor#8  0x0000556bf71cc8b7 in infrun_async_inferior_event_handler (data=0x0) at ../../src/gdb/infrun.c:9220
 bminor#9  0x0000556bf6ecb80f in check_async_event_handlers () at ../../src/gdb/async-event.c:327
 bminor#10 0x0000556bf76b011a in gdb_do_one_event () at ../../src/gdbsupport/event-loop.cc:216
 ...

... which returns TARGET_WAITKIND_IGNORE.

Fix this by only enabling remote output around setting the breakpoint.

gdb/testsuite/ChangeLog:

	* gdb.server/bkpt-other-inferior.exp: Only enable remote output
	around setting the breakpoint.

Change-Id: I2fd152fd9c46b1c5e7fa678cc4d4054dac0b2bd4
pipcet pushed a commit that referenced this issue Jun 17, 2021
When building with AddressSanitizer, sim/m32c fails with:

./opc2c -l r8c.out /home/simark/src/binutils-gdb/sim/m32c/r8c.opc > r8c.c
sim_log: r8c.out

=================================================================
==3919390==ERROR: LeakSanitizer: detected memory leaks

    Direct leak of 4 byte(s) in 1 object(s) allocated from:
        #0 0x7ffff7677459 in __interceptor_malloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:145
        #1 0x55555555b3df in main /home/simark/src/binutils-gdb/sim/m32c/opc2c.c:658
        #2 0x7ffff741fb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

Fix the leak in main by removing the vlist variable, which seems unused.
pipcet pushed a commit that referenced this issue Jun 17, 2021
This commit replaces this patch:

  https://sourceware.org/pipermail/gdb-patches/2021-January/174933.html

which was itself a replacement for this patch:

  https://sourceware.org/pipermail/gdb-patches/2020-July/170335.html

The motivation behind the original patch can be seen in the new test,
which currently gives a GDB session like this:

  (gdb) ptype var8
  type = Type type6
      PTR TO -> ( Type type2 :: ptr_1 )
      PTR TO -> ( Type type2 :: ptr_2 )
  End Type type6
  (gdb) ptype var8%ptr_2
  type = PTR TO -> ( Type type2
      integer(kind=4) :: spacer
      Type type1, allocatable :: t2_array(:)	<------ Issue #1
  End Type type2 )
  (gdb) ptype var8%ptr_2%t2_array
  Cannot access memory at address 0x38		<------ Issue #2
  (gdb)

Issue #1: Here we see the abstract dynamic type, rather than the
resolved concrete type.  Though in some cases the user might be
interested in the abstract dynamic type, I think that in most cases
showing the resolved concrete type will be of more use.  Plus, the
user can always figure out the dynamic type (by source code inspection
if nothing else) given the concrete type, but it is much harder to
figure out the concrete type given only the dynamic type.

Issue #2: In this example, GDB evaluates the expression in
EVAL_AVOID_SIDE_EFFECTS mode (due to ptype).  The value returned for
var8%ptr_2 will be a non-lazy, zero value of the correct dynamic
type.  However, when GDB asks about the type of t2_array this requires
GDB to access the value of var8%ptr_2 in order to read the dynamic
properties.  As this value was forced to zero (thanks to the use of
EVAL_AVOID_SIDE_EFFECTS) then GDB ends up accessing memory at a base
of zero plus some offset.

Both this patch, and my previous two attempts, have all tried to
resolve this problem by stopping EVAL_AVOID_SIDE_EFFECTS replacing the
result value with a zero value in some cases.

This new patch is influenced by how Ada handles its tagged typed.
There are plenty of examples in ada-lang.c, but one specific case is
ada_structop_operation::evaluate.  When GDB spots that we are dealing
with a tagged (dynamic) type, and we're in EVAL_AVOID_SIDE_EFFECTS
mode, then GDB re-evaluates the child operation in EVAL_NORMAL mode.

This commit handles two cases like this specifically for Fortran, a
new fortran_structop_operation, and the already existing
fortran_undetermined, which is where we handle array accesses.

In these two locations we spot when we are dealing with a dynamic type
and re-evaluate the child operation in EVAL_NORMAL mode so that we
are able to access the dynamic properties of the type.

The rest of this commit message is my attempt to record why my
previous patches failed.

To understand my second patch, and why it failed lets consider two
expressions, this Fortran expression:

  (gdb) ptype var8%ptr_2%t2_array	--<A>
  Operation: STRUCTOP_STRUCT		--(1)
   Operation: STRUCTOP_STRUCT		--(2)
    Operation: OP_VAR_VALUE		--(3)
     Symbol: var8
     Block: 0x3980ac0
    String: ptr_2
   String: t2_array

And this C expression:

  (gdb) ptype ptr && ptr->a == 3	--<B>
  Operation: BINOP_LOGICAL_AND		--(4)
   Operation: OP_VAR_VALUE		--(5)
    Symbol: ptr
    Block: 0x45a2a00
   Operation: BINOP_EQUAL		--(6)
    Operation: STRUCTOP_PTR		--(7)
     Operation: OP_VAR_VALUE		--(8)
      Symbol: ptr
      Block: 0x45a2a00
     String: a
    Operation: OP_LONG			--(9)
     Type: int
     Constant: 0x0000000000000003

In expression <A> we should assume that t2_array is of dynamic type.
Nothing has dynamic type in expression <B>.

This is how GDB currently handles expression <A>, in all cases,
EVAL_AVOID_SIDE_EFFECTS or EVAL_NORMAL, an OP_VAR_VALUE operation
always returns the real value of the symbol, this is not forced to a
zero value even in EVAL_AVOID_SIDE_EFFECTS mode.  This means that (3),
(5), and (8) will always return a real lazy value for the symbol.

However a STRUCTOP_STRUCT will always replace its result with a
non-lazy, zero value with the same type as its result.  So (2) will
lookup the field ptr_2 and create a zero value with that type.  In
this case the type is a pointer to a dynamic type.

Then, when we evaluate (1) to figure out the resolved type of
t2_array, we need to read the types dynamic properties.  These
properties are stored in memory relative to the objects base address,
and the base address is in var8%ptr_2, which we already figured out
has the value zero.  GDB then evaluates the DWARF expressions that
take the base address, add an offset and dereference.  GDB then ends
up trying to access addresses like 0x16, 0x8, etc.

To fix this, I proposed changing STRUCTOP_STRUCT so that instead of
returning a zero value we instead returned the actual value
representing the structure's field in the target.  My thinking was
that GDB would not try to access the value's contents unless it needed
it to resolve a dynamic type.  This belief was incorrect.

Consider expression <B>.  We already know that (5) and (8) will return
real values for the symbols being referenced.  The BINOP_LOGICAL_AND,
operation (4) will evaluate both of its children in
EVAL_AVOID_SIDE_EFFECTS in order to get the types, this is required
for C++ operator lookup.  This means that even if the value of (5)
would result in the BINOP_LOGICAL_AND returning false (say, ptr is
NULL), we still evaluate (6) in EVAL_AVOID_SIDE_EFFECTS mode.

Operation (6) will evaluate both children in EVAL_AVOID_SIDE_EFFECTS
mode, operation (9) is easy, it just returns a value with the constant
packed into it, but (7) is where the problem lies.  Currently in GDB
this STRUCTOP_STRUCT will always return a non-lazy zero value of the
correct type.

When the results of (7) and (9) are back in the BINOP_LOGICAL_AND
operation (6), the two values are passed to value_equal which performs
the comparison and returns a result.  Note, the two things compared
here are the immediate value (9), and a non-lazy zero value from (7).

However, with my proposed patch operation (7) no longer returns a zero
value, instead it returns a lazy value representing the actual value
in target memory.  When we call value_equal in (6) this code causes
GDB to try and fetch the actual value from target memory.  If `ptr` is
NULL then this will cause GDB to access some invalid address at an
offset from zero, this will most likely fail, and cause GDB to throw
an error instead of returning the expected type.

And so, we can now describe the problem that we're facing.  The way
GDB's expression evaluator is currently written we assume, when in
EVAL_AVOID_SIDE_EFFECTS mode, that any value returned from a child
operation can safely have its content read without throwing an
error.  If child operations start returning real values (instead of
the fake zero values), then this is simply not true.

If we wanted to work around this then we would need to rewrite almost
all operations (I would guess) so that EVAL_AVOID_SIDE_EFFECTS mode
does not cause evaluation of an operation to try and read the value of
a child operation.  As an example, consider this current GDB code from
eval.c:

  struct value *
  eval_op_equal (struct type *expect_type, struct expression *exp,
  	       enum noside noside, enum exp_opcode op,
  	       struct value *arg1, struct value *arg2)
  {
    if (binop_user_defined_p (op, arg1, arg2))
      {
        return value_x_binop (arg1, arg2, op, OP_NULL, noside);
      }
    else
      {
        binop_promote (exp->language_defn, exp->gdbarch, &arg1, &arg2);
        int tem = value_equal (arg1, arg2);
        struct type *type = language_bool_type (exp->language_defn,
  					      exp->gdbarch);
        return value_from_longest (type, (LONGEST) tem);
      }
  }

We could change this function to be this:

  struct value *
  eval_op_equal (struct type *expect_type, struct expression *exp,
  	       enum noside noside, enum exp_opcode op,
  	       struct value *arg1, struct value *arg2)
  {
    if (binop_user_defined_p (op, arg1, arg2))
      {
        return value_x_binop (arg1, arg2, op, OP_NULL, noside);
      }
    else
      {
        struct type *type = language_bool_type (exp->language_defn,
  					      exp->gdbarch);
        if (noside == EVAL_AVOID_SIDE_EFFECTS)
  	  return value_zero (type, VALUE_LVAL (arg1));
        else
  	{
  	  binop_promote (exp->language_defn, exp->gdbarch, &arg1, &arg2);
  	  int tem = value_equal (arg1, arg2);
  	  return value_from_longest (type, (LONGEST) tem);
  	}
      }
  }

Now we don't call value_equal unless we really need to.  However, we
would need to make the same, or similar change to almost all
operations, which would be a big task, and might not be a direction we
wanted to take GDB in.

So, for now, I'm proposing we go with the more targeted, Fortran
specific solution, that does the minimal required in order to
correctly resolve the dynamic types.

gdb/ChangeLog:

	* f-exp.h (class fortran_structop_operation): New class.
	* f-exp.y (exp): Create fortran_structop_operation instead of the
	generic structop_operation.
	* f-lang.c (fortran_undetermined::evaluate): Re-evaluate
	expression as EVAL_NORMAL if the result type was dynamic so we can
	extract the actual array bounds.
	(fortran_structop_operation::evaluate): New function.

gdb/testsuite/ChangeLog:

	* gdb.fortran/dynamic-ptype-whatis.exp: New file.
	* gdb.fortran/dynamic-ptype-whatis.f90: New file.
pipcet pushed a commit that referenced this issue Jun 17, 2021
While working on some changes to 'info sources' I ran into a situation
where I was seeing the same source files reported twice in the output
of the 'info sources' command when using either .gdb_index or the
.debug_name index.

I traced the problem back to some caching in
dwarf2_base_index_functions::map_symbol_filenames; when called GDB
caches the set of filenames, but, filesnames are not removed as the
index entries are expanded into full symtabs.  As a result we can end
up seeing filenames reported both from a full symtab _and_ from
a (stale) previously cached index entry.

Now, obviously, when seeing a problem like this the "correct" fix is
to remove the stale entries from the cache, however, I ran a few
experiments to see why this wasn't really hitting us anywhere, and, as
far as I can tell, ::map_symbol_filenames is only called from three
places:

  1. The mi command -file-list-exec-source-files,
  2. The 'info sources' command, and
  3. Filename completion

However, the result of this "bug" is that we will see duplicate
filenames, and readline's completion mechanism already removes
duplicates, so for case #3 we will never see any problems.

Cases #1 and #2 are basically the same, and in each case, to see a
problem we need to ensure we craft the test in a particular way, start
up ensuring we have some unexpected symtabs, then run one of the
commands to populate the cache, then expand one of the symtabs, and
list the sources again.  At this point you'll see duplicate entries in
the results.  Hardly surprising we haven't randomly hit this situation
in testing.

So, considering that use cases #1 and #2 are certainly not "high
performance" code (i.e. I don't think these justify the need for
caching) this leaves use case #3.  Does this use justify the need for
caching?  Well the psymbol_functions::map_symbol_filenames function
doesn't seem to do any extra caching, and within
dwarf2_base_index_functions::map_symbol_filenames, the only expensive
bit appears to be the call to dw2_get_file_names, and this already
does its own caching via this_cu->v.quick->file_names.

The upshot of all this analysis was that I'm not convinced the need
for the additional caching is justified, and so, I propose that to fix
the bug in GDB, I just remove the extra caching (for now).

If we later find that the caching _was_ useful, then we can
reintroduce it, but add it back such that it doesn't reintroduce this
bug.

As I was changing dwarf2_base_index_functions::map_symbol_filenames I
replaced the use of htab_up with std::unordered_set.

Tested using target_boards cc-with-debug-names and dwarf4-gdb-index.

gdb/ChangeLog:

	* dwarf2/read.c: Add 'unordered_set' include.
	(dwarf2_base_index_functions::map_symbol_filenames): Replace
	'visited' hash table with 'qfn_cache' unordered_set.  Remove use
	of per_Bfd->filenames_cache cache, and use function local
	filenames_cache instead.  Reindent.
	* dwarf2/read.h (struct dwarf2_per_bfd) <filenames_cache>: Delete.

gdb/testsuite/ChangeLog:

	* gdb.base/info_sources.exp: Add new tests.
pipcet pushed a commit that referenced this issue Jun 17, 2021
When loading the debug info package
libLLVM.so.10-10.0.1-lp152.30.4.x86_64.debug from openSUSE Leap 15.2, we
run into a dwarf error:
...
$ gdb -q -batch libLLVM.so.10-10.0.1-lp152.30.4.x86_64.debug
Dwarf Error: Cannot not find DIE at 0x18a936e7 \
  [from module libLLVM.so.10-10.0.1-lp152.30.4.x86_64.debug]
...
The DIE @ 0x18a936e7 does in fact exist, and is part of a CU @ 0x18a23e52.
No error message is printed when using -readnow.

What happens is the following:
- a dwarf2_per_cu_data P is created for the CU.
- a dwarf2_cu A is created for the same CU.
- another dwarf2_cu B is created for the same CU.
- the dwarf2_cu B is set in per_objfile->m_dwarf2_cus, such that
  per_objfile->get_cu (P) returns B.
- P->load_all_dies is set to 1.
- all dies are read into the A->partial_dies htab
- dwarf2_cu A is destroyed.
- we try to find the partial_die for the DIE @ 0x18a936e7 in B->partial_dies.
  We can't find it, but do not try to load all dies, because P->load_all_dies
  is already set to 1.
- an error message is generated.

The question is why we're creating dwarf2_cu A and B for the same CU.

The dwarf2_cu A is created here:
...
 (gdb) bt
 #0  dwarf2_cu::dwarf2_cu (this=0x79a9660, per_cu=0x23c0b30,
     per_objfile=0x1ad01b0) at dwarf2/cu.c:38
 #1  0x0000000000675799 in cutu_reader::cutu_reader (this=0x7fffffffd040,
     this_cu=0x23c0b30, per_objfile=0x1ad01b0, abbrev_table=0x0,
     existing_cu=0x0, skip_partial=false) at dwarf2/read.c:6487
 #2  0x0000000000676eb3 in process_psymtab_comp_unit (this_cu=0x23c0b30,
      per_objfile=0x1ad01b0, want_partial_unit=false,
      pretend_language=language_minimal) at dwarf2/read.c:7028
...

And the dwarf2_cu B is created here:
...
 (gdb) bt
 #0  dwarf2_cu::dwarf2_cu (this=0x885e8c0, per_cu=0x23c0b30,
     per_objfile=0x1ad01b0) at dwarf2/cu.c:38
 #1  0x0000000000675799 in cutu_reader::cutu_reader (this=0x7fffffffcc50,
     this_cu=0x23c0b30, per_objfile=0x1ad01b0, abbrev_table=0x0,
     existing_cu=0x0, skip_partial=false) at dwarf2/read.c:6487
 #2  0x0000000000678118 in load_partial_comp_unit (this_cu=0x23c0b30,
     per_objfile=0x1ad01b0, existing_cu=0x0) at dwarf2/read.c:7436
 #3  0x000000000069721d in find_partial_die (sect_off=(unknown: 0x18a55054),
     offset_in_dwz=0, cu=0x0) at dwarf2/read.c:19391
 #4  0x000000000069755b in partial_die_info::fixup (this=0x9096900,
     cu=0xa6a85f0) at dwarf2/read.c:19512
 #5  0x0000000000697586 in partial_die_info::fixup (this=0x8629bb0,
     cu=0xa6a85f0) at dwarf2/read.c:19516
 #6  0x00000000006787b1 in scan_partial_symbols (first_die=0x8629b40,
     lowpc=0x7fffffffcf58, highpc=0x7fffffffcf50, set_addrmap=0, cu=0x79a9660)
     at dwarf2/read.c:7563
 #7  0x0000000000678878 in scan_partial_symbols (first_die=0x796ebf0,
     lowpc=0x7fffffffcf58, highpc=0x7fffffffcf50, set_addrmap=0, cu=0x79a9660)
     at dwarf2/read.c:7580
 bminor#8  0x0000000000676b82 in process_psymtab_comp_unit_reader
     (reader=0x7fffffffd040, info_ptr=0x7fffc1b3f29b, comp_unit_die=0x6ea90f0,
     pretend_language=language_minimal) at dwarf2/read.c:6954
 bminor#9  0x0000000000676ffd in process_psymtab_comp_unit (this_cu=0x23c0b30,
     per_objfile=0x1ad01b0, want_partial_unit=false,
     pretend_language=language_minimal) at dwarf2/read.c:7057
...

So in frame bminor#9, a cutu_reader is created with dwarf2_cu A.  Then a fixup takes
us to the following CU @ 0x18aa33d6, in frame #5.  And a similar fixup in
frame #4 takes us back to CU @ 0x18a23e52.  At that point, there's no
information available that we're already trying to read that CU, and we end up
creating another cutu_reader with dwarf2_cu B.

It seems that there are two related problems:
- creating two dwarf2_cu's is not optimal
- the unoptimal case is not handled correctly

This patch addresses the last problem, by moving the load_all_dies flag from
dwarf2_per_cu_data to dwarf2_cu, such that it is paired with the partial_dies
field, which ensures that the two can be kept in sync.

Tested on x86_64-linux.

gdb/ChangeLog:

2021-05-27  Tom de Vries  <tdevries@suse.de>

	PR symtab/27898
	* dwarf2/cu.c (dwarf2_cu::dwarf2_cu): Add load_all_dies init.
	* dwarf2/cu.h (dwarf2_cu): Add load_all_dies field.
	* dwarf2/read.c (load_partial_dies, find_partial_die): Update.
	* dwarf2/read.h (dwarf2_per_cu_data::dwarf2_per_cu_data): Remove
	load_all_dies init.
	(dwarf2_per_cu_data): Remove load_all_dies field.
pipcet pushed a commit that referenced this issue Jun 17, 2021
Building GDB with current git (future 13) Clang runs into these two
issues:

#1:

 src/gdb/symtab.h:1139:3: error: definition of implicit copy assignment operator for 'symbol' is deprecated because it has a user-declared copy constructor [-Werror,-Wdeprecated-copy]
   symbol (const symbol &) = default;
   ^

#2:

 src/gdb/dwarf2/read.c:834:23: error: definition of implicit copy constructor for 'partial_die_info' is deprecated because it has a user-declared copy assignment operator [-Werror,-Wdeprecated-copy]
     partial_die_info& operator=(const partial_die_info& rhs) = delete;
		       ^

Fix them by adding the explicit defaulted versions of copy ctor and
copy-assign op appropriately.

gdb/ChangeLog:
yyyy-mm-dd  Pedro Alves  <pedro@palves.net>

	* dwarf2/read.c (struct partial_die_info): Add defaulted copy
	ctor.
	* symtab.h (struct symbol): Add defaulted copy assignment
	operator.
pipcet pushed a commit that referenced this issue Jun 17, 2021
… when attaching / handling a fork child

When trying to attach to a pthread process on a Linux system with glibc 2.33,
we get:

    $ ./gdb -q -nx --data-directory=data-directory -p 1472010
    Attaching to process 1472010
    [New LWP 1472013]
    [New LWP 1472014]
    [New LWP 1472015]
    Error while reading shared library symbols for /usr/lib/libpthread.so.0:
    Cannot find user-level thread for LWP 1472015: generic error
    0x00007ffff6d3637f in poll () from /usr/lib/libc.so.6
    (gdb)

When attaching to a process (or handling a fork child, an operation very
similar to attaching), GDB reads the shared library list from the
process.  For each shared library (if "set auto-solib-add" is on), it
reads its symbols and calls the "new_objfile" observable.

The libthread-db code monitors this observable, and if it sees an
objfile named somewhat like "libpthread.so" go by, it tries to load
libthread_db.so in the GDB process itself.  libthread_db knows how to
navigate libpthread's data structures to get information about the
existing threads.

To locate these data structures, libthread_db calls ps_pglobal_lookup
(implemented in proc-service.c), passing in a symbol name and expecting
an address in return.

Before glibc 2.33, libthread_db always asked for symbols found in
libpthread.  There was no ordering problem: since we were always trying
to load libthread_db in reaction to processing libpthread (and reading
in its symbols) and libthread_db only asked symbols from libpthread, the
requested symbols could always be found.  Starting with glibc 2.33,
libthread_db now asks for a symbol name that can be found in
/lib/ld-linux-x86-64.so.2 (_rtld_global).  And the ordering in which GDB
reads the shared libraries from the inferior when attaching is
unfortunate, in that libpthread is processed before ld-linux.  So when
loading libthread_db in reaction to processing libpthread, and
libthread_db requests the symbol that is from ld-linux, GDB is not yet
able to supply it.

That problematic symbol lookup happens in the thread_from_lwp function,
when we call td_ta_map_lwp2thr_p, and an exception is thrown at this
point:

    #0  0x00007ffff6681012 in __cxxabiv1::__cxa_throw (obj=0x60e000006100, tinfo=0x555560033b50 <typeinfo for gdb_exception_error>, dest=0x55555d9404bc <gdb_exception_error::~gdb_exception_error()>) at /build/gcc/src/gcc/libstdc++-v3/libsupc++/eh_throw.cc:78
    #1  0x000055555e5d3734 in throw_it(return_reason, errors, const char *, typedef __va_list_tag __va_list_tag *) (reason=RETURN_ERROR, error=GENERIC_ERROR, fmt=0x55555f0c5360 "Cannot find user-level thread for LWP %ld: %s", ap=0x7fffffffaae0) at /home/simark/src/binutils-gdb/gdbsupport/common-exceptions.cc:200
    #2  0x000055555e5d37d4 in throw_verror (error=GENERIC_ERROR, fmt=0x55555f0c5360 "Cannot find user-level thread for LWP %ld: %s", ap=0x7fffffffaae0) at /home/simark/src/binutils-gdb/gdbsupport/common-exceptions.cc:208
    #3  0x000055555e0b0ed2 in verror (string=0x55555f0c5360 "Cannot find user-level thread for LWP %ld: %s", args=0x7fffffffaae0) at /home/simark/src/binutils-gdb/gdb/utils.c:171
    #4  0x000055555e5e898a in error (fmt=0x55555f0c5360 "Cannot find user-level thread for LWP %ld: %s") at /home/simark/src/binutils-gdb/gdbsupport/errors.cc:43
    #5  0x000055555d06b4bc in thread_from_lwp (stopped=0x617000035d80, ptid=...) at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:418
    #6  0x000055555d07040d in try_thread_db_load_1 (info=0x60c000011140) at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:912
    #7  0x000055555d071103 in try_thread_db_load (library=0x55555f0c62a0 "libthread_db.so.1", check_auto_load_safe=false) at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1014
    bminor#8  0x000055555d072168 in try_thread_db_load_from_sdir () at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1091
    bminor#9  0x000055555d072d1c in thread_db_load_search () at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1146
    bminor#10 0x000055555d07365c in thread_db_load () at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1203
    bminor#11 0x000055555d07373e in check_for_thread_db () at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1246
    bminor#12 0x000055555d0738ab in thread_db_new_objfile (objfile=0x61300000c0c0) at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1275
    #13 0x000055555bd10740 in std::__invoke_impl<void, void (*&)(objfile*), objfile*> (__f=@0x616000068d88: 0x55555d073745 <thread_db_new_objfile(objfile*)>) at /usr/include/c++/10.2.0/bits/invoke.h:60
    #14 0x000055555bd02096 in std::__invoke_r<void, void (*&)(objfile*), objfile*> (__fn=@0x616000068d88: 0x55555d073745 <thread_db_new_objfile(objfile*)>) at /usr/include/c++/10.2.0/bits/invoke.h:153
    #15 0x000055555bce0392 in std::_Function_handler<void (objfile*), void (*)(objfile*)>::_M_invoke(std::_Any_data const&, objfile*&&) (__functor=..., __args#0=@0x7fffffffb4a0: 0x61300000c0c0) at /usr/include/c++/10.2.0/bits/std_function.h:291
    #16 0x000055555d3595c0 in std::function<void (objfile*)>::operator()(objfile*) const (this=0x616000068d88, __args#0=0x61300000c0c0) at /usr/include/c++/10.2.0/bits/std_function.h:622
    #17 0x000055555d356b7f in gdb::observers::observable<objfile*>::notify (this=0x555566727020 <gdb::observers::new_objfile>, args#0=0x61300000c0c0) at /home/simark/src/binutils-gdb/gdb/../gdbsupport/observable.h:106
    #18 0x000055555da3f228 in symbol_file_add_with_addrs (abfd=0x61200001ccc0, name=0x6190000d9090 "/usr/lib/libpthread.so.0", add_flags=..., addrs=0x7fffffffbc10, flags=..., parent=0x0) at /home/simark/src/binutils-gdb/gdb/symfile.c:1131
    #19 0x000055555da3f763 in symbol_file_add_from_bfd (abfd=0x61200001ccc0, name=0x6190000d9090 "/usr/lib/libpthread.so.0", add_flags=<error reading variable: Cannot access memory at address 0xffffffffffffffb0>, addrs=0x7fffffffbc10, flags=<error reading variable: Cannot access memory at address 0xffffffffffffffc0>, parent=0x0) at /home/simark/src/binutils-gdb/gdb/symfile.c:1167
    #20 0x000055555d95f9fa in solib_read_symbols (so=0x6190000d8e80, flags=...) at /home/simark/src/binutils-gdb/gdb/solib.c:681
    #21 0x000055555d96233d in solib_add (pattern=0x0, from_tty=0, readsyms=1) at /home/simark/src/binutils-gdb/gdb/solib.c:987
    #22 0x000055555d93646e in enable_break (info=0x608000008f20, from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib-svr4.c:2238
    #23 0x000055555d93cfc0 in svr4_solib_create_inferior_hook (from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib-svr4.c:3049
    #24 0x000055555d96610d in solib_create_inferior_hook (from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib.c:1195
    #25 0x000055555cdee318 in post_create_inferior (from_tty=0) at /home/simark/src/binutils-gdb/gdb/infcmd.c:318
    #26 0x000055555ce00e6e in setup_inferior (from_tty=0) at /home/simark/src/binutils-gdb/gdb/infcmd.c:2439
    #27 0x000055555ce59c34 in handle_one (event=...) at /home/simark/src/binutils-gdb/gdb/infrun.c:4887
    #28 0x000055555ce5cd00 in stop_all_threads () at /home/simark/src/binutils-gdb/gdb/infrun.c:5064
    #29 0x000055555ce7f0da in stop_waiting (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:8006
    #30 0x000055555ce67f5c in handle_signal_stop (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:6062
    #31 0x000055555ce63653 in handle_inferior_event (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:5727
    #32 0x000055555ce4f297 in fetch_inferior_event () at /home/simark/src/binutils-gdb/gdb/infrun.c:4105
    #33 0x000055555cdbe3bf in inferior_event_handler (event_type=INF_REG_EVENT) at /home/simark/src/binutils-gdb/gdb/inf-loop.c:42
    #34 0x000055555d018047 in handle_target_event (error=0, client_data=0x0) at /home/simark/src/binutils-gdb/gdb/linux-nat.c:4060
    #35 0x000055555e5ea77e in handle_file_event (file_ptr=0x60600008b1c0, ready_mask=1) at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:575
    #36 0x000055555e5eb09c in gdb_wait_for_event (block=0) at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:701
    #37 0x000055555e5e8d19 in gdb_do_one_event () at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:212
    #38 0x000055555dd6e0d4 in wait_sync_command_done () at /home/simark/src/binutils-gdb/gdb/top.c:528
    #39 0x000055555dd6e372 in maybe_wait_sync_command_done (was_sync=0) at /home/simark/src/binutils-gdb/gdb/top.c:545
    #40 0x000055555d0ec7c8 in catch_command_errors (command=0x55555ce01bb8 <attach_command(char const*, int)>, arg=0x7fffffffe28d "1472010", from_tty=1, do_bp_actions=false) at /home/simark/src/binutils-gdb/gdb/main.c:452
    #41 0x000055555d0f03ad in captured_main_1 (context=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1149
    #42 0x000055555d0f1239 in captured_main (data=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1232
    #43 0x000055555d0f1315 in gdb_main (args=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1257
    #44 0x000055555bb70cf9 in main (argc=7, argv=0x7fffffffde88) at /home/simark/src/binutils-gdb/gdb/gdb.c:32

The exception is caught here:

    #0  __cxxabiv1::__cxa_begin_catch (exc_obj_in=0x60e0000060e0) at /build/gcc/src/gcc/libstdc++-v3/libsupc++/eh_catch.cc:84
    #1  0x000055555d95fded in solib_read_symbols (so=0x6190000d8e80, flags=...) at /home/simark/src/binutils-gdb/gdb/solib.c:689
    #2  0x000055555d96233d in solib_add (pattern=0x0, from_tty=0, readsyms=1) at /home/simark/src/binutils-gdb/gdb/solib.c:987
    #3  0x000055555d93646e in enable_break (info=0x608000008f20, from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib-svr4.c:2238
    #4  0x000055555d93cfc0 in svr4_solib_create_inferior_hook (from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib-svr4.c:3049
    #5  0x000055555d96610d in solib_create_inferior_hook (from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib.c:1195
    #6  0x000055555cdee318 in post_create_inferior (from_tty=0) at /home/simark/src/binutils-gdb/gdb/infcmd.c:318
    #7  0x000055555ce00e6e in setup_inferior (from_tty=0) at /home/simark/src/binutils-gdb/gdb/infcmd.c:2439
    bminor#8  0x000055555ce59c34 in handle_one (event=...) at /home/simark/src/binutils-gdb/gdb/infrun.c:4887
    bminor#9  0x000055555ce5cd00 in stop_all_threads () at /home/simark/src/binutils-gdb/gdb/infrun.c:5064
    bminor#10 0x000055555ce7f0da in stop_waiting (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:8006
    bminor#11 0x000055555ce67f5c in handle_signal_stop (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:6062
    bminor#12 0x000055555ce63653 in handle_inferior_event (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:5727
    #13 0x000055555ce4f297 in fetch_inferior_event () at /home/simark/src/binutils-gdb/gdb/infrun.c:4105
    #14 0x000055555cdbe3bf in inferior_event_handler (event_type=INF_REG_EVENT) at /home/simark/src/binutils-gdb/gdb/inf-loop.c:42
    #15 0x000055555d018047 in handle_target_event (error=0, client_data=0x0) at /home/simark/src/binutils-gdb/gdb/linux-nat.c:4060
    #16 0x000055555e5ea77e in handle_file_event (file_ptr=0x60600008b1c0, ready_mask=1) at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:575
    #17 0x000055555e5eb09c in gdb_wait_for_event (block=0) at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:701
    #18 0x000055555e5e8d19 in gdb_do_one_event () at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:212
    #19 0x000055555dd6e0d4 in wait_sync_command_done () at /home/simark/src/binutils-gdb/gdb/top.c:528
    #20 0x000055555dd6e372 in maybe_wait_sync_command_done (was_sync=0) at /home/simark/src/binutils-gdb/gdb/top.c:545
    #21 0x000055555d0ec7c8 in catch_command_errors (command=0x55555ce01bb8 <attach_command(char const*, int)>, arg=0x7fffffffe28d "1472010", from_tty=1, do_bp_actions=false) at /home/simark/src/binutils-gdb/gdb/main.c:452
    #22 0x000055555d0f03ad in captured_main_1 (context=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1149
    #23 0x000055555d0f1239 in captured_main (data=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1232
    #24 0x000055555d0f1315 in gdb_main (args=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1257
    #25 0x000055555bb70cf9 in main (argc=7, argv=0x7fffffffde88) at /home/simark/src/binutils-gdb/gdb/gdb.c:32

Catching the exception at this point means that the thread_db_info
object for this inferior will be left in place, despite the failure to
load libthread_db.  This means that there won't be further attempts at
loading libthread_db, because thread_db_load will think that
libthread_db is already loaded for this inferior and will always exit
early.  To fix this, add a try/catch around calling try_thread_db_load_1
in try_thread_db_load, such that if some exception is thrown while
trying to load libthread_db, we reset / delete the thread_db_info for
that inferior.  That alone makes attach work fine again, because
check_for_thread_db is called again in the thread_db_inferior_created
observer (that happens after we learned about all shared libraries and
their symbols), and libthread_db is successfully loaded then.

When attaching, I think that the inferior_created observer is a good
place to try to load libthread_db: it is called once everything has
stabilized, when we learned about all shared libraries.

The only problem then is that when we first try (and fail) to load
libthread_db, in reaction to learning about libpthread, we show this
warning:

    warning: Unable to find libthread_db matching inferior's thread library, thread debugging will not be available.

This is misleading, because we do succeed in loading it later.  So when
attaching, I think we shouldn't try to load libthread_db in reaction to
the new_objfile events, we should wait until we have learned about all
shared libraries (using the inferior_created observable).  To do so, add
an `in_initial_library_scan` flag to struct inferior.  This flag is used
to postpone loading libthread_db if we are attaching or handling a fork
child.

When debugging remotely with GDBserver, the same problem happens, except
that the qSymbol mechanism (allowing the remote side to ask GDB for
symbols values) is involved.  The fix there is the same idea, we make
GDB wait until all shared libraries and their symbols are known before
sending out a qSymbol packet.  This way, we never present the remote
side a state where libpthread.so's symbols are known but ld-linux's
symbols aren't.

gdb/ChangeLog:

	* inferior.h (class inferior) <in_initial_library_scan>: New.
	* infcmd.c (post_create_inferior): Set in_initial_library_scan.
	* infrun.c (follow_fork_inferior): Likewise.
	* linux-thread-db.c (try_thread_db_load): Catch exception thrown
	by try_thread_db_load_1
	(thread_db_load): Return early if in_initial_library_scan is
	set.
	* remote.c (remote_new_objfile): Return early if
	in_initial_library_scan is set.

Change-Id: I7a279836cfbb2b362b4fde11b196b4aab82f5efb
pipcet pushed a commit that referenced this issue Jun 17, 2021
One consequence of changing libpthread_name_p() in solib.c to (also)
match libc is that the symbols for libc will now be loaded by
solib_add() in solib.c.  I think this is mostly harmless because
we'll likely want these symbols to be loaded anyway, but it did cause
two failures in gdb.base/print-symbol-loading.exp.

Specifically...

1)

sharedlibrary .*
(gdb) PASS: gdb.base/print-symbol-loading.exp: shlib off: load shared-lib

now looks like this:

sharedlibrary .*
Symbols already loaded for /lib64/libc.so.6
(gdb) PASS: gdb.base/print-symbol-loading.exp: shlib off: load shared-lib

2)

sharedlibrary .*
Loading symbols for shared libraries: .*
(gdb) PASS: gdb.base/print-symbol-loading.exp: shlib brief: load shared-lib

now looks like this:

sharedlibrary .*
Loading symbols for shared libraries: .*
Symbols already loaded for /lib64/libc.so.6
(gdb) PASS: gdb.base/print-symbol-loading.exp: shlib brief: load shared-lib

Fixing case #2 ended up being easier than #1.  #1 had been using
gdb_test_no_output to correctly match this no-output case.  I
ended up replacing it with gdb_test_multiple, matching the exact
expected output for each of the two now acceptable cases.

For case #2, I simply added an optional non-capturing group
for the potential new output.

gdb/testsuite/ChangeLog:

	* gdb.base/print-symbol-loading.exp (proc test_load_shlib):
	Allow "Symbols already loaded for..." messages.
pipcet pushed a commit that referenced this issue Jun 30, 2021
When loading a mach-o (macOS) executable and trying to set a breakpoint,
a GDB built with ASan or -D_GLIBCXX_DEBUG will crash with an
out-of-bound vector access.  This can be reproduced on Linux using the
repro files in bug 28017 [1]:

    $ ./gdb -nx --data-directory=data-directory -q repro/test -ex "b main" -batch
    /usr/include/c++/11.1.0/debug/vector:445:
    In function:
        std::__debug::vector<_Tp, _Allocator>::const_reference
        std::__debug::vector<_Tp,
        _Allocator>::operator[](std::__debug::vector<_Tp,
        _Allocator>::size_type) const [with _Tp = long unsigned int; _Allocator
        = std::allocator<long unsigned int>; std::__debug::vector<_Tp,
        _Allocator>::const_reference = const long unsigned int&;
        std::__debug::vector<_Tp, _Allocator>::size_type = long unsigned int]

    Error: attempt to subscript container with out-of-bounds index 13, but
    container only holds 13 elements.

    Objects involved in the operation:
        sequence "this" @ 0x0x61300000a590 {
          type = std::__debug::vector<unsigned long, std::allocator<unsigned long> >;
        }

The out-of-bound access happens here:

    #0  0x00007ffff6405d22 in raise () from /usr/lib/libc.so.6
    #1  0x00007ffff63ef862 in abort () from /usr/lib/libc.so.6
    #2  0x00007ffff664e21e in __gnu_debug::_Error_formatter::_M_error() const [clone .cold] from /usr/lib/libstdc++.so.6
    #3  0x000055555699e5ff in std::__debug::vector<unsigned long, std::allocator<unsigned long> >::operator[] (this=0x61300000a590, __n=13) at /usr/include/c++/11.1.0/debug/vector:445
    #4  0x0000555556a58c17 in objfile::section_offset (this=0x61300000a4c0, section=0x55555bbe4ac0 <_bfd_std_section>) at /home/simark/src/binutils-gdb/gdb/objfiles.h:644
    #5  0x0000555556a58cac in obj_section::offset (this=0x62100016d2a8) at /home/simark/src/binutils-gdb/gdb/objfiles.h:838
    #6  0x0000555556a58cfa in obj_section::addr (this=0x62100016d2a8) at /home/simark/src/binutils-gdb/gdb/objfiles.h:850
    #7  0x000055555779f5f7 in sort_cmp (sect1=0x62100016d2a8, sect2=0x62100016d170) at /home/simark/src/binutils-gdb/gdb/objfiles.c:902
    bminor#8  0x00005555577aae35 in __gnu_cxx::__ops::_Iter_comp_iter<bool (*)(obj_section const*, obj_section const*)>::operator()<obj_section**, obj_section**> (this=0x7fffffffa9e0, __it1=0x60c000015970, __it2=0x60c000015940) at /usr/include/c++/11.1.0/bits/predefined_ops.h:158
    bminor#9  0x00005555577aa2b8 in std::__insertion_sort<obj_section**, __gnu_cxx::__ops::_Iter_comp_iter<bool (*)(obj_section const*, obj_section const*)> > (__first=0x60c000015940, __last=0x60c0000159c0, __comp=...) at /usr/include/c++/11.1.0/bits/stl_algo.h:1826
    bminor#10 0x00005555577a8e26 in std::__final_insertion_sort<obj_section**, __gnu_cxx::__ops::_Iter_comp_iter<bool (*)(obj_section const*, obj_section const*)> > (__first=0x60c000015940, __last=0x60c0000159c0, __comp=...) at /usr/include/c++/11.1.0/bits/stl_algo.h:1871
    bminor#11 0x00005555577a723c in std::__sort<obj_section**, __gnu_cxx::__ops::_Iter_comp_iter<bool (*)(obj_section const*, obj_section const*)> > (__first=0x60c000015940, __last=0x60c0000159c0, __comp=...) at /usr/include/c++/11.1.0/bits/stl_algo.h:1957
    bminor#12 0x00005555577a50f4 in std::sort<obj_section**, bool (*)(obj_section const*, obj_section const*)> (__first=0x60c000015940, __last=0x60c0000159c0, __comp=0x55555779f4e7 <sort_cmp(obj_section const*, obj_section const*)>) at /usr/include/c++/11.1.0/bits/stl_algo.h:4875
    #13 0x00005555577a147e in update_section_map (pspace=0x61200001d2c0, pmap=0x6030000d40b0, pmap_size=0x6030000d40b8) at /home/simark/src/binutils-gdb/gdb/objfiles.c:1165
    #14 0x00005555577a19a0 in find_pc_section (pc=0x100003fa0) at /home/simark/src/binutils-gdb/gdb/objfiles.c:1212
    #15 0x00005555576dd39e in lookup_minimal_symbol_by_pc_section (pc_in=0x100003fa0, section=0x0, prefer=lookup_msym_prefer::TEXT, previous=0x0) at /home/simark/src/binutils-gdb/gdb/minsyms.c:750
    #16 0x00005555576de552 in lookup_minimal_symbol_by_pc (pc=0x100003fa0) at /home/simark/src/binutils-gdb/gdb/minsyms.c:986
    #17 0x0000555557d44b54 in find_pc_sect_line (pc=0x100003fa0, section=0x62100016d170, notcurrent=0) at /home/simark/src/binutils-gdb/gdb/symtab.c:3163
    #18 0x0000555557d489fa in find_function_start_sal_1 (func_addr=0x100003fa0, section=0x62100016d170, funfirstline=true) at /home/simark/src/binutils-gdb/gdb/symtab.c:3650
    #19 0x0000555557d49015 in find_function_start_sal (sym=0x621000191670, funfirstline=true) at /home/simark/src/binutils-gdb/gdb/symtab.c:3706
    #20 0x0000555557485283 in symbol_to_sal (result=0x7fffffffbb30, funfirstline=1, sym=0x621000191670) at /home/simark/src/binutils-gdb/gdb/linespec.c:4460
    #21 0x00005555574728c2 in convert_linespec_to_sals (state=0x7fffffffc390, ls=0x7fffffffc3e0) at /home/simark/src/binutils-gdb/gdb/linespec.c:2335
    #22 0x0000555557475a8e in parse_linespec (parser=0x7fffffffc360, arg=0x60200007a550 "main", match_type=symbol_name_match_type::WILD) at /home/simark/src/binutils-gdb/gdb/linespec.c:2716
    #23 0x0000555557479027 in event_location_to_sals (parser=0x7fffffffc360, location=0x606000097be0) at /home/simark/src/binutils-gdb/gdb/linespec.c:3173
    #24 0x00005555574798f7 in decode_line_full (location=0x606000097be0, flags=1, search_pspace=0x0, default_symtab=0x0, default_line=0, canonical=0x7fffffffcca0, select_mode=0x0, filter=0x0) at /home/simark/src/binutils-gdb/gdb/linespec.c:3253
    #25 0x0000555556b4949f in parse_breakpoint_sals (location=0x606000097be0, canonical=0x7fffffffcca0) at /home/simark/src/binutils-gdb/gdb/breakpoint.c:9134
    #26 0x0000555556b6ce95 in create_sals_from_location_default (location=0x606000097be0, canonical=0x7fffffffcca0, type_wanted=bp_breakpoint) at /home/simark/src/binutils-gdb/gdb/breakpoint.c:13819
    #27 0x0000555556b645a6 in bkpt_create_sals_from_location (location=0x606000097be0, canonical=0x7fffffffcca0, type_wanted=bp_breakpoint) at /home/simark/src/binutils-gdb/gdb/breakpoint.c:12631
    #28 0x0000555556b4badf in create_breakpoint (gdbarch=0x621000152d10, location=0x606000097be0, cond_string=0x0, thread=0, extra_string=0x0, force_condition=false, parse_extra=1, tempflag=0, type_wanted=bp_breakpoint, ignore_count=0, pending_break_support=AUTO_BOOLEAN_AUTO, ops=0x55555bd728a0 <bkpt_breakpoint_ops>, from_tty=0, enabled=1, internal=0, flags=0) at /home/simark/src/binutils-gdb/gdb/breakpoint.c:9410
    #29 0x0000555556b4d3b1 in break_command_1 (arg=0x7fffffffe291 "", flag=0, from_tty=0) at /home/simark/src/binutils-gdb/gdb/breakpoint.c:9590
    #30 0x0000555556b4dc1b in break_command (arg=0x7fffffffe28d "main", from_tty=0) at /home/simark/src/binutils-gdb/gdb/breakpoint.c:9660
    #31 0x0000555556d24ca9 in do_const_cfunc (c=0x61100003a240, args=0x7fffffffe28d "main", from_tty=0) at /home/simark/src/binutils-gdb/gdb/cli/cli-decode.c:102
    #32 0x0000555556d2fcd3 in cmd_func (cmd=0x61100003a240, args=0x7fffffffe28d "main", from_tty=0) at /home/simark/src/binutils-gdb/gdb/cli/cli-decode.c:2160
    #33 0x0000555557e84e93 in execute_command (p=0x7fffffffe290 "n", from_tty=0) at /home/simark/src/binutils-gdb/gdb/top.c:674
    #34 0x00005555575a9933 in catch_command_errors (command=0x555557e84043 <execute_command(char const*, int)>, arg=0x7fffffffe28b "b main", from_tty=0, do_bp_actions=true) at /home/simark/src/binutils-gdb/gdb/main.c:523
    #35 0x00005555575a9fdb in execute_cmdargs (cmdarg_vec=0x7fffffffd910, file_type=CMDARG_FILE, cmd_type=CMDARG_COMMAND, ret=0x7fffffffd5b0) at /home/simark/src/binutils-gdb/gdb/main.c:618
    #36 0x00005555575ad48a in captured_main_1 (context=0x7fffffffdd00) at /home/simark/src/binutils-gdb/gdb/main.c:1322
    #37 0x00005555575ada9c in captured_main (data=0x7fffffffdd00) at /home/simark/src/binutils-gdb/gdb/main.c:1343
    #38 0x00005555575adb31 in gdb_main (args=0x7fffffffdd00) at /home/simark/src/binutils-gdb/gdb/main.c:1368
    #39 0x000055555681e179 in main (argc=8, argv=0x7fffffffde78) at /home/simark/src/binutils-gdb/gdb/gdb.c:32

The section being dealt with at that moment is the special *COM*
section:

    (top-gdb) p section.name
    $1 = 0x55555a1bbe60 "*COM*"
    (top-gdb) p section
    $2 = (bfd_section *) 0x55555bbe4ac0 <_bfd_std_section>

I'm not too sure what this section is for, but this is one of four
special BFD sections that GDB puts after the regular sections in the
objfile::sections and objfile::section_offsets lists.  You can check
gdb_bfd_section_index to see how they are handled.
gdb_bfd_count_sections returns "+ 4" to account for those sections.

The problem is that macho_symfile_offsets uses bfd_count_sections
instead of gdb_bfd_count_sections when allocating the
objfile::section_offsets vector.  The vector will therefore contain,
say, 13 elements instead of 17.  When trying to access the section
offset of the *COM* section, the first after the regular sections, we
access section_offsets[13], which is out of bounds.

Fix that by using gdb_bfd_count_sections instead of bfd_count_sections.
I'm fairly confident that this is correct, as this is what
default_symfile_offsets does.

With this patch, the command shown above terminates normally:

    $ ./gdb -nx --data-directory=data-directory -q repro/test -ex "b main" -batch
    Breakpoint 1 at 0x100003fad: file test.c, line 2.

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=28017

gdb/ChangeLog:

	PR gdb/28017
	* machoread.c (macho_symfile_offsets): Use
	gdb_bfd_count_sections to allocate objfile::section_offsets.

Change-Id: Ic3a56f46f7232e9f24581f8255fc1ab981935450
pipcet pushed a commit that referenced this issue Jul 1, 2021
Currently, on GNU/Linux, if you try to access memory and you have a
running thread selected, GDB fails the memory accesses, like:

 (gdb) c&
 Continuing.
 (gdb) p global_var
 Cannot access memory at address 0x555555558010

Or:

 (gdb) b main
 Breakpoint 2 at 0x55555555524d: file access-mem-running.c, line 59.
 Warning:
 Cannot insert breakpoint 2.
 Cannot access memory at address 0x55555555524d

This patch removes this limitation.  It teaches the native Linux
target to read/write memory even if the target is running.  And it
does this without temporarily stopping threads.  We now get:

 (gdb) c&
 Continuing.
 (gdb) p global_var
 $1 = 123
 (gdb) b main
 Breakpoint 2 at 0x555555555259: file access-mem-running.c, line 62.

(The scenarios above work correctly with current GDBserver, because
GDBserver temporarily stops all threads in the process whenever GDB
wants to access memory (see prepare_to_access_memory /
done_accessing_memory).  Freezing the whole process makes sense when
we need to be sure that we have a consistent view of memory and don't
race with the inferior changing it at the same time as GDB is
accessing it.  But I think that's a too-heavy hammer for the default
behavior.  I think that ideally, whether to stop all threads or not
should be policy decided by gdb core, probably best implemented by
exposing something like gdbserver's prepare_to_access_memory /
done_accessing_memory to gdb core.)

Currently, if we're accessing (reading/writing) just a few bytes, then
the Linux native backend does not try accessing memory via
/proc/<pid>/mem and goes straight to ptrace
PTRACE_PEEKTEXT/PTRACE_POKETEXT.  However, ptrace always fails when
the ptracee is running.  So the first step is to prefer
/proc/<pid>/mem even for small accesses.  Without further changes
however, that may cause a performance regression, due to constantly
opening and closing /proc/<pid>/mem for each memory access.  So the
next step is to keep the /proc/<pid>/mem file open across memory
accesses.  If we have this, then it doesn't make sense anymore to even
have the ptrace fallback, so the patch disables it.

I've made it such that GDB only ever has one /proc/<pid>/mem file open
at any time.  As long as a memory access hits the same inferior
process as the previous access, then we reuse the previously open
file.  If however, we access memory of a different process, then we
close the previous file and open a new one for the new process.

If we wanted, we could keep one /proc/<pid>/mem file open per
inferior, and never close them (unless the inferior exits or execs).
However, having seen bfd patches recently about hitting too many open
file descriptors, I kept the logic to have only one file open tops.
Also, we need to handle memory accesses for processes for which we
don't have an inferior object, for when we need to detach a
fork-child, and we'd probaly want to handle caching the open file for
that scenario (no inferior for process) too, which would probably end
up meaning caching for last non-inferior process, which is very much
what I'm proposing anyhow.  So always having one file open likely ends
up a smaller patch.

The next step is handling the case of GDB reading/writing memory
through a thread that is running and exits.  The access should not
result in a user-visible failure if the inferior/process is still
alive.

Once we manage to open a /proc/<lwpid>/mem file, then that file is
usable for memory accesses even if the corresponding lwp exits and is
reaped.  I double checked that trying to open the same
/proc/<lwpid>/mem path again fails because the lwp is really gone so
there's no /proc/<lwpid>/ entry on the filesystem anymore, but the
previously open file remains usable.  It's only when the whole process
execs that we need to reopen a new file.

When the kernel destroys the whole address space, i.e., when the
process exits or execs, the reads/writes fail with 0 aka EOF, in which
case there's nothing else to do than returning a memory access
failure.  Note this means that when we get an exec event, we need to
reopen the file, to access the process's new address space.

If we need to open (or reopen) the /proc/<pid>/mem file, and the LWP
we're opening it for exits before we open it and before we reap the
LWP (i.e., the LWP is zombie), the open fails with EACCES.  The patch
handles this by just looking for another thread until it finds one
that we can open a /proc/<pid>/mem successfully for.

If we need to open (or reopen) the /proc/<pid>/mem file, and the LWP
we're opening has exited and we already reaped it, which is the case
if the selected thread is in THREAD_EXIT state, the open fails with
ENOENT.  The patch handles this the same way as a zombie race
(EACCES), instead of checking upfront whether we're accessing a
known-exited thread, because that would result in more complicated
code, because we also need to handle accessing lwps that are not
listed in the core thread list, and it's the core thread list that
records the THREAD_EXIT state.

The patch includes two testcases:

#1 - gdb.base/access-mem-running.exp

  This is the conceptually simplest - it is single-threaded, and has
  GDB read and write memory while the program is running.  It also
  tests setting a breakpoint while the program is running, and checks
  that the breakpoint is hit immediately.

#2 - gdb.threads/access-mem-running-thread-exit.exp

  This one is more elaborate, as it continuously spawns short-lived
  threads in order to exercise accessing memory just while threads are
  exiting.  It also spawns two different processes and alternates
  accessing memory between the two processes to exercise the reopening
  the /proc file frequently.  This also ends up exercising GDB reading
  from an exited thread frequently.  I confirmed by putting abort()
  calls in the EACCES/ENOENT paths added by the patch that we do hit
  all of them frequently with the testcase.  It also exits the
  process's main thread (i.e., the main thread becomes zombie), to
  make sure accessing memory in such a corner-case scenario works now
  and in the future.

The tests fail on GNU/Linux native before the code changes, and pass
after.  They pass against current GDBserver, again because GDBserver
supports memory access even if all threads are running, by
transparently pausing the whole process.

gdb/ChangeLog:
yyyy-mm-dd  Pedro Alves  <pedro@palves.net>

	PR mi/15729
	PR gdb/13463
	* linux-nat.c (linux_nat_target::detach): Close the
	/proc/<pid>/mem file if it was open for this process.
	(linux_handle_extended_wait) <PTRACE_EVENT_EXEC>: Close the
	/proc/<pid>/mem file if it was open for this process.
	(linux_nat_target::mourn_inferior): Close the /proc/<pid>/mem file
	if it was open for this process.
	(linux_nat_target::xfer_partial): Adjust.  Do not fall back to
	inf_ptrace_target::xfer_partial for memory accesses.
	(last_proc_mem_file): New.
	(maybe_close_proc_mem_file): New.
	(linux_proc_xfer_memory_partial_pid): New, with bits factored out
	from linux_proc_xfer_partial.
	(linux_proc_xfer_partial): Delete.
	(linux_proc_xfer_memory_partial): New.

gdb/testsuite/ChangeLog
yyyy-mm-dd  Pedro Alves  <pedro@palves.net>

	PR mi/15729
	PR gdb/13463
	* gdb.base/access-mem-running.c: New.
	* gdb.base/access-mem-running.exp: New.
	* gdb.threads/access-mem-running-thread-exit.c: New.
	* gdb.threads/access-mem-running-thread-exit.exp: New.

Change-Id: Ib3c082528872662a3fc0ca9b31c34d4876c874c9
pipcet pushed a commit that referenced this issue Jul 5, 2021
When loading a file using the file command on macOS, we get:

    $ ./gdb -nx --data-directory=data-directory -q -ex "file ./test"
    Reading symbols from ./test...
    Reading symbols from /Users/smarchi/build/binutils-gdb/gdb/test.dSYM/Contents/Resources/DWARF/test...
    /Users/smarchi/src/binutils-gdb/gdb/thread.c:72: internal-error: struct thread_info *inferior_thread(): Assertion `current_thread_ != nullptr' failed.
    A problem internal to GDB has been detected,
    further debugging may prove unreliable.
    Quit this debugging session? (y or n)

The backtrace is:

    * frame #0: 0x0000000101fcb826 gdb`internal_error(file="/Users/smarchi/src/binutils-gdb/gdb/thread.c", line=72, fmt="%s: Assertion `%s' failed.") at errors.cc:52:3
      frame #1: 0x00000001018a2584 gdb`inferior_thread() at thread.c:72:3
      frame #2: 0x0000000101469c09 gdb`get_current_regcache() at regcache.c:421:31
      frame #3: 0x00000001015f9812 gdb`darwin_solib_get_all_image_info_addr_at_init(info=0x0000603000006d00) at solib-darwin.c:464:34
      frame #4: 0x00000001015f7a04 gdb`darwin_solib_create_inferior_hook(from_tty=1) at solib-darwin.c:515:5
      frame #5: 0x000000010161205e gdb`solib_create_inferior_hook(from_tty=1) at solib.c:1200:3
      frame #6: 0x00000001016d8f76 gdb`symbol_file_command(args="./test", from_tty=1) at symfile.c:1650:7
      frame #7: 0x0000000100abab17 gdb`file_command(arg="./test", from_tty=1) at exec.c:555:3
      frame bminor#8: 0x00000001004dc799 gdb`do_const_cfunc(c=0x000061100000c340, args="./test", from_tty=1) at cli-decode.c:102:3
      frame bminor#9: 0x00000001004ea042 gdb`cmd_func(cmd=0x000061100000c340, args="./test", from_tty=1) at cli-decode.c:2160:7
      frame bminor#10: 0x00000001018d4f59 gdb`execute_command(p="t", from_tty=1) at top.c:674:2
      frame bminor#11: 0x0000000100eee430 gdb`catch_command_errors(command=(gdb`execute_command(char const*, int) at top.c:561), arg="file ./test", from_tty=1, do_bp_actions=true)(char const*, int), char const*, int, bool) at main.c:523:7
      frame bminor#12: 0x0000000100eee902 gdb`execute_cmdargs(cmdarg_vec=0x00007ffeefbfeba0 size=1, file_type=CMDARG_FILE, cmd_type=CMDARG_COMMAND, ret=0x00007ffeefbfec20) at main.c:618:9
      frame #13: 0x0000000100eed3a4 gdb`captured_main_1(context=0x00007ffeefbff780) at main.c:1322:3
      frame #14: 0x0000000100ee810d gdb`captured_main(data=0x00007ffeefbff780) at main.c:1343:3
      frame #15: 0x0000000100ee8025 gdb`gdb_main(args=0x00007ffeefbff780) at main.c:1368:7
      frame #16: 0x00000001000044f1 gdb`main(argc=6, argv=0x00007ffeefbff8a0) at gdb.c:32:10
      frame #17: 0x00007fff20558f5d libdyld.dylib`start + 1

The solib_create_inferior_hook call in symbol_file_command was added by
commit ea142fb ("Fix breakpoints on file reloads for PIE
binaries").  It causes solib_create_inferior_hook to be called while
the inferior is not running, which darwin_solib_create_inferior_hook
does not expect.  darwin_solib_get_all_image_info_addr_at_init, in
particular, assumes that there is a current thread, as it tries to get
the current thread's regcache.

Fix it by adding a target_has_execution check and returning early.  Note
that there is a similar check in svr4_solib_create_inferior_hook.

gdb/ChangeLog:

	* solib-darwin.c (darwin_solib_create_inferior_hook): Return
	early if no execution.

Change-Id: Ia11dd983a1e29786e5ce663d0fcaa6846dc611bb
pipcet pushed a commit that referenced this issue Jul 16, 2021
Commit 408f668 ("detach in all-stop
with threads running") regressed "detach" with "target remote":

 (gdb) detach
 Detaching from program: target:/any/program, process 3671843
 Detaching from process 3671843
 Ending remote debugging.
 [Inferior 1 (process 3671843) detached]
 In main
 terminate called after throwing an instance of 'gdb_exception_error'
 Aborted (core dumped)

Here's the exception above being thrown:

 (top-gdb) bt
 #0  throw_error (error=TARGET_CLOSE_ERROR, fmt=0x555556035588 "Remote connection closed") at src/gdbsupport/common-exceptions.cc:222
 #1  0x0000555555bbaa46 in remote_target::readchar (this=0x555556a11040, timeout=10000) at src/gdb/remote.c:9440
 #2  0x0000555555bbb9e5 in remote_target::getpkt_or_notif_sane_1 (this=0x555556a11040, buf=0x555556a11058, forever=0, expecting_notif=0, is_notif=0x0) at src/gdb/remote.c:9928
 #3  0x0000555555bbbda9 in remote_target::getpkt_sane (this=0x555556a11040, buf=0x555556a11058, forever=0) at src/gdb/remote.c:10030
 #4  0x0000555555bc0e75 in remote_target::remote_hostio_send_command (this=0x555556a11040, command_bytes=13, which_packet=14, remote_errno=0x7fffffffcfd0, attachment=0x0, attachment_len=0x0) at src/gdb/remote.c:12137
 #5  0x0000555555bc1b6c in remote_target::remote_hostio_close (this=0x555556a11040, fd=8, remote_errno=0x7fffffffcfd0) at src/gdb/remote.c:12455
 #6  0x0000555555bc1bb4 in remote_target::fileio_close (During symbol reading: .debug_line address at offset 0x64f417 is 0 [in module build/gdb/gdb]
 this=0x555556a11040, fd=8, remote_errno=0x7fffffffcfd0) at src/gdb/remote.c:12462
 #7  0x0000555555c9274c in target_fileio_close (fd=3, target_errno=0x7fffffffcfd0) at src/gdb/target.c:3365
 bminor#8  0x000055555595a19d in gdb_bfd_iovec_fileio_close (abfd=0x555556b9f8a0, stream=0x555556b11530) at src/gdb/gdb_bfd.c:439
 bminor#9  0x0000555555e09e3f in opncls_bclose (abfd=0x555556b9f8a0) at src/bfd/opncls.c:599
 bminor#10 0x0000555555e0a2c7 in bfd_close_all_done (abfd=0x555556b9f8a0) at src/bfd/opncls.c:847
 bminor#11 0x0000555555e0a27a in bfd_close (abfd=0x555556b9f8a0) at src/bfd/opncls.c:814
 bminor#12 0x000055555595a9d3 in gdb_bfd_close_or_warn (abfd=0x555556b9f8a0) at src/gdb/gdb_bfd.c:626
 #13 0x000055555595ad29 in gdb_bfd_unref (abfd=0x555556b9f8a0) at src/gdb/gdb_bfd.c:715
 #14 0x0000555555ae4730 in objfile::~objfile (this=0x555556515540, __in_chrg=<optimized out>) at src/gdb/objfiles.c:573
 #15 0x0000555555ae955a in std::_Sp_counted_ptr<objfile*, (__gnu_cxx::_Lock_policy)2>::_M_dispose (this=0x555556c20db0) at /usr/include/c++/9/bits/shared_ptr_base.h:377
 #16 0x000055555572b7c8 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x555556c20db0) at /usr/include/c++/9/bits/shared_ptr_base.h:155
 #17 0x00005555557263c3 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x555556bf0588, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/shared_ptr_base.h:730
 #18 0x0000555555ae745e in std::__shared_ptr<objfile, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x555556bf0580, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/shared_ptr_base.h:1169
 #19 0x0000555555ae747e in std::shared_ptr<objfile>::~shared_ptr (this=0x555556bf0580, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/shared_ptr.h:103
 #20 0x0000555555b1c1dc in __gnu_cxx::new_allocator<std::_List_node<std::shared_ptr<objfile> > >::destroy<std::shared_ptr<objfile> > (this=0x5555564cdd60, __p=0x555556bf0580) at /usr/include/c++/9/ext/new_allocator.h:153
 #21 0x0000555555b1bb1d in std::allocator_traits<std::allocator<std::_List_node<std::shared_ptr<objfile> > > >::destroy<std::shared_ptr<objfile> > (__a=..., __p=0x555556bf0580) at /usr/include/c++/9/bits/alloc_traits.h:497
 #22 0x0000555555b1b73e in std::__cxx11::list<std::shared_ptr<objfile>, std::allocator<std::shared_ptr<objfile> > >::_M_erase (this=0x5555564cdd60, __position=std::shared_ptr<objfile> (expired, weak count 1) = {get() = 0x555556515540}) at /usr/include/c++/9/bits/stl_list.h:1921
 #23 0x0000555555b1afeb in std::__cxx11::list<std::shared_ptr<objfile>, std::allocator<std::shared_ptr<objfile> > >::erase (this=0x5555564cdd60, __position=std::shared_ptr<objfile> (expired, weak count 1) = {get() = 0x555556515540}) at /usr/include/c++/9/bits/list.tcc:158
 #24 0x0000555555b19576 in program_space::remove_objfile (this=0x5555564cdd20, objfile=0x555556515540) at src/gdb/progspace.c:210
 #25 0x0000555555ae4502 in objfile::unlink (this=0x555556515540) at src/gdb/objfiles.c:487
 #26 0x0000555555ae5a12 in objfile_purge_solibs () at src/gdb/objfiles.c:875
 #27 0x0000555555c09686 in no_shared_libraries (ignored=0x0, from_tty=1) at src/gdb/solib.c:1236
 #28 0x00005555559e3f5f in detach_command (args=0x0, from_tty=1) at src/gdb/infcmd.c:2769

So frame #28 already detached the remote process, and then we're
purging the shared libraries.  GDB had opened remote shared libraries
via the target: sysroot, so it tries closing them.  GDBserver is
tearing down already, so remote communication breaks down and we close
the remote target and throw TARGET_CLOSE_ERROR.

Note frame #14:

 #14 0x0000555555ae4730 in objfile::~objfile (this=0x555556515540, __in_chrg=<optimized out>) at src/gdb/objfiles.c:573

That's a dtor, thus noexcept.  That's the reason for the
std::terminate.

Stepping back a bit, why do we still have open remote files if we've
managed to detach already, and, we're debugging with "target remote"?
The reason is that commit 408f668
makes detach_command hold a reference to the target, so the remote
target won't be finally closed until frame #28 returns.  It's closing
the target that invalidates target file I/O handles.

This commit fixes the issue by not relying on target_close to
invalidate the target file I/O handles, instead invalidate them
immediately in remote_unpush_target.  So when GDB purges the solibs,
and we end up in target_fileio_close (frame #7 above), there's nothing
to do, and we don't try to talk with the remote target anymore.

The regression isn't seen when testing with
--target_board=native-gdbserver, because that does "set sysroot" to
disable the "target:" sysroot, for test run speed reasons.  So this
commit adds a testcase that explicitly tests detach with "set sysroot
target:".

gdb/ChangeLog:
yyyy-mm-dd  Pedro Alves  <pedro@palves.net>

	PR gdb/28080
	* remote.c (remote_unpush_target): Invalidate file I/O target
	handles.
	* target.c (fileio_handles_invalidate_target): Make extern.
	* target.h (fileio_handles_invalidate_target): Declare.

gdb/testsuite/ChangeLog:
yyyy-mm-dd  Pedro Alves  <pedro@palves.net>

	PR gdb/28080
	* gdb.base/detach-sysroot-target.exp: New.
	* gdb.base/detach-sysroot-target.c: New.

Reported-By: Jonah Graham <jonah@kichwacoders.com>

Change-Id: I851234910172f42a1b30e731161376c344d2727d
pipcet pushed a commit that referenced this issue Jul 16, 2021
…080)

Before PR gdb/28080 was fixed by the previous patch, GDB was crashing
like this:

 (gdb) detach
 Detaching from program: target:/any/program, process 3671843
 Detaching from process 3671843
 Ending remote debugging.
 [Inferior 1 (process 3671843) detached]
 In main
 terminate called after throwing an instance of 'gdb_exception_error'
 Aborted (core dumped)

Here's the exception above being thrown:

 (top-gdb) bt
 #0  throw_error (error=TARGET_CLOSE_ERROR, fmt=0x555556035588 "Remote connection closed") at src/gdbsupport/common-exceptions.cc:222
 #1  0x0000555555bbaa46 in remote_target::readchar (this=0x555556a11040, timeout=10000) at src/gdb/remote.c:9440
 #2  0x0000555555bbb9e5 in remote_target::getpkt_or_notif_sane_1 (this=0x555556a11040, buf=0x555556a11058, forever=0, expecting_notif=0, is_notif=0x0) at src/gdb/remote.c:9928
 #3  0x0000555555bbbda9 in remote_target::getpkt_sane (this=0x555556a11040, buf=0x555556a11058, forever=0) at src/gdb/remote.c:10030
 #4  0x0000555555bc0e75 in remote_target::remote_hostio_send_command (this=0x555556a11040, command_bytes=13, which_packet=14, remote_errno=0x7fffffffcfd0, attachment=0x0, attachment_len=0x0) at src/gdb/remote.c:12137
 #5  0x0000555555bc1b6c in remote_target::remote_hostio_close (this=0x555556a11040, fd=8, remote_errno=0x7fffffffcfd0) at src/gdb/remote.c:12455
 #6  0x0000555555bc1bb4 in remote_target::fileio_close (During symbol reading: .debug_line address at offset 0x64f417 is 0 [in module build/gdb/gdb]
 this=0x555556a11040, fd=8, remote_errno=0x7fffffffcfd0) at src/gdb/remote.c:12462
 #7  0x0000555555c9274c in target_fileio_close (fd=3, target_errno=0x7fffffffcfd0) at src/gdb/target.c:3365
 bminor#8  0x000055555595a19d in gdb_bfd_iovec_fileio_close (abfd=0x555556b9f8a0, stream=0x555556b11530) at src/gdb/gdb_bfd.c:439
 bminor#9  0x0000555555e09e3f in opncls_bclose (abfd=0x555556b9f8a0) at src/bfd/opncls.c:599
 bminor#10 0x0000555555e0a2c7 in bfd_close_all_done (abfd=0x555556b9f8a0) at src/bfd/opncls.c:847
 bminor#11 0x0000555555e0a27a in bfd_close (abfd=0x555556b9f8a0) at src/bfd/opncls.c:814
 bminor#12 0x000055555595a9d3 in gdb_bfd_close_or_warn (abfd=0x555556b9f8a0) at src/gdb/gdb_bfd.c:626
 #13 0x000055555595ad29 in gdb_bfd_unref (abfd=0x555556b9f8a0) at src/gdb/gdb_bfd.c:715
 #14 0x0000555555ae4730 in objfile::~objfile (this=0x555556515540, __in_chrg=<optimized out>) at src/gdb/objfiles.c:573
 #15 0x0000555555ae955a in std::_Sp_counted_ptr<objfile*, (__gnu_cxx::_Lock_policy)2>::_M_dispose (this=0x555556c20db0) at /usr/include/c++/9/bits/shared_ptr_base.h:377
 #16 0x000055555572b7c8 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x555556c20db0) at /usr/include/c++/9/bits/shared_ptr_base.h:155
 #17 0x00005555557263c3 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x555556bf0588, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/shared_ptr_base.h:730
 #18 0x0000555555ae745e in std::__shared_ptr<objfile, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x555556bf0580, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/shared_ptr_base.h:1169
 #19 0x0000555555ae747e in std::shared_ptr<objfile>::~shared_ptr (this=0x555556bf0580, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/shared_ptr.h:103
 #20 0x0000555555b1c1dc in __gnu_cxx::new_allocator<std::_List_node<std::shared_ptr<objfile> > >::destroy<std::shared_ptr<objfile> > (this=0x5555564cdd60, __p=0x555556bf0580) at /usr/include/c++/9/ext/new_allocator.h:153
 #21 0x0000555555b1bb1d in std::allocator_traits<std::allocator<std::_List_node<std::shared_ptr<objfile> > > >::destroy<std::shared_ptr<objfile> > (__a=..., __p=0x555556bf0580) at /usr/include/c++/9/bits/alloc_traits.h:497
 #22 0x0000555555b1b73e in std::__cxx11::list<std::shared_ptr<objfile>, std::allocator<std::shared_ptr<objfile> > >::_M_erase (this=0x5555564cdd60, __position=std::shared_ptr<objfile> (expired, weak count 1) = {get() = 0x555556515540}) at /usr/include/c++/9/bits/stl_list.h:1921
 #23 0x0000555555b1afeb in std::__cxx11::list<std::shared_ptr<objfile>, std::allocator<std::shared_ptr<objfile> > >::erase (this=0x5555564cdd60, __position=std::shared_ptr<objfile> (expired, weak count 1) = {get() = 0x555556515540}) at /usr/include/c++/9/bits/list.tcc:158
 #24 0x0000555555b19576 in program_space::remove_objfile (this=0x5555564cdd20, objfile=0x555556515540) at src/gdb/progspace.c:210
 #25 0x0000555555ae4502 in objfile::unlink (this=0x555556515540) at src/gdb/objfiles.c:487
 #26 0x0000555555ae5a12 in objfile_purge_solibs () at src/gdb/objfiles.c:875
 #27 0x0000555555c09686 in no_shared_libraries (ignored=0x0, from_tty=1) at src/gdb/solib.c:1236
 #28 0x00005555559e3f5f in detach_command (args=0x0, from_tty=1) at src/gdb/infcmd.c:2769

Note frame #14:

 #14 0x0000555555ae4730 in objfile::~objfile (this=0x555556515540, __in_chrg=<optimized out>) at src/gdb/objfiles.c:573

That's a dtor, thus noexcept.  That's the reason for the
std::terminate.

The previous patch fixed things such that the exception above isn't
thrown anymore.  However, it's possible that e.g., the remote
connection drops just while a user types "nosharedlibrary", or some
other reason that leads to objfile::~objfile, and then we end up the
same std::terminate problem.

Also notice that frames bminor#9-bminor#11 are BFD frames:

 bminor#9  0x0000555555e09e3f in opncls_bclose (abfd=0x555556bc27e0) at src/bfd/opncls.c:599
 bminor#10 0x0000555555e0a2c7 in bfd_close_all_done (abfd=0x555556bc27e0) at src/bfd/opncls.c:847
 bminor#11 0x0000555555e0a27a in bfd_close (abfd=0x555556bc27e0) at src/bfd/opncls.c:814

BFD is written in C and thus throwing exceptions over such frames may
either not clean up properly, or, may abort if bfd is not compiled
with -fasynchronous-unwind-tables (x86-64 defaults that on, but not
all GCC ports do).

Thus frame bminor#8 seems like a good place to swallow exceptions.  More so
since in this spot we already ignore target_fileio_close return
errors.  That's what this commit does.  Without the previous fix, we'd
see:

 (gdb) detach
 Detaching from program: target:/any/program, process 2197701
 Ending remote debugging.
 [Inferior 1 (process 2197701) detached]
 warning: cannot close "target:/lib64/ld-linux-x86-64.so.2": Remote connection closed

Note it prints a warning, which would still be a regression compared
to GDB 10, if it weren't for the previous fix.

gdb/ChangeLog:
yyyy-mm-dd  Pedro Alves  <pedro@palves.net>

	PR gdb/28080
	* gdb_bfd.c (gdb_bfd_close_warning): New.
	(gdb_bfd_iovec_fileio_close): Wrap target_fileio_close in
	try/catch and print warning on exception.
	(gdb_bfd_close_or_warn): Use gdb_bfd_close_warning.

Change-Id: Ic7a26ddba0a4444e3377b0e7c1c89934a84545d7
pipcet pushed a commit that referenced this issue Jul 20, 2021
When the architecture supports memory tagging, we handle
pointer/reference types in a special way, so we can validate tags and
show mismatches.

Unfortunately, the currently implementation errors out when the user
prints non-address values: composite types, floats, references, member
functions and other things.

Vector registers:

 (gdb) p $v0
 Value can't be converted to integer.

Non-existent internal variables:

 (gdb) p $foo
 Value can't be converted to integer.

The same happens for complex types and printing struct/union types.

There are a few problems here.

The first one is that after print_command_1 evaluates the expression
to print, the tag validation code call value_as_address
unconditionally, without making sure we have have a suitable type
where it makes to sense to call it.  That results in value_as_address
(if it isn't given a pointer-like type) trying to treat the value as
an integer and convert it to an address, which #1 - doesn't make sense
(i.e., no sense in validating tags after "print 1"), and throws for
non-integer-convertible types.  We fix this by making sure we have a
pointer or reference type first, and only if so then proceed to check
if the address-like value has tags.

The second is that we're calling value_as_address even if we have an
optimized out or unavailable value, which throws, because the value's
contents aren't fully accessible/readable.  This error currently
escapes out and aborts the print.  This case is fixed by checking for
optimized out / unavailable explicitly.

Third, the tag checking process does not gracefully handle exceptions.
If any exception is thrown from the tag validation code, we abort the
print.  E.g., the target may fail to access tags via a running thread.
Or the needed /proc files aren't available.  Or some other untold
reason.  This is a bit too rigid.  This commit changes print_command_1
to catch errors, print them, and still continue with the normal
expression printing path instead of erroring out and printing nothing
useful.

With this patch, printing works correctly again:

 (gdb) p $v0
 $1 = {d = {f = {2.0546950501119882e-81, 2.0546950501119882e-81}, u = {3399988123389603631, 3399988123389603631}, s = {
       3399988123389603631, 3399988123389603631}}, s = {f = {1.59329203e-10, 1.59329203e-10, 1.59329203e-10, 1.59329203e-10}, u = {
       791621423, 791621423, 791621423, 791621423}, s = {791621423, 791621423, 791621423, 791621423}}, h = {bf = {1.592e-10,
       1.592e-10, 1.592e-10, 1.592e-10, 1.592e-10, 1.592e-10, 1.592e-10, 1.592e-10}, f = {0.11224, 0.11224, 0.11224, 0.11224, 0.11224,
       0.11224, 0.11224, 0.11224}, u = {12079, 12079, 12079, 12079, 12079, 12079, 12079, 12079}, s = {12079, 12079, 12079, 12079,
       12079, 12079, 12079, 12079}}, b = {u = {47 <repeats 16 times>}, s = {47 <repeats 16 times>}}, q = {u = {
       62718710765820030520700417840365121327}, s = {62718710765820030520700417840365121327}}}
 (gdb) p $foo
 $2 = void
 (gdb) p 2 + 2i
 $3 = 2 + 2i

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28110
pipcet pushed a commit that referenced this issue Jul 31, 2021
Simon Marchi tried gdb on OpenBSD, and it immediately segfaults when
running a program.  Simon tracked down the problem to x86_dr_low.get_status
being nullptr at this point:

    (lldb) print x86_dr_low.get_status
    (unsigned long (*)()) $0 = 0x0000000000000000
    (lldb) bt
    * thread #1, stop reason = step over
      * frame #0: 0x0000033b64b764aa gdb`x86_dr_stopped_data_address(state=0x0000033d7162a310, addr_p=0x00007f7ffffc5688) at x86-dregs.c:645:12
        frame #1: 0x0000033b64b766de gdb`x86_dr_stopped_by_watchpoint(state=0x0000033d7162a310) at x86-dregs.c:687:10
        frame #2: 0x0000033b64ea5f72 gdb`x86_stopped_by_watchpoint() at x86-nat.c:206:10
        frame #3: 0x0000033b64637fbb gdb`x86_nat_target<obsd_nat_target>::stopped_by_watchpoint(this=0x0000033b65252820) at x86-nat.h:100:12
        frame #4: 0x0000033b64d3ff11 gdb`target_stopped_by_watchpoint() at target.c:468:46
        frame #5: 0x0000033b6469b001 gdb`watchpoints_triggered(ws=0x00007f7ffffc61c8) at breakpoint.c:4790:32
        frame #6: 0x0000033b64a8bb8b gdb`handle_signal_stop(ecs=0x00007f7ffffc61a0) at infrun.c:6072:29
        frame #7: 0x0000033b64a7e3a7 gdb`handle_inferior_event(ecs=0x00007f7ffffc61a0) at infrun.c:5694:7
        frame bminor#8: 0x0000033b64a7c1a0 gdb`fetch_inferior_event() at infrun.c:4090:5
        frame bminor#9: 0x0000033b64a51921 gdb`inferior_event_handler(event_type=INF_REG_EVENT) at inf-loop.c:41:7
        frame bminor#10: 0x0000033b64a827c9 gdb`infrun_async_inferior_event_handler(data=0x0000000000000000) at infrun.c:9384:3
        frame bminor#11: 0x0000033b6465bd4f gdb`check_async_event_handlers() at async-event.c:335:4
        frame bminor#12: 0x0000033b65070917 gdb`gdb_do_one_event() at event-loop.cc:216:10
        frame #13: 0x0000033b64af0db1 gdb`start_event_loop() at main.c:421:13
        frame #14: 0x0000033b64aefe9a gdb`captured_command_loop() at main.c:481:3
        frame #15: 0x0000033b64aed5c2 gdb`captured_main(data=0x00007f7ffffc6470) at main.c:1353:4
        frame #16: 0x0000033b64aed4f2 gdb`gdb_main(args=0x00007f7ffffc6470) at main.c:1368:7
        frame #17: 0x0000033b6459d787 gdb`main(argc=5, argv=0x00007f7ffffc6518) at gdb.c:32:10
        frame #18: 0x0000033b6459d521 gdb`___start + 321

On BSDs, get_status is set in _initialize_x86_bsd_nat, but only if
HAVE_PT_GETDBREGS is defined.  PT_GETDBREGS doesn't exist on OpenBSD, so
get_status (and the other fields of x86_dr_low) are left as nullptr.

OpenBSD doesn't support getting or setting the x86 debug registers, so
fix by omitting debug register support entirely on OpenBSD:

- Change x86bsd_nat_target to only inherit from x86_nat_target if
  PT_GETDBREGS is supported.

- Don't include x86-nat.o and nat/x86-dregs.o for OpenBSD/amd64.  They
  were already omitted for OpenBSD/i386.
pipcet pushed a commit that referenced this issue Aug 7, 2021
In PR28004 the following warning / Internal error is reported:
...
$ gdb -q -batch \
    -iex "set sysroot $(pwd -P)/repro" \
    ./repro/gdb \
    ./repro/core \
    -ex bt
  ...
 Program terminated with signal SIGABRT, Aborted.
 #0  0x00007ff8fe8e5d22 in raise () from repro/usr/lib/libc.so.6
 [Current thread is 1 (LWP 1762498)]
 #1  0x00007ff8fe8cf862 in abort () from repro/usr/lib/libc.so.6
 warning: (Internal error: pc 0x7ff8feb2c21d in read in psymtab, \
           but not in symtab.)
 warning: (Internal error: pc 0x7ff8feb2c218 in read in psymtab, \
           but not in symtab.)
  ...
 #2  0x00007ff8feb2c21e in __gnu_debug::_Error_formatter::_M_error() const \
   [clone .cold] (warning: (Internal error: pc 0x7ff8feb2c21d in read in \
   psymtab, but not in symtab.)

) from repro/usr/lib/libstdc++.so.6
...

The warning is about the following:
- in find_pc_sect_compunit_symtab we try to find the address
  (0x7ff8feb2c218 / 0x7ff8feb2c21d) in the symtabs.
- that fails, so we try again in the partial symtabs.
- we find a matching partial symtab
- however, the partial symtab has a full symtab, so
  we should have found a matching symtab in the first step.

The addresses are:
...
(gdb) info sym 0x7ff8feb2c218
__gnu_debug::_Error_formatter::_M_error() const [clone .cold] in \
  section .text of repro/usr/lib/libstdc++.so.6
(gdb) info sym 0x7ff8feb2c21d
__gnu_debug::_Error_formatter::_M_error() const [clone .cold] + 5 in \
  section .text of repro/usr/lib/libstdc++.so.6
...
which correspond to unrelocated addresses 0x9c218 and 0x9c21d:
...
$ nm -C  repro/usr/lib/libstdc++.so.6.0.29 | grep 000000000009c218
000000000009c218 t __gnu_debug::_Error_formatter::_M_error() const \
  [clone .cold]
...
which belong to function __gnu_debug::_Error_formatter::_M_error() in
/build/gcc/src/gcc/libstdc++-v3/src/c++11/debug.cc.

The partial symtab that is found for the addresses is instead the one for
/build/gcc/src/gcc/libstdc++-v3/src/c++98/bitmap_allocator.cc, which is
incorrect.

This happens as follows.

The bitmap_allocator.cc CU has DW_AT_ranges at .debug_rnglist offset 0x4b50:
...
    00004b50 0000000000000000 0000000000000056
    00004b5a 00000000000a4790 00000000000a479c
    00004b64 00000000000a47a0 00000000000a47ac
...

When reading the first range 0x0..0x56, it doesn't trigger the "start address
of zero" complaint here:
...
      /* A not-uncommon case of bad debug info.
         Don't pollute the addrmap with bad data.  */
      if (range_beginning + baseaddr == 0
          && !per_objfile->per_bfd->has_section_at_zero)
        {
          complaint (_(".debug_rnglists entry has start address of zero"
                       " [in module %s]"), objfile_name (objfile));
          continue;
        }
...
because baseaddr != 0, which seems incorrect given that when loading the
shared library individually in gdb (and consequently baseaddr == 0), we do see
the complaint.

Consequently, we run into this case in dwarf2_get_pc_bounds:
...
  if (low == 0 && !per_objfile->per_bfd->has_section_at_zero)
    return PC_BOUNDS_INVALID;
...
which then results in this code in process_psymtab_comp_unit_reader being
called with cu_bounds_kind == PC_BOUNDS_INVALID, which sets the set_addrmap
argument to 1:
...
      scan_partial_symbols (first_die, &lowpc, &highpc,
                            cu_bounds_kind <= PC_BOUNDS_INVALID, cu);
...
and consequently, the CU addrmap gets build using address info from the
functions.

During that process, addrmap_set_empty is called with a range that includes
0x9c218 and 0x9c21d:
...
(gdb) p /x start
$7 = 0x9989c
(gdb) p /x end_inclusive
$8 = 0xb200d
...
but it's called for a function at DIE 0x54153 with DW_AT_ranges at 0x40ae:
...
    000040ae 00000000000b1ee0 00000000000b200e
    000040b9 000000000009989c 00000000000998c4
    000040c3 <End of list>
...
and neither range includes 0x9c218 and 0x9c21d.

This is caused by this code in partial_die_info::read:
...
            if (dwarf2_ranges_read (ranges_offset, &lowpc, &highpc, cu,
                                    nullptr, tag))
             has_pc_info = 1;
...
which pretends that the function is located at addresses 0x9989c..0xb200d,
which is indeed not the case.

This patch fixes the first problem encountered: fix the "start address of
zero" complaint warning by removing the baseaddr part from the condition.
Same for dwarf2_ranges_process.

The effect is that:
- the complaint is triggered, and
- the warning / Internal error is no longer triggered.

This does not fix the observed problem in partial_die_info::read, which is
filed as PR28200.

Tested on x86_64-linux.

Co-Authored-By: Simon Marchi <simon.marchi@polymtl.ca>

gdb/ChangeLog:

2021-07-29  Simon Marchi  <simon.marchi@polymtl.ca>
	    Tom de Vries  <tdevries@suse.de>

	PR symtab/28004
	* gdb/dwarf2/read.c (dwarf2_rnglists_process, dwarf2_ranges_process):
	Fix zero address complaint.
	* gdb/testsuite/gdb.dwarf2/dw2-zero-range-shlib.c: New test.
	* gdb/testsuite/gdb.dwarf2/dw2-zero-range.c: New test.
	* gdb/testsuite/gdb.dwarf2/dw2-zero-range.exp: New file.
pipcet pushed a commit that referenced this issue Aug 16, 2021
While working on the testsuite, I ended up noticing that GDB fails to
produce a full backtrace from a thread waiting in pthread_join.  When
selecting the waiting thread and using the 'bt' command, the following
result can be observed:

	(gdb) bt
	#0  0x0000003ff7fccd20 in __futex_abstimed_wait_common64 () from /lib/riscv64-linux-gnu/libpthread.so.0
	#1  0x0000003ff7fc43da in __pthread_clockjoin_ex () from /lib/riscv64-linux-gnu/libpthread.so.0
	Backtrace stopped: frame did not save the PC

On my platform, I do not have debug symbols for glibc, so I need to rely
on prologue analysis in order to unwind stack.

Here is what the function prologue looks like:

	(gdb) disassemble __pthread_clockjoin_ex
	Dump of assembler code for function __pthread_clockjoin_ex:
	   0x0000003ff7fc42de <+0>:     addi    sp,sp,-144
	   0x0000003ff7fc42e0 <+2>:     sd      s5,88(sp)
	   0x0000003ff7fc42e2 <+4>:     auipc   s5,0xd
	   0x0000003ff7fc42e6 <+8>:     ld      s5,-2(s5) # 0x3ff7fd12e0
	   0x0000003ff7fc42ea <+12>:    ld      a5,0(s5)
	   0x0000003ff7fc42ee <+16>:    sd      ra,136(sp)
	   0x0000003ff7fc42f0 <+18>:    sd      s0,128(sp)
	   0x0000003ff7fc42f2 <+20>:    sd      s1,120(sp)
	   0x0000003ff7fc42f4 <+22>:    sd      s2,112(sp)
	   0x0000003ff7fc42f6 <+24>:    sd      s3,104(sp)
	   0x0000003ff7fc42f8 <+26>:    sd      s4,96(sp)
	   0x0000003ff7fc42fa <+28>:    sd      s6,80(sp)
	   0x0000003ff7fc42fc <+30>:    sd      s7,72(sp)
	   0x0000003ff7fc42fe <+32>:    sd      s8,64(sp)
	   0x0000003ff7fc4300 <+34>:    sd      s9,56(sp)
	   0x0000003ff7fc4302 <+36>:    sd      a5,40(sp)

As far as prologue analysis is concerned, the most interesting part is
done at address 0x0000003ff7fc42ee (<+16>): 'sd ra,136(sp)'. This stores
the RA (return address) register on the stack, which is the information
we are looking for in order to identify the caller.

In the current implementation of the prologue scanner, GDB stops when
hitting 0x0000003ff7fc42e6 (<+8>) because it does not know what to do
with the 'ld' instruction.  GDB thinks it reached the end of the
prologue but have not yet reached the important part, which explain
GDB's inability to unwind past this point.

The section of the prologue starting at <+4> until <+12> is used to load
the stack canary[1], which will then be placed on the stack at <+36> at
the end of the prologue.

In order to have the prologue properly handled, this commit proposes to
add support for the ld instruction in the RISC-V prologue scanner.
I guess riscv32 would use lw in such situation so this patch also adds
support for this instruction.

With this patch applied, gdb is now able to unwind past pthread_join:

	(gdb) bt
	#0  0x0000003ff7fccd20 in __futex_abstimed_wait_common64 () from /lib/riscv64-linux-gnu/libpthread.so.0
	#1  0x0000003ff7fc43da in __pthread_clockjoin_ex () from /lib/riscv64-linux-gnu/libpthread.so.0
	#2  0x0000002aaaaaa88e in bar() ()
	#3  0x0000002aaaaaa8c4 in foo() ()
	#4  0x0000002aaaaaa8da in main ()

I have had a look to see if I could reproduce this easily, but in my
simple testcases using '-fstack-protector-all', the canary is loaded
after the RA register is saved.  I do not have a reliable way of
generating a prologue similar to the problematic one so I forged one
instead.

The testsuite have been run on riscv64 ubuntu 21.01 with no regression
observed.

[1] https://en.wikipedia.org/wiki/Buffer_overflow_protection#Canaries
pipcet pushed a commit that referenced this issue Sep 4, 2021
The original reproducer for PR28030 required use of a specific
compiler version - gcc-c++-11.1.1-3.fc34 is mentioned in the PR,
though it seems probable that other gcc versions might also be able to
reproduce the bug as well.  This commit introduces a test case which,
using the DWARF assembler, provides a reproducer which is independent
of the compiler version.  (Well, it'll work with whatever compilers
the DWARF assembler works with.)

To the best of my knowledge, it's also the first test case which uses
the DWARF assembler to provide debug info for a shared object.  That
being the case, I provided more than the usual commentary which should
allow this case to be used as a template when a combo shared
library / DWARF assembler test case is required in the future.

I provide some details regarding the bug in a comment near the
beginning of locexpr-dml.exp.

This problem was difficult to reproduce; I found myself constantly
referring to the backtrace while trying to figure out what (else) I
might be missing while trying to create a reproducer.  Below is a
partial backtrace which I include for posterity.

 #0  internal_error (
    file=0xc50110 "/ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/gdbtypes.c", line=5575,
    fmt=0xc520c0 "Unexpected type field location kind: %d")
    at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdbsupport/errors.cc:51
 #1  0x00000000006ef0c5 in copy_type_recursive (objfile=0x1635930,
    type=0x274c260, copied_types=0x30bb290)
    at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/gdbtypes.c:5575
 #2  0x00000000006ef382 in copy_type_recursive (objfile=0x1635930,
    type=0x274ca10, copied_types=0x30bb290)
    at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/gdbtypes.c:5602
 #3  0x0000000000a7409a in preserve_one_value (value=0x24269f0,
    objfile=0x1635930, copied_types=0x30bb290)
    at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/value.c:2529
 #4  0x000000000072012a in gdbscm_preserve_values (
    extlang=0xc55720 <extension_language_guile>, objfile=0x1635930,
    copied_types=0x30bb290)
    at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/guile/scm-value.c:94
 #5  0x00000000006a3f82 in preserve_ext_lang_values (objfile=0x1635930,
    copied_types=0x30bb290)
    at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/extension.c:568
 #6  0x0000000000a7428d in preserve_values (objfile=0x1635930)
    at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/value.c:2579
 #7  0x000000000082d514 in objfile::~objfile (this=0x1635930,
    __in_chrg=<optimized out>)
    at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/objfiles.c:549
 bminor#8  0x0000000000831cc8 in std::_Sp_counted_ptr<objfile*, (__gnu_cxx::_Lock_policy)2>::_M_dispose (this=0x1654580)
    at /usr/include/c++/11/bits/shared_ptr_base.h:348
 bminor#9  0x00000000004e6617 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x1654580) at /usr/include/c++/11/bits/shared_ptr_base.h:168
 bminor#10 0x00000000004e1d2f in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x190bb88, __in_chrg=<optimized out>)
    at /usr/include/c++/11/bits/shared_ptr_base.h:705
 bminor#11 0x000000000082feee in std::__shared_ptr<objfile, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x190bb80, __in_chrg=<optimized out>)
    at /usr/include/c++/11/bits/shared_ptr_base.h:1154
 bminor#12 0x000000000082ff0a in std::shared_ptr<objfile>::~shared_ptr (
    this=0x190bb80, __in_chrg=<optimized out>)
    at /usr/include/c++/11/bits/shared_ptr.h:122
 #13 0x000000000085ed7e in __gnu_cxx::new_allocator<std::_List_node<std::shared_ptr<objfile> > >::destroy<std::shared_ptr<objfile> > (this=0x114bc00,
    __p=0x190bb80) at /usr/include/c++/11/ext/new_allocator.h:168
 #14 0x000000000085e88d in std::allocator_traits<std::allocator<std::_List_node<std::shared_ptr<objfile> > > >::destroy<std::shared_ptr<objfile> > (__a=...,
    __p=0x190bb80) at /usr/include/c++/11/bits/alloc_traits.h:531
 #15 0x000000000085e50c in std::__cxx11::list<std::shared_ptr<objfile>, std::allocator<std::shared_ptr<objfile> > >::_M_erase (this=0x114bc00, __position=
  std::shared_ptr<objfile> (expired, weak count 1) = {get() = 0x1635930})
    at /usr/include/c++/11/bits/stl_list.h:1925
 #16 0x000000000085df0e in std::__cxx11::list<std::shared_ptr<objfile>, std::allocator<std::shared_ptr<objfile> > >::erase (this=0x114bc00, __position=
  std::shared_ptr<objfile> (expired, weak count 1) = {get() = 0x1635930})
    at /usr/include/c++/11/bits/list.tcc:158
 #17 0x000000000085c748 in program_space::remove_objfile (this=0x114bbc0,
    objfile=0x1635930)
    at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/progspace.c:210
 #18 0x000000000082d3ae in objfile::unlink (this=0x1635930)
    at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/objfiles.c:487
 #19 0x000000000082e68c in objfile_purge_solibs ()
    at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/objfiles.c:875
 #20 0x000000000092dd37 in no_shared_libraries (ignored=0x0, from_tty=1)
    at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/solib.c:1236
 #21 0x00000000009a37fe in target_pre_inferior (from_tty=1)
    at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/target.c:2496
 #22 0x00000000007454d6 in run_command_1 (args=0x0, from_tty=1,
    run_how=RUN_NORMAL)
    at /ironwood1/sourceware-git/f34-pr28030/bld/../../worktree-pr28030/gdb/infcmd.c:437

I'll note a few points regarding this backtrace:

Frame #1 is where the internal error occurs.  It's caused by an
unhandled case for FIELD_LOC_KIND_DWARF_BLOCK.  The fix for this bug
adds support for this case.

Frame #22 - it's a partial backtrace - shows that GDB is attempting to
(re)run the program.  You can see the exact command sequence that was
used for reproducing this problem in the PR (at
https://sourceware.org/bugzilla/show_bug.cgi?id=28030), but in a
nutshell, after starting the program and advancing to the appropriate
source line, GDB was asked to step into libstdc++; a "finish" command
was issued, returning a value.  The fact that a value was returned is
very important.  GDB was then used to step back into libstdc++.  A
breakpoint was set on a source line in the library after which a "run"
command was issued.

Frame #19 shows a call to objfile_purge_solibs.  It's aptly named.

Frame #7 is a call to the destructor for one of the objfile solibs; it
turned out to be the one for libstdc++.

Frames #6 thru #3 show various value preservation frames.  If you look
at preserve_values() in gdb/value.c, the value history is preserved
first, followed by internal variables, followed by values for the
extension languages (python and guile).
pipcet pushed a commit that referenced this issue Oct 12, 2021
…es.exp

When running test-case gdb.base/break-probes.exp on ubuntu 18.04.5, we have:
...
 (gdb) bt^M
 #0  0x00007ffff7dd6e12 in ?? () from /lib64/ld-linux-x86-64.so.2^M
 #1  0x00007ffff7dedf50 in ?? () from /lib64/ld-linux-x86-64.so.2^M
 #2  0x00007ffff7dd5128 in ?? () from /lib64/ld-linux-x86-64.so.2^M
 #3  0x00007ffff7dd4098 in ?? () from /lib64/ld-linux-x86-64.so.2^M
 #4  0x0000000000000001 in ?? ()^M
 #5  0x00007fffffffdaac in ?? ()^M
 #6  0x0000000000000000 in ?? ()^M
 (gdb) FAIL: gdb.base/break-probes.exp: ensure using probes
...

The test-case intends to emit an UNTESTED in this case, but fails to do so
because it tries to do it in a regexp clause in a gdb_test_multiple, which
doesn't trigger.  Instead, a default clause triggers which produces the FAIL.

Also the use of UNTESTED is not appropriate, and we should use UNSUPPORTED
instead.

Fix this by silencing the FAIL, and emitting an UNSUPPORTED after the
gdb_test_multiple:
...
 if { ! $using_probes } {
+    unsupported "probes not present on this system"
     return -1
 }
...

Tested on x86_64-linux.
pipcet pushed a commit that referenced this issue Oct 12, 2021
When running test-case gdb.base/break-probes.exp on ubuntu 18.04.5, we have:
...
 (gdb) run^M
 Starting program: break-probes^M
 Stopped due to shared library event (no libraries added or removed)^M
 (gdb) bt^M
 #0  0x00007ffff7dd6e12 in ?? () from /lib64/ld-linux-x86-64.so.2^M
 #1  0x00007ffff7dedf50 in ?? () from /lib64/ld-linux-x86-64.so.2^M
 #2  0x00007ffff7dd5128 in ?? () from /lib64/ld-linux-x86-64.so.2^M
 #3  0x00007ffff7dd4098 in ?? () from /lib64/ld-linux-x86-64.so.2^M
 #4  0x0000000000000001 in ?? ()^M
 #5  0x00007fffffffdaac in ?? ()^M
 #6  0x0000000000000000 in ?? ()^M
 (gdb) UNSUPPORTED: gdb.base/break-probes.exp: probes not present on this system
...

Using the backtrace, the test-case tries to establish that we're stopped in
dl_main, which is used as proof that we're using probes.

However, the backtrace only shows an address, because:
- the dynamic linker contains no minimal symbols and no debug info, and
- gdb is build without --with-separate-debug-dir so it can't find the
  corresponding .debug file, which does contain the mimimal symbols and
  debug info.

Fix this by instead printing the pc and grepping for the value in the
info probes output:
...
(gdb) p /x $pc^M
$1 = 0x7ffff7dd6e12^M
(gdb) info probes^M
Type Provider Name           Where              Object                      ^M
  ...
stap rtld     init_start     0x00007ffff7dd6e12 /lib64/ld-linux-x86-64.so.2 ^M
  ...
(gdb)
...

Tested on x86_64-linux.
pipcet pushed a commit that referenced this issue Oct 12, 2021
When running test-case gdb.base/break-interp.exp on ubuntu 18.04.5, we have:
...
 (gdb) bt^M
 #0  0x00007eff7ad5ae12 in ?? () from break-interp-LDprelinkNOdebugNO^M
 #1  0x00007eff7ad71f50 in ?? () from break-interp-LDprelinkNOdebugNO^M
 #2  0x00007eff7ad59128 in ?? () from break-interp-LDprelinkNOdebugNO^M
 #3  0x00007eff7ad58098 in ?? () from break-interp-LDprelinkNOdebugNO^M
 #4  0x0000000000000002 in ?? ()^M
 #5  0x00007fff505d7a32 in ?? ()^M
 #6  0x00007fff505d7a94 in ?? ()^M
 #7  0x0000000000000000 in ?? ()^M
 (gdb) FAIL: gdb.base/break-interp.exp: ldprelink=NO: ldsepdebug=NO: \
         first backtrace: dl bt
...

Using the backtrace, the test-case tries to establish that we're stopped in
dl_main.

However, the backtrace only shows an address, because:
- the dynamic linker contains no minimal symbols and no debug info, and
- gdb is build without --with-separate-debug-dir so it can't find the
  corresponding .debug file, which does contain the mimimal symbols and
  debug info.

As in "[gdb/testsuite] Improve probe detection in gdb.base/break-probes.exp",
fix this by doing info probes and grepping for the address.

Tested on x86_64-linux.
pipcet pushed a commit that referenced this issue Oct 12, 2021
I build gdb without xml support using --without-expat, and ran into:
...
(gdb) target remote | vgdb --wait=2 --max-invoke-ms=2500 --pid=22032^M
Remote debugging using | vgdb --wait=2 --max-invoke-ms=2500 --pid=22032^M
relaying data between gdb and process 22032^M
warning: Can not parse XML target description; XML support was disabled at \
  compile time^M
  ...
(gdb) PASS: gdb.base/valgrind-infcall.exp: continue #1
p gdb_test_infcall ()^M
Remote 'g' packet reply is too long (expected 560 bytes, got 800 bytes): ...^M
(gdb) FAIL: gdb.base/valgrind-infcall.exp: p gdb_test_infcall ()
...

After googling the error message with context valgrind gdbserver, I found
indications that the Remote 'g' packet reply error is due to missing xml
support.

And here ( https://www.valgrind.org/docs/manual/manual-core-adv.html ) I
found:
...
GDB version needed for ARM and PPC32/64.

You must use a GDB version which is able to read XML target description sent
by a gdbserver.  This is the standard setup if GDB was configured and built
with the "expat" library.  If your GDB was not configured with XML support, it
will report an error message when using the "target" command.  Debugging will
not work because GDB will then not be able to fetch the registers from the
Valgrind gdbserver.
...

So I guess I'm running into the same problem for x86_64.

Fix this by skipping all gdb.base/valgrind-*.exp tests if xml support is not
available.  Although only the gdb.base/valgrind-infcall*.exp produce fails,
the Remote 'g' packet reply error occurs in all tests, so it seems prudent to
disable them all.

Tested on x86_64-linux.
pipcet pushed a commit that referenced this issue Oct 12, 2021
The gdb.multi/multi-term-settings.exp testcase sometimes fails like so:

 Running /home/pedro/gdb/mygit/src/gdb/testsuite/gdb.multi/multi-term-settings.exp ...
 FAIL: gdb.multi/multi-term-settings.exp: inf1_how=attach: inf2_how=attach: stop with control-c (SIGINT)

It's easier to reproduce if you stress the machine at the same time, like e.g.:

  $ stress -c 24

Looking at gdb.log, we see:

 (gdb) attach 60422
 Attaching to program: build/gdb/testsuite/outputs/gdb.multi/multi-term-settings/multi-term-settings, process 60422
 [New Thread 60422.60422]
 Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...
 Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/libc-2.31.so...
 Reading symbols from /lib64/ld-linux-x86-64.so.2...
 (No debugging symbols found in /lib64/ld-linux-x86-64.so.2)
 0x00007f2fc2485334 in __GI___clock_nanosleep (clock_id=<optimized out>, clock_id@entry <mailto:clock_id@entry>=0, flags=flags@entry <mailto:flags@entry>=0, req=req@entry <mailto:req@entry>=0x7ffe23126940, rem=rem@entry <mailto:rem@entry>=0x0) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:78
 78	../sysdeps/unix/sysv/linux/clock_nanosleep.c: No such file or directory.
 (gdb) PASS: gdb.multi/multi-term-settings.exp: inf1_how=attach: inf2_how=attach: inf2: attach
 set schedule-multiple on
 (gdb) PASS: gdb.multi/multi-term-settings.exp: inf1_how=attach: inf2_how=attach: set schedule-multiple on
 info inferiors
   Num  Description       Connection                         Executable
   1    process 60404     1 (extended-remote localhost:2349) build/gdb/testsuite/outputs/gdb.multi/multi-term-settings/multi-term-settings
 * 2    process 60422     1 (extended-remote localhost:2349) build/gdb/testsuite/outputs/gdb.multi/multi-term-settings/multi-term-settings
 (gdb) PASS: gdb.multi/multi-term-settings.exp: inf1_how=attach: inf2_how=attach: info inferiors
 pid=60422, count=46
 pid=60422, count=47
 pid=60422, count=48
 pid=60422, count=49
 pid=60422, count=50
 pid=60422, count=51
 pid=60422, count=52
 pid=60422, count=53
 pid=60422, count=54
 pid=60422, count=55
 pid=60422, count=56
 pid=60422, count=57
 pid=60422, count=58
 pid=60422, count=59
 pid=60422, count=60
 pid=60422, count=61
 pid=60422, count=62
 pid=60422, count=63
 pid=60422, count=64
 pid=60422, count=65
 pid=60422, count=66
 pid=60422, count=67
 pid=60422, count=68
 pid=60422, count=69
 pid=60404, count=54
 pid=60404, count=55
 pid=60404, count=56
 pid=60404, count=57
 pid=60404, count=58
 PASS: gdb.multi/multi-term-settings.exp: inf1_how=attach: inf2_how=attach: continue
 Quit
 (gdb) FAIL: gdb.multi/multi-term-settings.exp: inf1_how=attach: inf2_how=attach: stop with control-c (SIGINT)

If you look at the testcase's sources, you'll see that the intention
is to resumes the program with "continue", wait to see a few of those
"pid=..., count=..." lines, and then interrupt the program with
Ctrl-C.  But somehow, that resulted in GDB printing "Quit", instead of
the Ctrl-C stopping the program with SIGINT.

Here's what is happening:

 #1 - those "pid=..., count=..." lines we see above weren't actually
      output by the inferior after it has been continued (see #1).
      Note that "inf1_how" and "inf2_how" are "attach".  What happened
      is that those "pid=..., count=..." lines were output by the
      inferiors _before_ they were attached to.  We see them at that
      point instead of earlier, because that's where the testcase
      reads from the inferiors' spawn_ids.

 #2 - The testcase mistakenly thinks those "pid=..., count=..." lines
      happened after the continue was processed by GDB, meaning it has
      waited enough, and so sends the Ctrl-C.  GDB hasn't yet passed
      the terminal to the inferior, so the Ctrl-C results in that
      Quit.

The fix here is twofold:

 #1 - flush inferior output right after attaching

 #2 - consume the "Continuing" printed by "continue", indicating the
      inferior has the terminal.  This is the same as done throughout
      the testsuite to handle this exact problem of sending Ctrl-C too
      soon.

gdb/testsuite/ChangeLog:
yyyy-mm-dd  Pedro Alves  <pedro@palves.net <mailto:pedro@palves.net>>

	* gdb.multi/multi-term-settings.exp (create_inferior): Flush
	inferior output.
	(coretest): Use $gdb_test_name.  After issuing "continue", wait
	for "Continuing".

Change-Id: Iba7671dfe1eee6b98d29cfdb05a1b9aa2f9defb9
pipcet pushed a commit that referenced this issue Dec 23, 2021
On openSUSE Tumbleweed with glibc-debuginfo installed I get:
...
 (gdb) PASS: gdb.threads/linux-dp.exp: continue to breakpoint: thread 5's print
 where^M
 #0  print_philosopher (n=3, left=33 '!', right=33 '!') at linux-dp.c:105^M
 #1  0x0000000000401628 in philosopher (data=0x40537c) at linux-dp.c:148^M
 #2  0x00007ffff7d56b37 in start_thread (arg=<optimized out>) \
                          at pthread_create.c:435^M
 #3  0x00007ffff7ddb640 in clone3 () \
                          at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81^M
 (gdb) PASS: gdb.threads/linux-dp.exp: first thread-specific breakpoint hit
...
while without debuginfo installed I get instead:
...
 (gdb) PASS: gdb.threads/linux-dp.exp: continue to breakpoint: thread 5's print
 where^M
 #0  print_philosopher (n=3, left=33 '!', right=33 '!') at linux-dp.c:105^M
 #1  0x0000000000401628 in philosopher (data=0x40537c) at linux-dp.c:148^M
 #2  0x00007ffff7d56b37 in start_thread () from /lib64/libc.so.6^M
 #3  0x00007ffff7ddb640 in clone3 () from /lib64/libc.so.6^M
 (gdb) FAIL: gdb.threads/linux-dp.exp: first thread-specific breakpoint hit
...

The problem is that the regexp used:
...
  "\(from .*libpthread\|at pthread_create\|in pthread_create\)"
...
expects the 'from' part to match libpthread, but in glibc 2.34 libpthread has
been merged into libc.

Fix this by updating the regexp.

Tested on x86_64-linux.
pipcet pushed a commit that referenced this issue Dec 23, 2021
Currently for a binary compiled normally (without -fsanitize=address) but with
LD_PRELOAD of ASAN one gets:

$ ASAN_OPTIONS=detect_leaks=0:alloc_dealloc_mismatch=1:abort_on_error=1:fast_unwind_on_malloc=0 LD_PRELOAD=/usr/lib64/libasan.so.6 gdb
=================================================================
==1909567==ERROR: AddressSanitizer: alloc-dealloc-mismatch (malloc vs operator delete []) on 0x602000001570
    #0 0x7f1c98e5efa7 in operator delete[](void*) (/usr/lib64/libasan.so.6+0xb0fa7)
...
0x602000001570 is located 0 bytes inside of 2-byte region [0x602000001570,0x602000001572)
allocated by thread T0 here:
    #0 0x7f1c98e5cd1f in __interceptor_malloc (/usr/lib64/libasan.so.6+0xaed1f)
    #1 0x557ee4a42e81 in operator new(unsigned long) (/usr/libexec/gdb+0x74ce81)
SUMMARY: AddressSanitizer: alloc-dealloc-mismatch (/usr/lib64/libasan.so.6+0xb0fa7) in operator delete[](void*)
==1909567==HINT: if you don't care about these errors you may set ASAN_OPTIONS=alloc_dealloc_mismatch=0
==1909567==ABORTING

Despite the code called properly operator new[] and operator delete[].
But GDB's new-op.cc provides its own operator new[] which gets translated into
malloc() (which gets recogized as operatore new(size_t)) but as it does not
translate also operators delete[] Address Sanitizer gets confused.

The question is how many variants of the delete operator need to be provided.
There could be 14 operators new but there are only 4, GDB uses 3 of them.
There could be 16 operators delete but there are only 6, GDB uses 2 of them.
It depends on libraries and compiler which of the operators will get used.
Currently being used:
                 U operator new[](unsigned long)
                 U operator new(unsigned long)
                 U operator new(unsigned long, std::nothrow_t const&)
                 U operator delete[](void*)
                 U operator delete(void*, unsigned long)

Tested on x86_64-linux.
pipcet pushed a commit that referenced this issue Dec 23, 2021
This commit fixes Bug 28308, titled "Strange interactions with
dprintf and break/commands":

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28308

Since creating that bug report, I've found a somewhat simpler way of
reproducing the problem.  I've encapsulated it into the GDB test case
which I've created along with this bug fix.  The name of the new test
is gdb.base/dprintf-execution-x-script.exp, I'll demonstrate the
problem using this test case, though for brevity, I've placed all
relevant files in the same directory and have renamed the files to all
start with 'dp-bug' instead of 'dprintf-execution-x-script'.

The script file, named dp-bug.gdb, consists of the following commands:

dprintf increment, "dprintf in increment(), vi=%d\n", vi
break inc_vi
commands
  continue
end
run

Note that the final command in this script is 'run'.  When 'run' is
instead issued interactively, the  bug does not occur.  So, let's look
at the interactive case first in order to see the correct/expected
output:

$ gdb -q -x dp-bug.gdb dp-bug
... eliding buggy output which I'll discuss later ...
(gdb) run
Starting program: /mesquite2/sourceware-git/f34-master/bld/gdb/tmp/dp-bug
vi=0
dprintf in increment(), vi=0

Breakpoint 2, inc_vi () at dprintf-execution-x-script.c:26
26	in dprintf-execution-x-script.c
vi=1
dprintf in increment(), vi=1

Breakpoint 2, inc_vi () at dprintf-execution-x-script.c:26
26	in dprintf-execution-x-script.c
vi=2
dprintf in increment(), vi=2

Breakpoint 2, inc_vi () at dprintf-execution-x-script.c:26
26	in dprintf-execution-x-script.c
vi=3
[Inferior 1 (process 1539210) exited normally]

In this run, in which 'run' was issued from the gdb prompt (instead
of at the end of the script), there are three dprintf messages along
with three 'Breakpoint 2' messages.  This is the correct output.

Now let's look at the output that I snipped above; this is the output
when 'run' is issued from the script loaded via GDB's -x switch:

$ gdb -q -x dp-bug.gdb dp-bug
Reading symbols from dp-bug...
Dprintf 1 at 0x40116e: file dprintf-execution-x-script.c, line 38.
Breakpoint 2 at 0x40113a: file dprintf-execution-x-script.c, line 26.
vi=0
dprintf in increment(), vi=0

Breakpoint 2, inc_vi () at dprintf-execution-x-script.c:26
26	dprintf-execution-x-script.c: No such file or directory.
vi=1

Breakpoint 2, inc_vi () at dprintf-execution-x-script.c:26
26	in dprintf-execution-x-script.c
vi=2

Breakpoint 2, inc_vi () at dprintf-execution-x-script.c:26
26	in dprintf-execution-x-script.c
vi=3
[Inferior 1 (process 1539175) exited normally]

In the output shown above, only the first dprintf message is printed.
The 2nd and 3rd dprintf messages are missing!  However, all three
'Breakpoint 2...' messages are still printed.

Why does this happen?

bpstat_do_actions_1() in gdb/breakpoint.c contains the following
comment and code near the start of the function:

  /* Avoid endless recursion if a `source' command is contained
     in bs->commands.  */
  if (executing_breakpoint_commands)
    return 0;

  scoped_restore save_executing
    = make_scoped_restore (&executing_breakpoint_commands, 1);

Also, as described by this comment prior to the 'async' field
in 'struct ui' in top.h, the main UI starts off in sync mode
when processing command line arguments:

  /* True if the UI is in async mode, false if in sync mode.  If in
     sync mode, a synchronous execution command (e.g, "next") does not
     return until the command is finished.  If in async mode, then
     running a synchronous command returns right after resuming the
     target.  Waiting for the command's completion is later done on
     the top event loop.  For the main UI, this starts out disabled,
     until all the explicit command line arguments (e.g., `gdb -ex
     "start" -ex "next"') are processed.  */

This combination of things, the state of the static global
'executing_breakpoint_commands' plus the state of the async
field in the main UI causes this behavior.

This is a backtrace after hitting the dprintf breakpoint for
the second time when doing 'run' from the script file, i.e.
non-interactively:

Thread 1 "gdb" hit Breakpoint 3, bpstat_do_actions_1 (bsp=0x7fffffffc2b8)
    at /ironwood1/sourceware-git/f34-master/bld/../../worktree-master/gdb/breakpoint.c:4431
4431	  if (executing_breakpoint_commands)

 #0  bpstat_do_actions_1 (bsp=0x7fffffffc2b8)
     at gdb/breakpoint.c:4431
 #1  0x00000000004d8bc6 in dprintf_after_condition_true (bs=0x1538090)
     at gdb/breakpoint.c:13048
 #2  0x00000000004c5caa in bpstat_stop_status (aspace=0x116dbc0, bp_addr=0x40116e, thread=0x137f450, ws=0x7fffffffc718,
     stop_chain=0x1538090) at gdb/breakpoint.c:5498
 #3  0x0000000000768d98 in handle_signal_stop (ecs=0x7fffffffc6f0)
     at gdb/infrun.c:6172
 #4  0x00000000007678d3 in handle_inferior_event (ecs=0x7fffffffc6f0)
     at gdb/infrun.c:5662
 #5  0x0000000000763cd5 in fetch_inferior_event ()
     at gdb/infrun.c:4060
 #6  0x0000000000746d7d in inferior_event_handler (event_type=INF_REG_EVENT)
     at gdb/inf-loop.c:41
 #7  0x00000000007a702f in handle_target_event (error=0, client_data=0x0)
     at gdb/linux-nat.c:4207
 bminor#8  0x0000000000b8cd6e in gdb_wait_for_event (block=block@entry=0)
     at gdbsupport/event-loop.cc:701
 bminor#9  0x0000000000b8d032 in gdb_wait_for_event (block=0)
     at gdbsupport/event-loop.cc:597
 bminor#10 gdb_do_one_event () at gdbsupport/event-loop.cc:212
 bminor#11 0x00000000009d19b6 in wait_sync_command_done ()
     at gdb/top.c:528
 bminor#12 0x00000000009d1a3f in maybe_wait_sync_command_done (was_sync=0)
     at gdb/top.c:545
 #13 0x00000000009d2033 in execute_command (p=0x7fffffffcb18 "", from_tty=0)
     at gdb/top.c:676
 #14 0x0000000000560d5b in execute_control_command_1 (cmd=0x13b9bb0, from_tty=0)
     at gdb/cli/cli-script.c:547
 #15 0x000000000056134a in execute_control_command (cmd=0x13b9bb0, from_tty=0)
     at gdb/cli/cli-script.c:717
 #16 0x00000000004c3bbe in bpstat_do_actions_1 (bsp=0x137f530)
     at gdb/breakpoint.c:4469
 #17 0x00000000004c3d40 in bpstat_do_actions ()
     at gdb/breakpoint.c:4533
 #18 0x00000000006a473a in command_handler (command=0x1399ad0 "run")
     at gdb/event-top.c:624
 #19 0x00000000009d182e in read_command_file (stream=0x113e540)
     at gdb/top.c:443
 #20 0x0000000000563697 in script_from_file (stream=0x113e540, file=0x13bb0b0 "dp-bug.gdb")
     at gdb/cli/cli-script.c:1642
 #21 0x00000000006abd63 in source_gdb_script (extlang=0xc44e80 <extension_language_gdb>, stream=0x113e540,
     file=0x13bb0b0 "dp-bug.gdb") at gdb/extension.c:188
 #22 0x0000000000544400 in source_script_from_stream (stream=0x113e540, file=0x7fffffffd91a "dp-bug.gdb",
     file_to_open=0x13bb0b0 "dp-bug.gdb")
     at gdb/cli/cli-cmds.c:692
 #23 0x0000000000544557 in source_script_with_search (file=0x7fffffffd91a "dp-bug.gdb", from_tty=1, search_path=0)
     at gdb/cli/cli-cmds.c:750
 #24 0x00000000005445cf in source_script (file=0x7fffffffd91a "dp-bug.gdb", from_tty=1)
     at gdb/cli/cli-cmds.c:759
 #25 0x00000000007cf6d9 in catch_command_errors (command=0x5445aa <source_script(char const*, int)>,
     arg=0x7fffffffd91a "dp-bug.gdb", from_tty=1, do_bp_actions=false)
     at gdb/main.c:523
 #26 0x00000000007cf85d in execute_cmdargs (cmdarg_vec=0x7fffffffd1b0, file_type=CMDARG_FILE, cmd_type=CMDARG_COMMAND,
     ret=0x7fffffffd18c) at gdb/main.c:615
 #27 0x00000000007d0c8e in captured_main_1 (context=0x7fffffffd3f0)
     at gdb/main.c:1322
 #28 0x00000000007d0eba in captured_main (data=0x7fffffffd3f0)
     at gdb/main.c:1343
 #29 0x00000000007d0f25 in gdb_main (args=0x7fffffffd3f0)
     at gdb/main.c:1368
 #30 0x00000000004186dd in main (argc=5, argv=0x7fffffffd508)
     at gdb/gdb.c:32

There are two frames for bpstat_do_actions_1(), one at frame #16 and
the other at frame #0.  The one at frame #16 is processing the actions
for Breakpoint 2, which is a 'continue'.  The one at frame #0 is attempting
to process the dprintf breakpoint action.  However, at this point,
the value of 'executing_breakpoint_commands' is 1, forcing an early
return, i.e. prior to executing the command(s) associated with the dprintf
breakpoint.

For the sake of comparison, this is what the stack looks like when hitting
the dprintf breakpoint for the second time when issuing the 'run'
command from the GDB prompt.

Thread 1 "gdb" hit Breakpoint 3, bpstat_do_actions_1 (bsp=0x7fffffffccd8)
    at /ironwood1/sourceware-git/f34-master/bld/../../worktree-master/gdb/breakpoint.c:4431
4431	  if (executing_breakpoint_commands)

 #0  bpstat_do_actions_1 (bsp=0x7fffffffccd8)
     at gdb/breakpoint.c:4431
 #1  0x00000000004d8bc6 in dprintf_after_condition_true (bs=0x16b0290)
     at gdb/breakpoint.c:13048
 #2  0x00000000004c5caa in bpstat_stop_status (aspace=0x116dbc0, bp_addr=0x40116e, thread=0x13f0e60, ws=0x7fffffffd138,
     stop_chain=0x16b0290) at gdb/breakpoint.c:5498
 #3  0x0000000000768d98 in handle_signal_stop (ecs=0x7fffffffd110)
     at gdb/infrun.c:6172
 #4  0x00000000007678d3 in handle_inferior_event (ecs=0x7fffffffd110)
     at gdb/infrun.c:5662
 #5  0x0000000000763cd5 in fetch_inferior_event ()
     at gdb/infrun.c:4060
 #6  0x0000000000746d7d in inferior_event_handler (event_type=INF_REG_EVENT)
     at gdb/inf-loop.c:41
 #7  0x00000000007a702f in handle_target_event (error=0, client_data=0x0)
     at gdb/linux-nat.c:4207
 bminor#8  0x0000000000b8cd6e in gdb_wait_for_event (block=block@entry=0)
     at gdbsupport/event-loop.cc:701
 bminor#9  0x0000000000b8d032 in gdb_wait_for_event (block=0)
     at gdbsupport/event-loop.cc:597
 bminor#10 gdb_do_one_event () at gdbsupport/event-loop.cc:212
 bminor#11 0x00000000007cf512 in start_event_loop ()
     at gdb/main.c:421
 bminor#12 0x00000000007cf631 in captured_command_loop ()
     at gdb/main.c:481
 #13 0x00000000007d0ebf in captured_main (data=0x7fffffffd3f0)
     at gdb/main.c:1353
 #14 0x00000000007d0f25 in gdb_main (args=0x7fffffffd3f0)
     at gdb/main.c:1368
 #15 0x00000000004186dd in main (argc=5, argv=0x7fffffffd508)
     at gdb/gdb.c:32

This relatively short backtrace is due to the current UI's async field
being set to 1.

Yet another thing to be aware of regarding this problem is the
difference in the way that commands associated to dprintf breakpoints
versus regular breakpoints are handled.  While they both use a command
list associated with the breakpoint, regular breakpoints will place
the commands to be run on the bpstat chain constructed in
bp_stop_status().  These commands are run later on.  For dprintf
breakpoints, commands are run via the 'after_condition_true' function
pointer directly from bpstat_stop_status().  (The 'commands' field in
the bpstat is cleared in dprintf_after_condition_true().  This
prevents the dprintf commands from being run again later on when other
commands on the bpstat chain are processed.)

Another thing that I noticed is that dprintf breakpoints are the only
type of breakpoint which use 'after_condition_true'.  This suggests
that one possible way of fixing this problem, that of making dprintf
breakpoints work more like regular breakpoints, probably won't work.
(I must admit, however, that my understanding of this code isn't
complete enough to say why.  I'll trust that whoever implemented it
had a good reason for doing it this way.)

The comment referenced earlier regarding 'executing_breakpoint_commands'
states that the reason for checking this variable is to avoid
potential endless recursion when a 'source' command appears in
bs->commands.  We know that a dprintf command is constrained to either
1) execution of a GDB printf command, 2) an inferior function call of
a printf-like function, or 3) execution of an agent-printf command.
Therefore, infinite recursion due to a 'source' command cannot happen
when executing commands upon hitting a dprintf breakpoint.

I chose to fix this problem by having dprintf_after_condition_true()
directly call execute_control_commands().  This means that it no
longer attempts to go through bpstat_do_actions_1() avoiding the
infinite recursion check for potential 'source' commands on the
command chain.  I think it simplifies this code a little bit too, a
definite bonus.

Summary:

	* breakpoint.c (dprintf_after_condition_true): Don't call
	bpstat_do_actions_1().  Call execute_control_commands()
	instead.
pipcet pushed a commit that referenced this issue Dec 23, 2021
On Fedora 35,

$ readelf -d /usr/bin/npc

caused readelf to run out of stack since load_separate_debug_info
returned the input main file as the separate debug info:

(gdb) bt
 #0  load_separate_debug_info (
    main_filename=main_filename@entry=0x510f50 "/export/home/hjl/.cache/debuginfod_client/dcc33c51c49e7dafc178fdb5cf8bd8946f965295/debuginfo",
    xlink=xlink@entry=0x4e5180 <debug_displays+4480>,
    parse_func=parse_func@entry=0x431550 <parse_gnu_debuglink>,
    check_func=check_func@entry=0x432ae0 <check_gnu_debuglink>,
    func_data=func_data@entry=0x7fffffffdb60, file=file@entry=0x51d430)
    at /export/gnu/import/git/sources/binutils-gdb/binutils/dwarf.c:11057
 #1  0x000000000043328d in check_for_and_load_links (file=0x51d430,
    filename=0x510f50 "/export/home/hjl/.cache/debuginfod_client/dcc33c51c49e7dafc178fdb5cf8bd8946f965295/debuginfo")
    at /export/gnu/import/git/sources/binutils-gdb/binutils/dwarf.c:11381
 #2  0x00000000004332ae in check_for_and_load_links (file=0x51b070,
    filename=0x518dd0 "/export/home/hjl/.cache/debuginfod_client/dcc33c51c49e7dafc178fdb5cf8bd8946f965295/debuginfo")

Return NULL if the separate debug info is the same as the input main
file to avoid infinite recursion.

	PR binutils/28679
	* dwarf.c (load_separate_debug_info): Don't return the input
	main file.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant