ACP: `core::arch::breakpoint` #491

joshtriplett · 2024-11-20T14:21:57Z

Proposal

Problem statement

Sometimes, for debugging, users want to have a software breakpoint instruction to use with their debugger, or to generate a core dump for subsequent analysis.

core::intrinsics::breakpoint() exists, but intrinsics are perma-unstable.

Users can manually emit a breakpoint instruction using inline assembly, such as core::arch::asm!("int3") on x86, or core::arch::asm!("brk #0xf000") on AArch64. However, this isn't portable.
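
To illustrate the portability problem, a minimal sketch of the manual approach might look like the following. The helper name `manual_breakpoint` and the abort fallback are assumptions made for this illustration, not part of the proposal.

```rust
// Hypothetical helper: one `cfg` arm per architecture, each with its own
// hand-written instruction. Calling it traps, so only call it under a debugger.

#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[inline(always)]
pub fn manual_breakpoint() {
    // x86 software breakpoint.
    unsafe { core::arch::asm!("int3") };
}

#[cfg(target_arch = "aarch64")]
#[inline(always)]
pub fn manual_breakpoint() {
    // AArch64 breakpoint, using the immediate mentioned above.
    unsafe { core::arch::asm!("brk #0xf000") };
}

#[cfg(not(any(
    target_arch = "x86",
    target_arch = "x86_64",
    target_arch = "aarch64"
)))]
#[inline(always)]
pub fn manual_breakpoint() {
    // Assumed fallback for this sketch only: no known instruction, so abort.
    std::process::abort();
}
```

Every additional architecture needs another arm like these, which is the maintenance burden a single function in core::arch would remove.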

Solution sketch

In core::arch:

/// Compiles to a target-specific software breakpoint instruction or equivalent.
///
/// This will typically abort the program. It may result in a core dump, and/or the system logging
/// debug information. Additional target-specific capabilities may be possible depending on
/// debuggers or other tooling; in particular, a debugger may be able to resume execution.
///
/// If possible, this will produce an instruction sequence that allows a debugger to resume *after*
/// the breakpoint, rather than resuming *at* the breakpoint; however, the exact behavior is
/// target-specific and debugger-specific, and not guaranteed.
///
/// If the target platform does not have any kind of debug breakpoint instruction, this may compile
/// to a trapping instruction (e.g. an undefined instruction) instead, or to some other form of
/// target-specific abort that may or may not support convenient resumption.
///
/// The precise behavior and the precise instruction generated are not guaranteed, except that in
/// normal execution with no debug tooling involved this will not continue executing.
///
/// - On x86 targets, this produces an `int3` instruction.
/// - On aarch64 targets, this produces a `brk #0xf000` instruction.
#[inline(always)]
pub fn breakpoint() {
    unsafe {
        core::intrinsics::breakpoint();
    }
}

Note that this should not be noreturn (-> !), because on some targets and environments, the user may be able to continue execution from the breakpoint in a debugger.
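
A small sketch of what the return type means for callers. The functions here (`run_with_breakpoint`, `run_with_diverging`, `resumed_breakpoint`) are hypothetical stand-ins, not proposed API; `resumed_breakpoint` is a no-op that models a debugger continuing past the trap.

```rust
/// Stand-in with the proposed signature `fn()`: the caller may get control back.
pub fn run_with_breakpoint(breakpoint: fn(), value: u32) -> u32 {
    breakpoint();
    // Still reachable: if a debugger resumes execution, this line runs.
    value + 1
}

/// With a diverging signature `fn() -> !`, the compiler treats everything
/// after the call as unreachable, so there is nothing to resume into.
pub fn run_with_diverging(trap: fn() -> !) -> u32 {
    trap()
}

/// Models a breakpoint that a debugger has stepped past.
pub fn resumed_breakpoint() {}
```

With `fn()`, the code following the call is compiled and kept, which is what lets a debugger user continue.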

Links and related work

The unbug crate provides macros that emit breakpoints (e.g. for assertions), but it depends on nightly Rust.

What happens now?

This issue contains an API change proposal (or ACP) and is part of the libs-api team feature lifecycle. Once this issue is filed, the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.

Possible responses

The libs team may respond in various different ways. First, the team will consider the problem (this doesn't require any concrete solution or alternatives to have been proposed):

- We think this problem seems worth solving, and the standard library might be the right place to solve it.
- We think that this probably doesn't belong in the standard library.

Second, if there's a concrete solution:

- We think this specific solution looks roughly right, approved, you or someone else should implement this. (Further review will still happen on the subsequent implementation PR.)
- We're not sure this is the right solution, and the alternatives or other materials don't give us enough information to be sure about that. Here are some questions we have that aren't answered, or rough ideas about alternatives we'd want to see discussed.

programmerjake · 2024-11-20T20:10:36Z

what happens if an ISA doesn't have a breakpoint instruction? (e.g. wasm iirc)

Amanieu · 2024-11-20T21:46:06Z

I'm concerned about portability: this works on x86 because, after handling an int3, the program counter will end up pointing to the instruction after the breakpoint. However this is not the case for other architectures, for example an AArch64 BRK will keep the program counter pointing at the BRK after it is handled, and you have to manually skip the instruction.

joshtriplett · 2024-11-20T22:13:49Z

@Amanieu wrote:

I'm concerned about portability: this works on x86 because, after handling an int3, the program counter will end up pointing to the instruction after the breakpoint. However this is not the case for other architectures, for example an AArch64 BRK will keep the program counter pointing at the BRK after it is handled, and you have to manually skip the instruction.

This is entirely the problem of the debugger to deal with; it doesn't make the Rust program non-portable. (Also, I've recently learned that one standard workaround for that in x86 is to use int3; nop.)

joshtriplett · 2024-11-20T22:25:01Z

@programmerjake wrote:

what happens if an ISA doesn't have a breakpoint instruction? (e.g. wasm iirc)

WebAssembly appears to implement the LLVM llvm.debugtrap intrinsic, and maps it to the instruction unreachable.

More generally: I would expect that there's always some way to trap, or failing that to abort, and that worst case it'll map to whatever unreachable! or assert! uses to bail out. If an architecture truly didn't have anything for that, the Rust target for it would have bigger problems.

Amanieu · 2024-11-20T23:39:59Z

@joshtriplett I think you misunderstand. If you try to continue after a breakpoint instruction:

- on x86 this will continue execution after the int3 instruction, as you expect.
- on other architectures this will resume execution before the breakpoint instruction, which will simply trigger the breakpoint again. You have to manually modify the PC in the debugger to skip past the instruction.
  - Effectively, the breakpoint instruction is treated like any other faulting instruction, such as a faulting load.

Given that this only produces the expected behavior on x86, I don't think we can reasonably expose this as a platform-independent intrinsic.

joshtriplett · 2024-11-21T00:20:38Z

You have to manually modify the PC in the debugger to skip past the instruction.

I understood that; my point is, dealing with that kind of variation is the job of a debugger. Some debuggers do recognize, for instance, the specific aarch64 brk produced by LLVM's __builtin_debugtrap() and automatically skip over it when hit.

Given that this only produces the expected behavior on x86, I don't think we can reasonably expose this as a platform-independent intrinsic.

LLVM and C++ both have a platform-independent intrinsic for this.

The platform-independent behavior is "this will trap, stopping execution; it may result in a core dump; a debugger may treat this as a breakpoint".

BrainBacon · 2024-11-21T00:35:20Z

From my experiments in the Unbug crate running in VSCode I've noticed that brk #1 is insufficient on Apple silicon. That resulted in getting stuck on the breakpoint. However, it looks like __builtin_debugtrap() in LLVM uses brk #0xF000 which will allow the debugger to continue. I've also noticed that a similar nop trick was necessary to get the debugger to land on the correct statement, in my case brk #0xF000 \n nop. The newline (not a semicolon) was necessary using core::arch::asm!.
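
The two-instruction sequence described here can also be written without embedding a newline, by passing each instruction as its own template string; asm! joins multiple template strings with newlines. A sketch, where `breakpoint_with_nop` is a hypothetical name, the x86 arm applies the int3; nop workaround mentioned earlier in the thread, and the final arm is an assumed fallback:

```rust
#[cfg(target_arch = "aarch64")]
#[inline(always)]
pub fn breakpoint_with_nop() {
    // Two template strings become two lines of assembly: `brk`, then `nop`.
    unsafe { core::arch::asm!("brk #0xF000", "nop") };
}

#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[inline(always)]
pub fn breakpoint_with_nop() {
    // Same idea on x86: breakpoint followed by a padding `nop`.
    unsafe { core::arch::asm!("int3", "nop") };
}

#[cfg(not(any(
    target_arch = "aarch64",
    target_arch = "x86",
    target_arch = "x86_64"
)))]
#[inline(always)]
pub fn breakpoint_with_nop() {
    // Assumed fallback for this sketch only.
    std::process::abort();
}
```

Calling the function traps, so it is only useful with a debugger attached.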

programmerjake · 2024-11-21T01:18:08Z

@programmerjake wrote:

what happens if an ISA doesn't have a breakpoint instruction? (e.g. wasm iirc)

WebAssembly appears to implement the LLVM llvm.debugtrap intrinsic, and maps it to the instruction unreachable.

ok, I had assumed unreachable wasn't usable since I don't expect a debugger to be able to continue after hitting it...

More generally: I would expect that there's always some way to trap, or failing that to abort,

Yeah I assumed the desired semantics were that a debugger would always be able to continue after hitting the breakpoint, however lowering it to an abort-like thing means that isn't really possible.

joshtriplett · 2024-11-21T04:57:35Z

Yeah I assumed the desired semantics were that a debugger would always be able to continue after hitting the breakpoint, however lowering it to an abort-like thing means that isn't really possible.

The desired semantics are that it traps, in a target-specific way, which might dump core and might be possible to continue if you have a debugger attached, but the details will be target-specific and debugger-specific.

Amanieu · 2024-11-22T10:27:40Z

The desired semantics are that it traps, in a target-specific way, which might dump core and might be possible to continue if you have a debugger attached, but the details will be target-specific and debugger-specific.

You can achieve this portably (at least on UNIX), in a way that supports resuming execution in the debugger, by calling raise(SIGTRAP).
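
For reference, a rough sketch of the raise(SIGTRAP) approach on Unix. It declares the C functions by hand rather than using the libc crate, and it assumes SIGTRAP is signal number 5 and SIG_IGN is the address 1 (true on Linux and macOS as far as I know; check your platform's headers). The names `sigtrap`, `raise_sigtrap` and `ignore_sigtrap` are made up for this example.

```rust
#[cfg(unix)]
pub mod sigtrap {
    use std::os::raw::c_int;

    /// Assumption: SIGTRAP is 5 on the target platform.
    pub const SIGTRAP: c_int = 5;
    /// Assumption: SIG_IGN is represented as the address 1.
    pub const SIG_IGN: usize = 1;

    // Hand-written declarations of the C library functions.
    unsafe extern "C" {
        fn raise(sig: c_int) -> c_int;
        fn signal(signum: c_int, handler: usize) -> usize;
    }

    /// Sends SIGTRAP to the calling thread; returns true if raise() succeeded.
    /// With the default disposition and no debugger, this ends the process.
    pub fn raise_sigtrap() -> bool {
        unsafe { raise(SIGTRAP) == 0 }
    }

    /// Ignores SIGTRAP process-wide, so raise_sigtrap() has no visible effect.
    pub fn ignore_sigtrap() {
        unsafe {
            signal(SIGTRAP, SIG_IGN);
        }
    }
}
```

Under a debugger, `sigtrap::raise_sigtrap()` stops at the call and can be continued. The sketch links against the C library, so it is not available to code that uses only core.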

joshtriplett · 2024-11-25T22:14:23Z

@Amanieu That's not portable, though, and you can't do it using exclusively core.

Amanieu · 2024-11-26T20:04:55Z

Accepted pending the documentation changes that were discussed in the meeting.

joshtriplett · 2024-11-26T20:14:41Z

I've updated the documentation.

Approved in [ACP 491](rust-lang/libs-team#491). Remove the `unsafe` on `core::intrinsics::breakpoint()`, since it's a safe intrinsic to call and has no prerequisites.
