Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make trapped_instruction_at() explicitly x86-specific #3847

Merged
merged 1 commit into from
Oct 16, 2024

Conversation

pcc
Copy link
Contributor

@pcc pcc commented Oct 11, 2024

There were some code paths where trapped_instruction_at() was callable on non-x86 platforms; as a result it may have been possible to reach x86-specific handling of trapped instructions in cases where the instruction bytes happened to match. Fix it by adding an architecture check to said code paths, renaming the function to x86_trapped_instruction_at() (likewise for trapped_instruction_len() and TrappedInstruction) and adding an assertion that checks the architecture.

@@ -497,7 +497,7 @@ bool ReplaySession::handle_unrecorded_cpuid_fault(
ReplayTask* t, const StepConstraints& constraints) {
if (t->stop_sig() != SIGSEGV || !has_cpuid_faulting() ||
trace_in.uses_cpuid_faulting() ||
trapped_instruction_at(t, t->ip()) != TrappedInstruction::CPUID) {
x86_trapped_instruction_at(t, t->ip()) != X86TrappedInstruction::CPUID) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear what ensures we're x86-only at this point.

I think a slightly better approach would be to keep trapped_instruction_at and the TrappedInstruction enum named as is, but rename the TrappedInstruction enum values with X86_ prefixes, and have trapped_instruction_at check is_x86ish(t->arch()) before returning any of those values. How does that sound?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!has_cpuid_faulting() will be true on non-x86 because CPUID is x86-specific, so we will only reach here on x86.

I considered something like your suggested approach, but the arm64-specific trapped instructions that I plan to introduce for timer trapping (specifically mrs xN, cntvct_el0 etc.) take a register operand unlike the x86 instructions, so we'll need some way to return that back, and I thought that it would be best to do that via a separate function.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW has_cpuid_faulting() will actually die if ARCH_SET_CPUID ever returns success on a non-x86 system... ideally we wouldn't depend on that. So let's not: 8deb76b

It's still a little obscure at this call site to require that readers know has_cpuid_faulting() implies x86. It's easy to imagine that might not be the case if we sooner or later support similar functionality on other architectures.

the arm64-specific trapped instructions that I plan to introduce for timer trapping (specifically mrs xN, cntvct_el0 etc.) take a register operand unlike the x86 instructions, so we'll need some way to return that back, and I thought that it would be best to do that via a separate function.

Why not use this function? We can change it to return a struct containing the instruction type and an output register field.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's still a little obscure at this call site to require that readers know has_cpuid_faulting() implies x86. It's easy to imagine that might not be the case if we sooner or later support similar functionality on other architectures.

Yeah, the current state is not ideal but if we add CPUID trapping support for other architectures this part of the code would need to be changed anyway, so I didn't think that would be a significant problem.

Why not use this function? We can change it to return a struct containing the instruction type and an output register field.

That would be fine with me but I thought it would be slightly clearer to use the type system to express which information is available on each architecture (e.g. X86TrappedInstruction is an enum, ARMTrappedInstruction is an (enum, regno) pair) rather than having a product type for everything on every architecture. That being said it's probably not a big deal as long as we only support two architectures.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be fine with me

Let's do that then.

but I thought it would be slightly clearer to use the type system to express which information is available on each architecture (e.g. X86TrappedInstruction is an enum, ARMTrappedInstruction is an (enum, regno) pair) rather than having a product type for everything on every architecture.

There isn't really any fundamental reason why x86 trappable instructions (which is really a misnomer; e.g. PUSHF isn't trappable, these are just instructions that are important to us) don't have an explicit output register operand. That's just an accident, more or less. Intel likes to use DX:AX for special instructions but they don't always. I think it makes total sense to have a return value struct that represents the opcode and any decoded output operand. And we should probably call it DisassembledSpecialInstruction or something like that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

There were some code paths where trapped_instruction_at() was callable
on non-x86 platforms; as a result it may have been possible to reach
x86-specific handling of trapped instructions in cases where the
instruction bytes happened to match.

Moreover we plan to introduce ARM-specific special instruction handling as
part of the fix for rr-debugger#3740. The ARM special instructions take a register
operand unlike the x86 instructions so this will need to return more
than just an enum.

And as pointed out, TrappedInstruction is used not only for instructions
that we trap on but also instructions that we handle specially such
as PUSHF.

Therefore, add an architecture check to trapped_instruction_at()
(now special_instruction_at()), make it return a struct (for now
just containing the opcode enum), rename the existing enumerators to
be prefixed with X86_ and, while we're here, replace "trapped" with
"special".
@rocallahan rocallahan merged commit bf20849 into rr-debugger:master Oct 16, 2024
5 checks passed
@rocallahan
Copy link
Collaborator

Thanks!!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants