Divergence with tcmalloc on arm64 #3740

Closed
pcc opened this issue Apr 26, 2024 · 2 comments · Fixed by #3852

Comments

@pcc
Contributor

pcc commented Apr 26, 2024

I'm seeing the following divergence while replaying a program that uses tcmalloc on arm64:

[FATAL src/ReplaySession.cc:1226:check_ticks_consistency()]
 (task 2944657 (rec:2944634) at time 424)
 -> Assertion `ticks_now == trace_ticks' failed to hold. ticks mismatch for 'SIGNAL: SIGSEGV(det)'; expected 10014507, got 10014509

I suspect this is caused by accesses to CNTVCT_EL0 in the tcmalloc code. Unfortunately, the kernel does not support trapping accesses to the counter register on arm64:

prctl(PR_SET_TSC, PR_TSC_SIGSEGV)       = -1 EINVAL (Invalid argument)

It would be possible for the kernel to configure the CPU to trap on this access by clearing CNTKCTL_EL1.EL0VCTEN.
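
The prctl probe quoted above can be reproduced outside of strace with a few lines of C++. This is only a minimal sketch: the struct and function names (TscTrapProbe, probe_tsc_trapping) are hypothetical and not part of rr; only prctl(), PR_SET_TSC, PR_TSC_SIGSEGV and PR_TSC_ENABLE come from the Linux headers.

```cpp
#include <cassert>
#include <cerrno>
#include <sys/prctl.h>

// Hypothetical result type for this sketch (not an rr type).
struct TscTrapProbe {
  bool supported;  // true if the kernel accepted PR_TSC_SIGSEGV
  int error;       // errno from the failed prctl call, 0 on success
};

// Ask the kernel to deliver SIGSEGV on counter reads, then restore the
// default right away so the calling process keeps working normally.
TscTrapProbe probe_tsc_trapping() {
  if (prctl(PR_SET_TSC, PR_TSC_SIGSEGV, 0, 0, 0) != 0) {
    return {false, errno};
  }
  int rc = prctl(PR_SET_TSC, PR_TSC_ENABLE, 0, 0, 0);
  assert(rc == 0);  // restoring the default is expected to succeed
  (void)rc;
  return {true, 0};
}
```

On a kernel that behaves like the one in this report, the probe would come back with supported == false and error == EINVAL, which is the same result the strace line shows.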

@pcc
Contributor Author

pcc commented Apr 27, 2024

(I also confirmed that it's CNTVCT_EL0 -- if I nop out the MRS instruction in the binary I can no longer reproduce the divergence.)
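
For readers unfamiliar with the instruction in question, the sketch below shows the kind of user-space counter read being described, and why it can disturb a replay. It is an illustration under assumptions: read_virtual_counter() and sample_due() are hypothetical names, and sample_due() is an invented consumer, not tcmalloc's actual logic. The point is only that any branch depending on the counter value may go a different way during replay, which would change the tick count.

```cpp
#include <cassert>
#include <cstdint>

// Reads the arm64 virtual counter directly from user space.
// On other architectures this sketch just returns 0 as a placeholder.
uint64_t read_virtual_counter() {
#if defined(__aarch64__)
  uint64_t value;
  asm volatile("mrs %0, cntvct_el0" : "=r"(value));
  return value;
#else
  return 0;
#endif
}

// Hypothetical consumer: a decision that depends on elapsed counter ticks.
// If the counter value is not recorded and replayed, this branch can differ
// between recording and replay.
bool sample_due(uint64_t now, uint64_t last, uint64_t interval) {
  assert(interval > 0);
  return now - last >= interval;
}
```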

pcc added a commit to pcc/rr that referenced this issue Oct 15, 2024
There were some code paths where trapped_instruction_at() was callable
on non-x86 platforms; as a result it may have been possible to reach
x86-specific handling of trapped instructions in cases where the
instruction bytes happened to match.

Moreover we plan to introduce ARM-specific special instruction handling as
part of the fix for rr-debugger#3740. The ARM special instructions take a register
operand unlike the x86 instructions so this will need to return more
than just an enum.

And as pointed out, TrappedInstruction is used not only for instructions
that we trap on but also instructions that we handle specially such
as PUSHF.

Therefore, add an architecture check to trapped_instruction_at()
(now special_instruction_at()), make it return a struct (for now
just containing the opcode enum), rename the existing enumerators to
be prefixed with X86_ and, while we're here, replace "trapped" with
"special".
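
The shape described in this commit message (an architecture check, plus a struct that can carry a register operand alongside the opcode) might look roughly like the sketch below. All names here (Arch, SketchOpcode, SketchSpecialInstruction, sketch_special_instruction_at) are hypothetical and chosen so they are not mistaken for rr's real declarations; the reg field is the assumed extension for ARM, since the commit says the struct contains only the opcode enum for now. The encodings used are 0F 31 for x86 RDTSC and 0xd53be040 | Rt for MRS Xt, CNTVCT_EL0.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

enum class Arch { X86_64, AArch64 };

enum class SketchOpcode { None, X86_RDTSC, ARM_MRS_CNTVCT_EL0 };

// Hypothetical result type: opcode plus a register operand for ARM.
struct SketchSpecialInstruction {
  SketchOpcode opcode = SketchOpcode::None;
  unsigned reg = 0;  // destination register number Rt, ARM only
};

SketchSpecialInstruction sketch_special_instruction_at(Arch arch,
                                                       const uint8_t* bytes,
                                                       std::size_t len) {
  assert(bytes != nullptr || len == 0);
  SketchSpecialInstruction result;
  if (arch == Arch::X86_64) {
    // The architecture check keeps x86 byte patterns from matching ARM code.
    if (len >= 2 && bytes[0] == 0x0f && bytes[1] == 0x31) {
      result.opcode = SketchOpcode::X86_RDTSC;
    }
    return result;
  }
  if (len >= 4) {
    // AArch64 instructions are 32-bit little-endian words.
    uint32_t insn = static_cast<uint32_t>(bytes[0]) |
                    (static_cast<uint32_t>(bytes[1]) << 8) |
                    (static_cast<uint32_t>(bytes[2]) << 16) |
                    (static_cast<uint32_t>(bytes[3]) << 24);
    if ((insn & ~0x1fu) == 0xd53be040u) {
      result.opcode = SketchOpcode::ARM_MRS_CNTVCT_EL0;
      result.reg = insn & 0x1fu;  // low five bits select Rt
    }
  }
  return result;
}
```

Returning the register number is what an enum alone cannot express: the same MRS opcode can target any general-purpose register, so the caller needs to know which one to fill in.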
rocallahan pushed a commit that referenced this issue Oct 16, 2024
pcc added a commit to pcc/rr that referenced this issue Oct 17, 2024
This works using the recently added support for prctl(PR_SET_TSC) on
arm64 which is due to be released in kernel version 6.12.

Fixes rr-debugger#3740
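
Once the kernel can deliver a signal on counter reads, a tracer could in principle emulate the trapped instruction itself. The sketch below is an assumption about what such emulation involves, not a description of the actual rr change: SketchRegs and emulate_cntvct_read() are hypothetical, and the register file is a plain struct rather than anything read via ptrace. It writes a caller-supplied value (for example, one saved during recording) into the destination register and steps the program counter past the 4-byte instruction.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified AArch64 register file for this sketch.
struct SketchRegs {
  uint64_t x[31] = {};  // general-purpose registers x0..x30
  uint64_t pc = 0;
};

// Emulates "MRS Xt, CNTVCT_EL0" at regs.pc by supplying `value`.
// Returns false and leaves regs untouched if insn is a different instruction.
bool emulate_cntvct_read(SketchRegs& regs, uint32_t insn, uint64_t value) {
  assert((regs.pc & 3) == 0);  // AArch64 instructions are 4-byte aligned
  if ((insn & ~0x1fu) != 0xd53be040u) {
    return false;
  }
  unsigned rt = insn & 0x1fu;
  if (rt != 31) {
    regs.x[rt] = value;  // Rt == 31 encodes XZR, so the write is discarded
  }
  regs.pc += 4;  // skip the emulated instruction
  return true;
}
```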