Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

i#38: Attach injection on Linux #5019

Merged
merged 26 commits into from
Aug 16, 2021
Merged

Conversation

M3m3M4n
Copy link
Contributor

@M3m3M4n M3m3M4n commented Jul 20, 2021

Improve upon pull request #3328:

@derekbruening
Copy link
Contributor

Looks like code style violations are causing the test failures

Copy link
Contributor

@derekbruening derekbruening left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR. If you could go through the #3328 review comments it looks like most of them still apply here.

core/CMakeLists.txt Outdated Show resolved Hide resolved
@M3m3M4n M3m3M4n changed the title i#38: Attach injection on Linux i#38: Improve Attach injection on Linux & Windows Jul 23, 2021
@M3m3M4n
Copy link
Contributor Author

M3m3M4n commented Jul 25, 2021

I need some help to finish this PR on windows since I mainly work on linux

@M3m3M4n
Copy link
Contributor Author

M3m3M4n commented Jul 25, 2021

Few test are failing on vs2017 but I'm not quite sure why

@derekbruening
Copy link
Contributor

Few test are failing on vs2017 but I'm not quite sure why

The logs show a lot of timeouts: something is making them hang. Unfortunately the log does not give enough info. Either tmate would have to be used https://dynamorio.org/page_test_suite.html#autotoc_md261 or a local repro or maybe we can spot something in the patch.

@derekbruening
Copy link
Contributor

The diff displayed here is massive: 27 files changed! Looks like changes that are already in the repo: something was off about the merge that makes the PR diff display show those as part of this change? Plus the auto-merge check above says there are conflicts in core/ir/aarch64/codec.txt. If you could try to re-merge to get the diff to show just the new changes we can do another review round.

summershrimp and others added 9 commits July 27, 2021 10:36
For linux, we use ptrace() to inject dynamorio to running process.
For windows, we get target process handle using DebugActiveProcess()
and inject with original dr_inject_process_inject().
@M3m3M4n
Copy link
Contributor Author

M3m3M4n commented Jul 27, 2021

I accidentally squashed a master merge into a commit. Everything look nice after rebasing.

@M3m3M4n
Copy link
Contributor Author

M3m3M4n commented Jul 28, 2021

Also, can I request some clarification?

Testing ptrace inject with a toy program that does fork() then wait(). I found out that dynamorio cannot inject until after wait() has completed. This is the same with all blocking syscall I've tested with like accept...
Tracing back to injector.c -> inject_ptrace()

if (!ptrace_singlestep(info->pid)) return false;

I thought the PTRACE_ATTACH prior would send SIGSTOP and force any running blocking syscall to return EINTR back to userland? But seems like the blocking syscall continues to run and single stepping only returns after the syscall has completed.
(like forked child exit for wait and new connection for accept)

So to stop dynamorio from attaching I could just do something like wait and never let it return?

@derekbruening
Copy link
Contributor

I thought the PTRACE_ATTACH prior would send SIGSTOP and force any running blocking syscall to return EINTR back to userland? But seems like the blocking syscall continues to run and single stepping only returns after the syscall has completed.
(like forked child exit for wait and new connection for accept)

I don't know w/o further investigation -- maybe @summershrimp has some input here

@summershrimp
Copy link
Contributor

I thought the PTRACE_ATTACH prior would send SIGSTOP and force any running blocking syscall to return EINTR back to userland? But seems like the blocking syscall continues to run and single stepping only returns after the syscall has completed.
(like forked child exit for wait and new connection for accept)

I don't know w/o further investigation -- maybe @summershrimp has some input here

Something strange things happens if you modified PC(IP) register when the target process stuck at blocking syscall , the ip will go 4 bytes more than we expected. So I do a singlestep to make sure the process is not at a blocking syscall .

@summershrimp
Copy link
Contributor

summershrimp commented Jul 29, 2021

Actually it's better to singlestep till the PC is at any text segment. If PC is at vdso, we still can't inject DR code.
And it's better to use the entrypoint of target process to inject codes rather than current pc, since the entrypoint of most program is only run once.

pc = (app_pc)ALIGN_BACKWARD(regs.REG_PC_FIELD, PAGE_SIZE);

@M3m3M4n
Copy link
Contributor Author

M3m3M4n commented Jul 29, 2021

I think we should put note about ptrace_inject-ing to process in middle of blocking syscall somewhere since dynamorio might have to wait indefinitely for the call to complete. And we can't force syscall to return by sending signal either.

@derekbruening
Copy link
Contributor

I think we should put note about ptrace_inject-ing to process in middle of blocking syscall somewhere since dynamorio might have to wait indefinitely for the call to complete. And we can't force syscall to return by sending signal either.

Sure, to start with: but I would expect a way around this since a debugger is able to attach in such a situation right?

@M3m3M4n
Copy link
Contributor Author

M3m3M4n commented Jul 30, 2021

I think we should put note about ptrace_inject-ing to process in middle of blocking syscall somewhere since dynamorio might have to wait indefinitely for the call to complete. And we can't force syscall to return by sending signal either.

Sure, to start with: but I would expect a way around this since a debugger is able to attach in such a situation right?

I did take a quick look into GDB source. Turn out while under ptrace, SIGSTOP only stop only 1 thread of the group. GDB handles this by sending SIGSTOP to all thread by tkill.

However blocking syscalls still block after continuing. But...

While stopping at the thread that issue wait() syscall, the PC always points to the instruction right after syscall instruction. Interestingly, if I put a breakpoint at that instruction address then continue, the syscall failed (or interrupted) then return while GDB reports the program receives SIGINT? This does not happen if I continue or single stepping without the breakpoint. Not quite sure what happens here because single stepping also generate a temporary breakpoint there AFAIK.

Maybe because some how GDB send SIGINT to the test program and ptrace captured it but not forward to tracee? Need more investigation but we can use this behavior to inject after syscall return.

@M3m3M4n
Copy link
Contributor Author

M3m3M4n commented Aug 6, 2021

Bug:
Interrupting syscall this way seem to return errno 512 (maybe leaked ERESTARTSYS from kernel?)
Should resolve / fake this somehow...

only allow ptrace attach to skip syscall with x86 as lack of testing in arm/aarch64
@M3m3M4n
Copy link
Contributor Author

M3m3M4n commented Aug 7, 2021

@derekbruening I think I am done with this PR for now
Attaching now fully functional with X86 linux:

  • Ptrace attach can work even if we are within blocking syscalls. Default behavior is wait for blocking syscalls to complete.
  • I believe errno is leaked from kernel side as we ptrace, so it is masked with EINTR.

Attaching code for ARM/AARCH64 linux can be implemented in a similar way, I might open another PR later when I have time.

@derekbruening
Copy link
Contributor

Something strange things happens if you modified PC(IP) register when the target process stuck at blocking syscall , the ip will go 4 bytes more than we expected. So I do a singlestep to make sure the process is not at a blocking syscall .

@summershrimp -- but if the process PC is at the start of a blocking syscall (maybe b/c it's an auto-restart syscall we interrupted), won't a singlestep block and essentially hang our attach?

Actually it's better to singlestep till the PC is at any text segment. If PC is at vdso, we still can't inject DR code.

If you could elaborate here: why does the current PC matter? I would expect we should be able to set the PC to our shellcode to inject our library and then have our library start execution wherever.

And it's better to use the entrypoint of target process to inject codes rather than current pc, since the entrypoint of most program is only run once.

But we're attaching to a running process who presumably is long past the entry point and will never return there?

Copy link
Contributor

@derekbruening derekbruening left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the PC changes: I don't see why we would need nops in DR code when we're changing PC values in app code only; I think these PC changes would break auto-restart syscalls in an app.

core/lib/dr_inject.h Outdated Show resolved Hide resolved
core/lib/dr_inject.h Outdated Show resolved Hide resolved
core/lib/dr_inject.h Outdated Show resolved Hide resolved
core/unix/injector.c Outdated Show resolved Hide resolved
core/unix/injector.c Show resolved Hide resolved
core/unix/injector.c Show resolved Hide resolved
core/unix/injector.c Show resolved Hide resolved
core/unix/injector.c Show resolved Hide resolved
core/unix/injector.c Show resolved Hide resolved
tools/drdeploy.c Show resolved Hide resolved
core/arch/x86/x86.asm Outdated Show resolved Hide resolved
core/lib/dr_inject.h Outdated Show resolved Hide resolved
core/unix/injector.c Show resolved Hide resolved
core/unix/injector.c Show resolved Hide resolved
M3m3M4n and others added 3 commits August 15, 2021 01:03
Fix typo; add period at end of sentence.
Fix spelling
Copy link
Contributor

@derekbruening derekbruening left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the patch and working through the review process.

@derekbruening
Copy link
Contributor

run arm tests

@derekbruening
Copy link
Contributor

add to whitelist

@derekbruening
Copy link
Contributor

run arm tests

@derekbruening
Copy link
Contributor

@AssadHashmi -- does the "add to whitelist" not work for a PR from a forked repository?

@derekbruening derekbruening merged commit bf246bf into DynamoRIO:master Aug 16, 2021
@derekbruening
Copy link
Contributor

@M3m3M4n -- we forgot to update the release notes list of new features in api/docs/release.dox. That can be added as part of the PR that adds a regression test.

@AssadHashmi
Copy link
Contributor

@AssadHashmi -- does the "add to whitelist" not work for a PR from a forked repository?

Hmm...it should do. I've increased your privilege level in the relevant ACL config.
Let me know if add to whitelist doesn't work for you in future. For now I've manually enabled @M3m3M4n for testing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants