syscall auto-restart handling does not handle native syscalls: DR, clients, attach #2659

derekbruening · 2017-10-04T04:17:24Z

i#1145 failed to add auto-restart handling for signals that don't enter
record_pending_signal: primarily our own suspend signals, but also nudge
signals. This includes all native interruption points: DR and client C code, app code for attach.

Xref #1145, #1216

Support auto-restart in client C code

So far we've had to explicitly put loops checking for EINTR into client
code where we wouldn't have to natively. Proper handling here would
eliminate that.

Incorrect auto-restart for signals used during attach results in broken apps

If an attach takeover signal interrupts an auto-restart syscall, we'll
break the app. We have seen this on some large apps.

Proposal: pass SA_RESTART to our own handlers!

A simple proposal here is to set SA_RESTART on our own handler. This will
cause the kernel to adjust the sigcontext to restore the original xax and
shift the PC back by 2. This will solve the native cases (attach, client,
DR code) without any further work.

I think the reason #1145 didn't use SA_RESTART was a mis-understanding of
when the kernel adjusts the PC: if the kernel adjusts on sigreturn then it
messes with cases where DR changes the PC. But the kernel adjusts prior to
setting up the frame.

Inlined syscalls, deliver signal immediately and leave context alone

For interrupting an inlined syscall in the cache: deliver the signal
immediately as we do now, just as though the signal arrived just before the
syscall executed. There is no pre- or post-syscall handling to worry
about. No need to change the context: use the one the kernel reset. For a
non-auto-restart syscall, leave that alone too.

For from-dispatch syscalls, have to undo the kernel's reset and then emulate

For interrupting do_syscall: first check get_at_syscall() to confirm we
executed the pre-syscall part. If not, handle normally. If so, we either
have to go back to dispatch to deliver the signal, in which case we may as
well undo the pc-2 and xax restore and use our auto-restart emulation code,
or we deliver the signal immediately and have problems w/ cross-syscall
state (just like "CLS" on Windows). Having to keep emulating seems
backward but it does seem like the simplest solution? Or should we try to
call handle_post_system_call() and then deliver the signal immediately?

For delivering immediately, xref i#1206 "must deliver signal received in
do_syscall immediately". We canceled that b/c w/ the actual interruption
w/o SA_RESTART we didn't need it.

w/o doing anything we get a HANG on the eintr-noinline test

It records the signal, and returns -- but that return restarts the
syscall, which blocks, and we then never deliver the signal.

-ignore_syscalls doesn't hang b/c:

record_pending_signal(23) from cache pc 0x000000005535b044
        delaying until exit F2033
signal interrupted pre/post syscall itself so delivering now

We can now live w/ being inaccurate at identifying auto-restart syscalls

Our current check just looks at the syscall number which is not enough.
But with SA_RESTART, the kernel tells us which ones are auto-restart, and
we only need our own check for i#1216's nops -- and for the nops we can be
conservative and insert nops where the syscall might be auto-restart.

Transparency of post-syscall handler

Re: is it un-transparent to have EINTR returned for an auto-restart
syscall: since no app code is run and it's just DR and client-visible state
it seeems ok. I guess a client counting syscall returns would have it
wrong.

Avoiding this seems hard: we'd have to stack dcontexts like on Windows.

We can just pass in the xax/r0 value to restore

We can avoid storing it into TLS on ARM, and decoding on x86.

is_after_syscall checks in synch, etc. have to handle pc==syscall

We have to distinguish the case of not yet having executed the syscall yet,
from having executed it but it got interrupted with the pc set backward.
If we only have to do this for gencode handled from dispatch,
get_at_syscall() will work.

w/ SA_RESTART, can we remove all the emulation code? and the nop insertion? => no

No: nop insertion is for non-auto-restart syscalls, and the emulation is
needed to get back to dispatch. We can make it more accurate.

What about sysenter restart: xref us re-copying xsp into xbp

Add a test w/ a client doing an eintr.c-type syscall

The text was updated successfully, but these errors were encountered:

derekbruening · 2017-10-05T03:53:53Z

A big challenge here is sysenter on recent kernels where the syscall restart point is an int 0x80 rather than a jmp back up to the mov esp,ebp;sysenter: the problem is that the int 0x80's fallthrough is our hook jmp, meaning we can't leave control at the syscall restart point when going native and we have to be careful not to interpret our own hook.

Changes the strategy for handling interrupted syscalls to use SA_RESTART, which adds easy handling of interruptions of auto-restart syscalls in native code, such as during attach or in client C code. This also simplifies auto-restart of app syscalls, as we now have foolproof identification of auto-restart situations (before our syscall-number-based check was inaccurate) and the kernel has restored the clobbered register value for us. That eliminates the TLS store we were doing on ARM and the decode on x86. For fragment-inlined syscalls, we don't need to do anything anymore: we deliver the signal immediately and the kernel has already set up the proper resumption state. For gencode syscalls, for sane post-syscall handling, we leverage i#1145's auto-restart emulation and undo what the kernel did. This lets us go back to dispatch for a clean sequence of events. Updates synch and translation checks for post-syscall to also account for syscall-restart locations. For sysenter, this change means we now see threads at the int 0x80 restart point whose fall-through is our vsyscall hook. To avoid interpreting our own hook we map that PC to our displaced code, and mapping back in dr_fragment_app_pc(). Fixes a race between going native and unhooking the vsyscall (to avoid takeover problems) by redirecting threads at vsyscall or the int 0x80 restart to our own gencode. Adds a test of a client making a blocking auto-restart syscall to ensure it is not terminated with EINTR. This was not simple to arrange in a non-racy manner and required adding a new feature of immediately delivering a signal that arrives in DR or client code where the receiving thread is at a safe point (because if we delay, the test cannot ensure the signal was received). Fixes #2659

With SA_RESTART (#2659), we cannot distinguish not-yet-executed-syscall from syscall-was-interrupted-in-the-kernel. This doesn't matter for most syscalls, but it does matter for sigreturn, whose asynch_target points somewhere other than right after the syscall, so we exclude it (it can't be interrupted so we know we haven't executed it yet). Fixes #2995

Avoids translating a new signal while suspended, as the resulting lock acquisitions can lead to deadlock. An incoming signal there should always be delayable in any case: the only reason we were trying to deliver immediately was due to a too-broad check put in for #2659. It is not easy to create a test that reliably exercises these races: our existing detach_spawn and other tests are the closest and this should reduce their flakiness. Fixes #3026

derekbruening added Bug-AppFail OpSys-Linux labels Oct 4, 2017

derekbruening self-assigned this Oct 4, 2017

derekbruening mentioned this issue Oct 4, 2017

ASSERT core/dynamo.c:1139 on running sol on Ubuntu #1216

Closed

derekbruening mentioned this issue Oct 6, 2017

i#2659 syscall restart: switch to using SA_RESTART #2660

Merged

derekbruening closed this as completed in #2660 Oct 6, 2017

derekbruening mentioned this issue Oct 27, 2017

sysenter_is_int80 tests failing after SA_RESTART change #2674

Closed

derekbruening mentioned this issue Jan 17, 2018

HANG delivering signal that interrupted DR at "safe spot" #2805

Closed

This was referenced May 14, 2018

i#2995: do not blindly "restart" sigreturn #3007

Merged

ASSERT signal.c:3728 decode() != NULL && instr_is_syscall() #2995

Closed

derekbruening mentioned this issue May 24, 2018

hang during detach due to xl8 of signal arriving while suspended #3026

Closed

derekbruening mentioned this issue May 24, 2018

i#3026: do not translate new signals while suspended #3027

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

syscall auto-restart handling does not handle native syscalls: DR, clients, attach #2659

syscall auto-restart handling does not handle native syscalls: DR, clients, attach #2659

derekbruening commented Oct 4, 2017 •

edited

Loading

derekbruening commented Oct 5, 2017

syscall auto-restart handling does not handle native syscalls: DR, clients, attach #2659

syscall auto-restart handling does not handle native syscalls: DR, clients, attach #2659

Comments

derekbruening commented Oct 4, 2017 • edited Loading

Support auto-restart in client C code

Incorrect auto-restart for signals used during attach results in broken apps

Proposal: pass SA_RESTART to our own handlers!

Inlined syscalls, deliver signal immediately and leave context alone

For from-dispatch syscalls, have to undo the kernel's reset and then emulate

w/o doing anything we get a HANG on the eintr-noinline test

We can now live w/ being inaccurate at identifying auto-restart syscalls

Transparency of post-syscall handler

We can just pass in the xax/r0 value to restore

is_after_syscall checks in synch, etc. have to handle pc==syscall

w/ SA_RESTART, can we remove all the emulation code? and the nop insertion? => no

What about sysenter restart: xref us re-copying xsp into xbp

Add a test w/ a client doing an eintr.c-type syscall

derekbruening commented Oct 5, 2017

derekbruening commented Oct 4, 2017 •

edited

Loading