-
Notifications
You must be signed in to change notification settings - Fork 203
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[PAL/Linux-SGX] Add AEX-Notify support in exception handling flow #1531
base: master
Are you sure you want to change the base?
Conversation
Add support of adding `sgx.enable_aex_notify = true` to manifest. When AEX-Notify is enabled, will not exit the enclave during the stage-1 exception handler. Instead, a new instruct EDECCSSA will be triggered to switch to SSA0 and directly jump to stage-2 exception handler. Co-authored-by: Gu, Junjun <junjun.gu@intel.com> Co-authored-by: Dmitrii Kuvaiskii <dmitrii.kuvaiskii@intel.com> Co-authored-by: Kailun Qin <kailun.qin@intel.com> Signed-off-by: Gu, Junjun <junjun.gu@intel.com> Signed-off-by: Dmitrii Kuvaiskii <dmitrii.kuvaiskii@intel.com> Signed-off-by: Kailun Qin <kailun.qin@intel.com> Signed-off-by: Zhang, Lili Z <lili.z.zhang@intel.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 5 files reviewed, 1 unresolved discussion, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @lzha101)
pal/src/host/linux-sgx/host_entry.S
line 86 at r1 (raw file):
movq %rsp, %rbp .cfi_def_cfa_register %rbp subq $RED_ZONE_SIZE, %rsp
Does this really work? When I played with similar modifications in #1944, I ended up adding the red zone calculation to "before any RSP-changing instruction", which is pushq %rax
above.
That's because the red zone from the "normal execution" of the untrusted Gramine runtime is 128 bytes below the RSP at the point of async_exit_pointer
. So the "normal execution" may use these bytes for its own local purposes. And without the immediate red zone subtraction, this code (these pushq %rax
and co) overrides the local variables of "normal execution", and Gramine breaks.
See how I changed the code in #1944.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 5 of 5 files at r1, all commit messages.
Reviewable status: all files reviewed, 23 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @lzha101)
-- commits
line 8 at r1:
This commit message will need to be reworked. It should be much more verbose, explaining all the flows added in this PR.
pal/src/host/linux-sgx/enclave_entry.S
line 229 at r1 (raw file):
# definitely-malicious exception (correct untrusted runtime would never generate "0" event) FAIL_LOOP
This change is not relevant in the latest code on master
, should be removed on rebase
pal/src/host/linux-sgx/enclave_entry.S
line 268 at r1 (raw file):
movq %rdx, %rbx # upon EENTER the exit address was in RDX, mov it to RBX for EEXIT movq $EEXIT, %rax enclu
This is a simplification for this corner case, and is unrelated to this PR. Please extract to a separate PR.
pal/src/host/linux-sgx/enclave_entry.S
line 549 at r1 (raw file):
# copy the whole SSA[0].XSAVE region to the CPU context's XSAVE on stack; # __restore_xregs / __save_xregs clobber RDX so need to stash it in R10 movq %rdx, %r10
If I understand this correctly, we replace RBX with R10 because we will still need RBX (the base to the context/registers of SSA0) further in AEX-Notify flows. So we replace RBX with some register that is unused, which happens to be R10.
I would extract this change also to a separate PR.
pal/src/host/linux-sgx/enclave_entry.S
line 621 at r1 (raw file):
# finally jump to the stage-2 C exception handler .Lfinalize_context_after_deccssa_inst0: jmp *%r11
Even though an async signal can arrive at this exact jmp
instruction, this should be safe. Because the SSA0 can be considered having the correct and consistent context for the "stage-2 exception handler", including this R11 register which points to the stage-2 exception handling routine in our C code.
That's just info for other reviewers and authors. I hope I understand this corner-case data race correctly.
pal/src/host/linux-sgx/enclave_entry.S
line 658 at r1 (raw file):
.cfi_offset %rbx, -24 # Disable aex-notify before stack switch
Why do we need to do this? This part of code I don't understand.
What bad thing can happen during OCALL preparation?
pal/src/host/linux-sgx/enclave_entry.S
line 823 at r1 (raw file):
1: # check if it is ready to enable AEX-Notify
Why do we need to do this? This part of code I don't understand.
What bad thing can happen during OCALL finalization?
pal/src/host/linux-sgx/enclave_entry.S
line 844 at r1 (raw file):
# there was some event, call _PalHandleExternalEvent(event, uc, xregs_state) # Note we don't check AEX-Notify here. If we disable it during ocall-exit and # the execution comes below code, the aex-notify will be enabled after event handling
Again, I don't understand the comment. Why do we care about OCALLs?
pal/src/host/linux-sgx/host_entry.S
line 86 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Does this really work? When I played with similar modifications in #1944, I ended up adding the red zone calculation to "before any RSP-changing instruction", which is
pushq %rax
above.That's because the red zone from the "normal execution" of the untrusted Gramine runtime is 128 bytes below the RSP at the point of
async_exit_pointer
. So the "normal execution" may use these bytes for its own local purposes. And without the immediate red zone subtraction, this code (thesepushq %rax
and co) overrides the local variables of "normal execution", and Gramine breaks.See how I changed the code in #1944.
Ok, I fixed a similar issue in #1944, and it was copy-pasted into #1857. So after one of those PRs is merged, we can rebase and get rid of these things here.
pal/src/host/linux-sgx/host_entry.S
line 95 at r1 (raw file):
#endif # Check if there are sync/async signals pending and invoke in-enclave stage-1 handler if any
What this comment doesn't mention is:
- If AEX Notify was enabled in the enclave thread, then the enclave never returns from the stage-1 handler invocation in untrusted runtime and the following code in this asm file is not executed.
- If AEX Notify was disabled/not used in the enclave thread, then the enclave returns from the stage-1 handler invocation to this code in this asm file, so we need to prepare the ERESUME flow.
Please add these or similar comments here.
pal/src/host/linux-sgx/host_entry.S
line 114 at r1 (raw file):
# In case of normal ERESUME, RDI is not used; # In case of ERESUME-morphed-into-EENTER, RDI is external event in flow .Lcssa1_exception movq $PAL_EVENT_INTERRUPTED, %rdi
This looks correct but needs more comments.
IIUC, we get to this line of code only in following cases:
- No AEX-Notify: then ERESUME never morphs into EENTER, so the value in RDI is ignored. This code snippet becomes a no-op.
- AEX-Notify enabled on this enclave thread: previously we called
sgx_handle_aex_signal()
which would raise a signal (if any) into the enclave, and AEX-Notify would ensure thatsgx_handle_aex_signal()
does not return (but instead performs EDECCSSA inside the enclave and stays there). Therefore, the only way we could arrive to this line of code is that there was no pending signal, andsgx_handle_aex_signal()
didn't do anything. This means that ERESUME will morth into EENTER, and we need to specify RDI (external event). And since there was no external event really, we put a kinda dummyPAL_EVENT_INTERRUPTED
(aka SIGCONT). So we survive all the checks inside the SGX enclave, because SIGCONT can always arrive and is benign.
Well, actually, doesn't this introduce a heavy performance hit? On each AEX (without any real signals), we will call into the stage-1 handler, then stage-2 handler will handle dummy SIGCONT, and only then the execution will continue. Sounds like a lot of wasted CPU cycles.
pal/src/host/linux-sgx/host_entry.S
line 124 at r1 (raw file):
eresume_pointer: # perform ERESUME (RAX already contains "ERESUME" because that's what AEX hardware flow does) enclu
This is also unrelated to this PR, so could be extracted in a separate PR (together with the other ones I proposed)
pal/src/host/linux-sgx/host_entry.S
line 140 at r1 (raw file):
# other arguments: RDI - event (sync or async signal) # below logic is the same as for sgx_ecall(), see comments for that function
Looks like the only point of this duplicated code is to have global sgx_raise_eenter_instr
on ENCLU, so that we can use this global symbol in the handle_sync_signal()
check.
This is a waste of assembly. Let's just add the global symbol like global sgx_eenter_instr
on the .Ldo_ecall
code path, and use that one. Since normal EENTER never fails, this should be sufficient. But if you want to double-check that it's exactly during the sgx_raise()
flow, you can probably put some magic value in one of the GPRs (or in a new thread-local variable), and check it additionally.
pal/src/host/linux-sgx/host_exception.c
line 96 at r1 (raw file):
} static bool interrupted_in_aex(void) {
This is unrelated rename and can be extracted into a separate PR (together with the other ones I mentioned elsewhere)
pal/src/host/linux-sgx/host_exception.c
line 121 at r1 (raw file):
if (interrupted_in_enclave(uc)) { /* exception happened in app/LibOS/trusted PAL code, mark to handle signal inside enclave */ assert(pal_get_host_tcb()->aex_sync_event == PAL_EVENT_NO_EVENT);
I would actually replace this assert()
with an explicit if (...) { log_error("Nested sync signal, impossible!");BUG(); }
This way we'll also catch such bugs/corner cases in Release mode.
pal/src/host/linux-sgx/host_exception.c
line 123 at r1 (raw file):
assert(pal_get_host_tcb()->aex_sync_event == PAL_EVENT_NO_EVENT); pal_get_host_tcb()->aex_sync_event = event; pal_get_host_tcb()->sync_signal_cnt++;
This removal of sgx_raise(event)
changes flows of Gramine significantly. Since this change also applies to non-AEX-Notify flows, I would really prefer to have it as a separate, preparation PR.
I will also describe these new flows here: #1948
pal/src/host/linux-sgx/host_exception.c
line 171 at r1 (raw file):
DO_SYSCALL(tkill, g_rpc_queue->rpc_threads[i], SIGUSR2); assert(event == PAL_EVENT_INTERRUPTED || event == PAL_EVENT_QUIT);
This is unrelated move-of-assert and can be extracted into a separate PR (together with the other ones I mentioned elsewhere)
pal/src/host/linux-sgx/host_exception.c
line 209 at r1 (raw file):
* -- now the regular context must inform the enclave about these events. This function is * potentially noreturn -- if there is at least one signal, and the enclave is ready to handle it, * then the call to sgx_raise() never returns. Only one of potentially two signals (one sync and
sgx_raise()
doesn't return only if we take into account AEX Notify. But for old flows (without AEX Notify), sgx_raise()
will always return. This should be reflected in this comment.
pal/src/host/linux-sgx/host_exception.c
line 214 at r1 (raw file):
* cannot occur while in this function, but new async signals can occur (since we are in regular * context and cannot block async signals), thus handling async signals must be aware of concurrent * signal handling code. */
We must somehow note that async signals are special in Gramine -- there is only SIGCONT which is kinda dummy (can be ignored) and SIGTERM (which is injected only once anyway).
So we don't need a queue of pending async signals, we can get away with just a single slot (which is the pal_get_host_tcb()->aex_async_event
variable).
pal/src/host/linux-sgx/host_exception.c
line 228 at r1 (raw file):
PAL_EVENT_NO_EVENT, __ATOMIC_RELAXED); if (event != PAL_EVENT_NO_EVENT) { /* if async event is consumed by the enclave, then below sgx_raise() does not return;
Again, this misses the old, no-AEX-Notify flows. The comment must also explain that sgx_raise()
will always return in old flows.
pal/src/host/linux-sgx/host_exception.c
line 235 at r1 (raw file):
enum pal_event no_event = PAL_EVENT_NO_EVENT; __atomic_compare_exchange_n(&pal_get_host_tcb()->aex_async_event, &no_event, event, /*weak=*/false, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
What is this? Why do we re-add the async signal (event
) again as a pending signal to this enclave thread? We already consumed this event, what's the point of re-adding it.
Update: ok, this is probably because you assume AEX-Notify flows only in this code snippet. In the AEX-Notify case, sgx_raise()
returns and we end up here only if the enclave was in the middle of stage-1 handler. So this async signal (event) was not handled inside the enclave, so we re-instate it here. Ok, makes sense to me.
pal/src/host/linux-sgx/pal_exception.c
line 22 at r1 (raw file):
__attribute_no_sanitize_address noreturn static void apply_mitigation_handler_and_restore_sgx_context(sgx_cpu_context_t* uc, PAL_XREGS_STATE* xregs_state)
Fix code style and wrap to 100-chars-per-line
pal/src/host/linux-sgx/pal_exception.c
line 47 at r1 (raw file):
apply_mitigation_handler_and_restore_sgx_context(uc, xregs_state); else _restore_sgx_context(uc, xregs_state);
This looks ugly. Just do the check on AEX-Notify inside apply_mitigation_handler_and_restore_sgx_context()
, and call only this function here, without if-else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 24 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @lzha101)
pal/src/host/linux-sgx/host_exception.c
line 222 at r1 (raw file):
pal_get_host_tcb()->aex_sync_event = PAL_EVENT_NO_EVENT; sgx_raise(event); return;
We should not get to this return
(if we run with AEX-Notify enabled), because sync signals must always be consumed and then EDECCSSA'ed. I suggest to add the check:
if (aex_notify_enabled) {
log_error("Sync signal %d was not consumed by the enclave (in AEX-Notify mode)", event);
BUG();
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 26 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @lzha101)
pal/src/host/linux-sgx/enclave_entry.S
line 266 at r1 (raw file):
# We are interrupted during the never-returning OCALL_EXIT. Because the thread is going to exit # anyway, we can ignore this exception. movq %rdx, %rbx # upon EENTER the exit address was in RDX, mov it to RBX for EEXIT
I was surprised by this because Intel SDM says that EENTER puts the exit address into RCX. And this asm code (that leads to this code snippet) doesn't copy RCX into RDX. So who put the exit address into RDX?
It turns out the untrusted asm code does this before calling EENTER to raise an exception (in SSA=1):
gramine/pal/src/host/linux-sgx/host_entry.S
Line 126 in 10e9353
leaq .Lafter_resume(%rip), %rdx |
This seems redundant, as we could just consult the RCX register. Note that we cannot consult the %gs:SGX_ECALL_RETURN_ADDR
variable because it is not be set in the exception-handling context (and it anyway pertains only to the normal context).
But need to be careful to make sure that RCX was not clobbered in the meantime.
Maybe a good candidate for a separate PR.
pal/src/host/linux-sgx/enclave_entry.S
line 574 at r1 (raw file):
# upon EENTER the exit address was in RDX, mov it to RBX for EEXIT movq %rdx, %rbx
ditto (seems redundant, as we could just consult the RCX register)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 24 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @lzha101)
pal/src/host/linux-sgx/enclave_entry.S
line 266 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I was surprised by this because Intel SDM says that EENTER puts the exit address into RCX. And this asm code (that leads to this code snippet) doesn't copy RCX into RDX. So who put the exit address into RDX?
It turns out the untrusted asm code does this before calling EENTER to raise an exception (in SSA=1):
gramine/pal/src/host/linux-sgx/host_entry.S
Line 126 in 10e9353
leaq .Lafter_resume(%rip), %rdx This seems redundant, as we could just consult the RCX register. Note that we cannot consult the
%gs:SGX_ECALL_RETURN_ADDR
variable because it is not be set in the exception-handling context (and it anyway pertains only to the normal context).But need to be careful to make sure that RCX was not clobbered in the meantime.
Maybe a good candidate for a separate PR.
I'm taking it back. We can't simplify this to use RCX instead of RDX.
That's because the exit address is not always RCX ("Address of IP following EENTER"). In particular, this is not true for OCALLs: in their cases, EEXIT must jump to the OCALL untrusted code, and so EENTER must be instructed to jump to that OCALL untrusted code via RDX. Here's the corresponding code:
leaq .Lsgx_entry(%rip), %rdx |
pal/src/host/linux-sgx/enclave_entry.S
line 574 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
ditto (seems redundant, as we could just consult the RCX register)
I'm taking it back, we can't simplify that. See other comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 24 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @lzha101)
pal/src/host/linux-sgx/host_exception.c
line 235 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
What is this? Why do we re-add the async signal (
event
) again as a pending signal to this enclave thread? We already consumed this event, what's the point of re-adding it.Update: ok, this is probably because you assume AEX-Notify flows only in this code snippet. In the AEX-Notify case,
sgx_raise()
returns and we end up here only if the enclave was in the middle of stage-1 handler. So this async signal (event) was not handled inside the enclave, so we re-instate it here. Ok, makes sense to me.
Just to make me comment less confusing: I mean that in no-AEX-Notify flow, sgx_raise()
always returns here, so there is no point in re-injecting the same async signal (it was consumed). Thus, what this flow misses is a check like this:
sgx_raise(event);
/* we end up here if there's no AEX-Notify (but the signal was consumed) or there is AEX-Notify
* but couldn't EENTER because enclave executes stage-1 handler in SSA 1 (signal was not consumed) */
if (g_aex_notify_enabled) {
enum pal_event no_event = PAL_EVENT_NO_EVENT;
__atomic_compare_exchange_n(&pal_get_host_tcb()->aex_async_event, &no_event, event,
/*weak=*/false, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 26 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @lzha101)
a discussion (no related file):
I prepared the series of posts with diagrams on current Gramine signal-handling flows and on AEX-Notify flows: #1948
My proposal follows the implementation in this PR (bar some comments and nitpicks).
We need to discuss these posts and diagrams, to agree on the design choices & implementation.
a discussion (no related file):
Based on my explanations in #1948, I actually believe it will make more sense to have a preliminary PR that modifies signal handling flows without AEX-Notify first. That's the crux of my middle post: #1948 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 26 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @lzha101)
pal/src/host/linux-sgx/host_exception.c
line 235 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Just to make me comment less confusing: I mean that in no-AEX-Notify flow,
sgx_raise()
always returns here, so there is no point in re-injecting the same async signal (it was consumed). Thus, what this flow misses is a check like this:sgx_raise(event); /* we end up here if there's no AEX-Notify (but the signal was consumed) or there is AEX-Notify * but couldn't EENTER because enclave executes stage-1 handler in SSA 1 (signal was not consumed) */ if (g_aex_notify_enabled) { enum pal_event no_event = PAL_EVENT_NO_EVENT; __atomic_compare_exchange_n(&pal_get_host_tcb()->aex_async_event, &no_event, event, /*weak=*/false, __ATOMIC_RELAXED, __ATOMIC_RELAXED); }
For info: my proposed code snippet was also wrong, because in no-AEX-Notify flow, if sgx_raise()
failed to inject the async signal (because CSSA=2 and EENTER faults with a #GP), then this signal would be lost. (That's because aex_async_event
is reset to NO_EVENT in __atomic_exchange_n()
above, and there is no code that would restore it back to event
in case of failed signal injection).
I fix it in my new version of the code by setting aex_async_event
to the value of RDI in the special case of handle_sync_signal()
with PAL_EVENT_MEMFAULT (see code above in this file).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 26 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @lzha101)
pal/src/host/linux-sgx/host_entry.S
line 114 at r1 (raw file):
Well, actually, doesn't this introduce a heavy performance hit? On each AEX (without any real signals), we will call into the stage-1 handler, then stage-2 handler will handle dummy SIGCONT, and only then the execution will continue. Sounds like a lot of wasted CPU cycles.
See this answer from one of AEX Notify authors: #1948 (reply in thread)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 26 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @lzha101)
pal/src/host/linux-sgx/enclave_entry.S
line 621 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Even though an async signal can arrive at this exact
jmp
instruction, this should be safe. Because the SSA0 can be considered having the correct and consistent context for the "stage-2 exception handler", including this R11 register which points to the stage-2 exception handling routine in our C code.That's just info for other reviewers and authors. I hope I understand this corner-case data race correctly.
I was wrong. If an async signal would happen here, then SSA[0].GPRSGX.RIP
would point to this jmp instruction, and the new stage-1 singal handler will end up doing movq SGX_GPR_RIP(%rbx), %r11
(see above), which means that R11 will contain RIP of this jmp instruction, basically meaning jmp *itself
, which results in an infinite loop of this instruction jumping onto itself. So this code is wrong and may lead to indeterministic hangs.
In my new PR, I modified this corner case to skip jmp
when this is detected.
According to PR #1488, will construct three PRs to support AEX-Notify in Gramine. The first part could be found in #1530. This PR includes the second part.
Add support of adding
sgx.enable_aex_notify = true
to manifest. When AEX-Notify is enabled, will not exit the enclave during the stage-1 exception handler. Instead, a new instruct EDECCSSA will be triggered to switch to SSA0 and directly jump to stage-2 exception handler.Description of the changes
How to test this PR?
This change is