Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check for crash in do_ocall to prevent any data leakage. #138

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

deeglaze
Copy link
Contributor

This PR is a hardening measure, as there is no real way to "safely abort" from a multithreaded user-space program without OS assistance or defensive coding on behalf of the user.

Enclaves threads run in user space, so there is no way to signal to other threads to abort other than with shared state or depending on an untrusted operating system. When one thread aborts, g_enclave_state is set to $ENCLAVE_CRASHED, so another thread that might be reading from corrupted state (because we aborted on an error condition) is prevented from exposing secrets accidentally. This doesn't stop other threads from directly manipulating untrusted memory (possibly in a loop), but a loop will likely call something similar to sched_yield (implemented as an ocall to the host) and then be stopped by the check in do_ocall.

Consider exposing an SDK call in sdk/trts/trts.cpp (declared in common/inc/sgx_trts.h) for defensive programming,

int sgx_is_enclave_crashed() {
  return get_enclave_state() == ENCLAVE_CRASHED;
}

@yuyuany
Copy link
Contributor

yuyuany commented Aug 16, 2017

That's a good suggestion. To leave the assembly code simple, how about to move the check to the C code in sdk/trts/trts_ocall.cpp, and return an error code if the enclave state is ENCLAVE_CRASHED.

 sgx_status_t sgx_ocall(const unsigned int index, void *ms)
 {
+    // check for crash to prevent any data leakage
+    if(get_enclave_state() == ENCLAVE_CRASHED) {
+        return SGX_ERROR_ENCLAVE_CRASHED;
+    }
     // sgx_ocall is not allowed during exception handling
     thread_data_t *thread_data = get_thread_data();

@deeglaze
Copy link
Contributor Author

By doing that, you introduce a user obligation to check for a crashed error for every ocall. I guess that's okay for this API level. For our use case we tend to expect ocalls work as expected and just want an ocall when crashed to wait to be torn down. We can push that to our use cases though.

@deeglaze deeglaze reopened this Aug 16, 2017
@yuyuany
Copy link
Contributor

yuyuany commented Aug 22, 2017

We have more discussion on this PR. If we want to block OCALL when enclave is in crashed state, we can do:

  1. Return error code to user. But user may not check the error, and does not know what to do.
  2. Deadloop in the thread. But CPU is always occupied, and even block the enclave removal.
  3. Call abort to terminate the thread. It raises another exception signal to user from the bug code thread. It makes user hard to debug/handle the exception.

None of the soultions is good. We'd better leave the code unchanged. When enclave is crashed, we do not block an OCALL, but we block the OCALL return. Once the thread exits the enclave, it cannot get back. OCALL is designed for untrusted function calls, it's not expected to expose any secret.

@deeglaze
Copy link
Contributor Author

I have another proposal: do an ocall to a fixed untrusted function that takes 0 arguments, like double_fault() or crashed_exit() or bikeshed().

1 or 2 are still preferable to inadvertent data leakage. If an enclave is in an unexpected state, the arguments to an ocall can be compromised from a corrupted pointer. Abort is "batten down the hatches" and let nothing in or out.

For 1, the user has to check every ocall for crash or risk strange behavior and likely a #GP or #PF. This is similar to 3, but the additional error is at least preventable. If no exception happens, you're still using CPU without pause in the enclave until something like a return from an ecall. This is an EEXIT, which we could issue on a crashed state ocall (the above proposal).

For 2, even with a pause instruction, we don't expect a preempting kernel to be able to stop the enclave's execution? Or is this SDK targeting non-preempting kernels too?

@yuyuany
Copy link
Contributor

yuyuany commented Aug 31, 2017

Rather than call a fixed untrusted function, why don't we exit the enclave directly? If the enclave is crashed, ocall call a fixed trusted function that make the ecall return with error code SGX_ERROR_ENCLAVE_CRASHED. In that case, the thread can exit the enclave as soon as it detects the crash state, and no data leak.

@yuyuany
Copy link
Contributor

yuyuany commented Sep 1, 2017

See PR #151 for the new solution.

@deeglaze
Copy link
Contributor Author

deeglaze commented Sep 1, 2017

Agreed that PR #151 is a better solution.

@deeglaze deeglaze closed this Sep 1, 2017
@yuyuany
Copy link
Contributor

yuyuany commented Sep 7, 2017

There's an issue in PR #151. When the tRTS checks for a crash in the OCALL, the parameters have already been put on the outside stack, by the bridge function generated by edger8r tool. So if we really want to prevent copying the OCALL parameters, we need to insert the check in the bridge function.

I still think we'd better leave the code unchanged. User should be aware that OCALL is an untrusted function, and no security can be passed though OCALL parameters. If the enclave is hacked, it's much easier to write the sensitive data outside enclave than copy it to the OCALL parameters. So the likelihood of leaking secret information through OCALL parameters is small.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants