Fix performance issue introduced by vendor code #288
Maybe I need to export a C define from |
LGTM, but I don't know if changing the underlying implementation has any nontrivial repercussions...
I haven't looked at this in full detail yet, but one thing I'm not thrilled about is directly conditioning the code on This is a nice thing to do because at some point in the future there might be some other non-SGX platform that wants to use these alternatives for memory access. And it would be weird if they had to define SGX to get the behavior they want, since that platform wouldn't have anything to do with SGX. So I think you want to conditionalize the code on a macro named something like
Minor C language thing: identifiers starting with double underscores are reserved for use by the platform. So things like the compiler or the system headers can define identifiers that start with double underscores. Since this is something that we're defining, it shouldn't use double underscores. The standard library also uses some identifiers with single underscores, so those should be avoided as well. |
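To make the naming point concrete, here is a minimal sketch. The macro name EXAMPLE_BYTEWISE_UNALIGNED_ACCESS and the function example_load_u32 are hypothetical, chosen only for illustration; they are not the names proposed in this review or used by mbedtls. The macro describes the behaviour it selects rather than a platform, and it has no leading underscore.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical feature macro: named after the behaviour, not the platform,
 * and without leading underscores. A build that wants the byte-wise path
 * would pass -DEXAMPLE_BYTEWISE_UNALIGNED_ACCESS to the compiler. */
static inline uint32_t example_load_u32(const unsigned char *p)
{
    uint32_t v;
#if defined(EXAMPLE_BYTEWISE_UNALIGNED_ACCESS)
    /* Copy one byte at a time; no library call is involved. */
    unsigned char *d = (unsigned char *) &v;
    for (size_t i = 0; i < sizeof(v); i++) {
        d[i] = p[i];
    }
#else
    /* Default path: rely on the compiler to inline this fixed-size memcpy. */
    memcpy(&v, p, sizeof(v));
#endif
    return v;
}
```

Any platform that needs the alternative path can then opt in by defining the behaviour macro, without pretending to be SGX.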
Hi @arai-fortanix, one possible solution here is to also add similar checks on About conditional compilation, I have updated the code to always keep these |
This PR doesn't make sense to me. The code is clearly intended to get optimized so that there are no actual calls to memcpy. If our generated code still has calls to memcpy, we need to fix how the code is compiled. Having compiled both Linux and SGX in release mode:
So it's clear that the compiler fails to properly optimize out the call to memcpy. The fix should be in the compilation, not the code. |
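For reference, the idiom under discussion looks roughly like the sketch below. The function names are hypothetical and this is not the vendor code itself. With GCC or Clang at -O2, a fixed-size memcpy like this is normally lowered to a plain load or store; when the builtin is disabled (for example with -fno-builtin), the compiler can instead emit a real call to memcpy.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a native-endian 32-bit value from a
 * possibly unaligned address via a fixed-size memcpy. */
uint32_t example_read_u32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}

/* Hypothetical helper: write a native-endian 32-bit value to a
 * possibly unaligned address via a fixed-size memcpy. */
void example_write_u32(void *p, uint32_t v)
{
    memcpy(p, &v, sizeof(v));
}
```

One way to check what a given build does is to compare the assembly from `cc -O2 -S` with the assembly from `cc -O2 -fno-builtin -S` and look for a remaining call to memcpy.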
So the reason why Then with If assuming
The second solution is easy to do but requires changing vendor code. |
We can just add this as a separate include file that we pass to CMake, right? Or, similarly, put it in the config.h. |
Ideally we don't disable builtins that we provide in strcpy. But there's no way to instruct the compiler, "ignore all builtins, except these specific ones"? |
Hi @jethrogb,
As far as I know after some searching on the internet, the only way is to add something like And as you can see in this PR, which I just updated, because of the limitations of C macros, this kind of C macro needs to:
|
We definitely need to disable some builtins, since we don't have a complete standard C library when we're compiling for SGX, and some of the builtins assume C library specifics. I think there's a way around the macro expansion issue, but I'll need to look into some specifics before I can offer a solution to that. |
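One possible shape for keeping a single builtin while the rest stay disabled, sketched under assumptions: GCC and Clang keep the `__builtin_`-prefixed forms available even when -fno-builtin is passed, so code can call `__builtin_memcpy` explicitly. The macro and function names below are hypothetical, and this is not necessarily the approach taken in this PR.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical wrapper macro. On GCC/Clang, call the prefixed builtin
 * directly so it can still be inlined under -fno-builtin; elsewhere,
 * fall back to the ordinary library function. */
#if defined(__GNUC__) || defined(__clang__)
#define EXAMPLE_MEMCPY(dst, src, n) __builtin_memcpy((dst), (src), (n))
#else
#define EXAMPLE_MEMCPY(dst, src, n) memcpy((dst), (src), (n))
#endif

/* Hypothetical helper: unaligned 64-bit read through the wrapper. */
static inline uint64_t example_read_u64(const void *p)
{
    uint64_t v;
    EXAMPLE_MEMCPY(&v, p, sizeof(v));
    return v;
}
```

The wrapper uses its own name instead of redefining memcpy, which avoids touching other uses of the standard function.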
Hi @arai-fortanix , @jethrogb |
GitHub is broken right now, https://www.githubstatus.com/ |
I'm not thrilled by forcing an inclusion of string.h. But that makes me wonder, which string.h are we including, and are we including the appropriate support functions for that string.h in the build environment? The reason I'm asking is that we had a bug in zircon where we were compiling code using glibc's ctype.h header, but we were linking in the MUSL code for ctype. This led to bugs where the glibc code in the ctype header was calling a MUSL function with the same name, but different semantics, which led to code not operating correctly. I want to make sure we don't have a similar problem here. |
Just fixed the CI. |
Me too. But this is the only way we could avoid changing vendor code. I am wondering whether it is possible to have a CMakeLists.txt on top of |
Performance issue about unaligned memory access
In mbedtls 3.4.0, upstream refactored the code for accessing unaligned addresses from bit calculations to pointer + memcpy + clang's byte swap.
This is fine in a non-SGX environment, but becomes very slow in the Fortanix SGX environment.
I think the most likely cause is the call to memcpy.
This PR mainly replaces all these new functions with the old implementation in https://github.com/Mbed-TLS/mbedtls/blob/v3.3.0/library/common.h
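For readers unfamiliar with the two styles, the "bit calculation" approach can be sketched as follows. The function names are hypothetical and the code is written in the spirit of the older shift-based accessors, not copied from common.h. Each byte is read or written individually, so alignment never matters and no memcpy call can appear.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a big-endian 32-bit value byte by byte. */
static inline uint32_t example_get_u32_be(const unsigned char *data, size_t offset)
{
    return ((uint32_t) data[offset]     << 24)
         | ((uint32_t) data[offset + 1] << 16)
         | ((uint32_t) data[offset + 2] <<  8)
         | ((uint32_t) data[offset + 3]);
}

/* Hypothetical helper: write a big-endian 32-bit value byte by byte. */
static inline void example_put_u32_be(uint32_t n, unsigned char *data, size_t offset)
{
    data[offset]     = (unsigned char) (n >> 24);
    data[offset + 1] = (unsigned char) (n >> 16);
    data[offset + 2] = (unsigned char) (n >>  8);
    data[offset + 3] = (unsigned char) (n);
}
```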
Performance test results:
Command used to run bench:
cargo +stable bench --no-default-features --features dsa,force_aesni_support,mpi_force_c_code,rdrand,std,time --target=x86_64-fortanix-unknown-sgx
After this PR, the performance numbers are back to the same level as yx/mbedtls-9_bench.
Performance issue about explicit_bzero (debug mode only)
When providing the explicit_bzero function through Rust, there is a big performance degradation in SGX.
For example, when running the pbkdf test:
Command:
cargo +stable nextest run --test pbkdf --no-default-features --features dsa,force_aesni_support,mpi_force_c_code,rdrand,std,time --target=x86_64-fortanix-unknown-sgx
After ensuring that the C mbedtls side does not call explicit_bzero, the time reduces to
This performance issue should not be related to usage of the Zeroize crate, because the test results above are reproduced by the following minimal implementation of explicit_bzero: