TLS: further performance improvements and cleanups #1064
Comments
firstly in DEFINE_TLS_TEST()->kernel_fpu_begin() and secondly in ttls_ecp_group_free()->ttls_bzero_safe()->kernel_fpu_begin(). The fix moves all the TLS unit tests from tls/ to test_tls.c and makes each test responsible for calling kernel_fpu_{begin,end}(). The crypto routines can be split into 2 groups: those called from the process context of Tempesta FW initialization and those called at run time in softirq context. Only the second group must be called with a saved FPU context. In fact, the current crypto routines (covered by the test) don't use SIMD much, and this is going to change in #1064.
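The double entry described above can be modeled in user space. The following is only a sketch under stated assumptions: `mock_fpu_begin()`, `mock_fpu_end()`, `mock_group_free()` and `mock_runtime_crypto()` are hypothetical stand-ins that count nesting, not the kernel's `kernel_fpu_begin()`/`kernel_fpu_end()` or Tempesta FW code, and the two test layouts are a simplified reading of the change.

```c
#include <assert.h>

/*
 * Hypothetical userspace stand-ins for kernel_fpu_begin()/kernel_fpu_end().
 * They only count nesting; the real kernel functions save and restore
 * FPU state.
 */
static int fpu_depth;      /* currently open FPU sections */
static int nested_begins;  /* begin calls made inside an open section */

static void mock_fpu_begin(void)
{
	if (fpu_depth > 0)
		nested_begins++;
	fpu_depth++;
}

static void mock_fpu_end(void)
{
	assert(fpu_depth > 0);
	fpu_depth--;
}

/* Stand-in for a cleanup routine that opens its own FPU section. */
static void mock_group_free(void)
{
	mock_fpu_begin();
	/* ... zero key material ... */
	mock_fpu_end();
}

/* Stand-in for a run-time routine that expects the caller to own the FPU. */
static void mock_runtime_crypto(void)
{
	assert(fpu_depth > 0);
}

/* Old layout: the test wrapper brackets the whole body, cleanup included. */
static int run_wrapped_test(void)
{
	nested_begins = 0;
	mock_fpu_begin();
	mock_runtime_crypto();
	mock_group_free();      /* second begin inside the first one */
	mock_fpu_end();
	return nested_begins;
}

/* New layout: the test itself brackets only the run-time part. */
static int run_self_managed_test(void)
{
	nested_begins = 0;
	mock_fpu_begin();
	mock_runtime_crypto();
	mock_fpu_end();
	mock_group_free();      /* opens its own section, no nesting */
	return nested_begins;
}
```

In this model `run_wrapped_test()` reports one nested begin, while `run_self_managed_test()` reports none, which is the property the per-test begin/end calls are meant to provide.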
Current profile under
with configuration
And at least 15 times better handshake performance is required.
#1064: small MPI cleanups and improvements
With 8177b43
|
Current perf profile with FIPS algorithm for modulo reduction implemented in assembly:
|
#1064: TLS performance improvements
With #1405 we outperform Nginx/OpenSSL by about 50%. Tested against a 1CPU KVM virtual machine with the https://github.com/tempesta-tech/tls-perf benchmark run from the host as
Nginx/OpenSSL:
Tempesta FW:
|
Profiled web cache performance through TLS against 19KB data (index.html of tempesta-tech.com):
Pure proxying of a 2-byte file doesn't expose any copying issues (note that the backend Apache HTTPD was also running on the same VM as Tempesta FW):
|
Still in progress; current benchmarks against Nginx with OpenSSL and WolfSSL (1CPU VM, Nginx-1.14.2/OpenSSL-1.1.1d
Nginx-1.17.8/WolfSSL
Tempesta TLS
|
Changes for #614 have grown significantly, so the following tasks are moved from the #614 scope:
tls/test_tls_cert.py if necessary).
dummy_headers and replace GCC SIMD intrinsics with assembly code
TTLS_PK_PARSE_EC_EXTENDED, SECG SEC 1.
memcpy() calls from skb_copy_*() calls by fast_memcpy(). Probably other standard functions using memcpy(), memset() or memcmp() can be accelerated.
Reuse Karatsuba precomputations for AES in the same TLS connection, TLS performance characterization on modern x86 CPUs. Moved to Crypto extensions and performance #1335.
memset()s aren't optimized out
Not part of the issue, but the Linux crypto API also has performance issues, so our effort to get zero-copy TLS starves on the underlying API. E.g. gcmaes_encrypt() in most cases goes through the kmalloc() path with 2(!) data copies. See the comment #1064 (comment): by fixing the Linux crypto API issue we can improve the performance of large data transfers.
Testing
The new crypto routines must be unit tested (see mbedtls/crypto/tests/suites/).
Functional test for simultaneously accepting HTTP and HTTPS connections with and w/o HSTS redirections (Custom HTTP redirects #856); at least Strict-Transport-Security must work for now. (Tested manually and got vhost crash, moved the task to Crash on bad vhost configuration #1439).
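The task item about memset()s not being optimized out refers to a general C hazard: a compiler may drop a memset() over a buffer that is never read again, leaving key material in memory. A minimal sketch of one common countermeasure follows, assuming GCC or Clang. `zero_safe()` and `is_all_zero()` are hypothetical names for illustration, not Tempesta FW's ttls_bzero_safe(); the Linux kernel's memzero_explicit() is built on the same idea of a memset() followed by a compiler barrier.

```c
#include <stddef.h>
#include <string.h>

/*
 * zero_safe() is a hypothetical helper, not Tempesta FW code.
 * The empty asm statement (a GCC/Clang extension) tells the compiler
 * that memory reachable through buf may be read afterwards, so the
 * preceding memset() cannot be removed as a dead store.
 */
static void zero_safe(void *buf, size_t len)
{
	memset(buf, 0, len);
	__asm__ __volatile__("" : : "r"(buf) : "memory");
}

/* Returns 1 if every byte of buf is zero, 0 otherwise. */
static int is_all_zero(const unsigned char *buf, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		if (buf[i] != 0)
			return 0;
	return 1;
}
```

A unit test can only check the visible effect (the buffer is zero after the call); whether the store survives optimization has to be confirmed by inspecting the generated assembly at the optimization level used for the build.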