-
Notifications
You must be signed in to change notification settings - Fork 716
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace memcmp to s2n_constant_time_equals #4709
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's also update grep_simple_mistakes as part of this PR.
12f4ff8
to
b95a1fd
Compare
* Let s2n_stuffer_read_expected_str use memcmp to avoid CBMC problem.
b95a1fd
to
359e689
Compare
* Adding one more known memcmp location to grep_simple_mistakes.sh.
* Replace S2N_ERROR_IF to POSIX_ENSURE.
* Reorder the .sh order to fit the directory order.
What's the performance impact of this change? libc's memcmp and memcpy are highly optimized to do word-wise comparisons, pipeline prefetching, etc. |
Remove the usage of s2n_constant_time_equals from any functions where it is possible to compare relatively large amounts of data (~ > 1 kB) even if that scenario is unlikely.
Hi Colm! For context, the goal of this PR is to reduce the number of places that we are using memcmp. This helps reduce the risk of sidechannels by removing data-dependent comparisons. In-depth benchmarks were added to the PR description. For the specific s2n-tls configuration that we use for handshake benchmarks:
This is in line with expectations, because most of the places that memcmp was replaced were comparing small, statically sized amounts of data. The places that aren’t comparing statically sized data are comparing small, bounded amounts of data. For these bounded comparisons, s2n_constant_time_equals adds ~130 ns. For more information see s2n_constant_time_equals vs memcmp Benchmarks in the PR description. non static, bounded data
Lines 335 to 340 in 87f4a05
s2n-tls/tls/s2n_server_hello.c Line 139 in fcc3184
Note: This PR was originally changing some “generally small” cases, but there was the possibility of pathological inputs, like a client sending 16 kB of application protocols. #4717 showed that this would have cost ~3 µs, so these cases were reverted. |
Resolved issues:
Solving issue #3062
Description of changes:
Change most of memcmp usages to s2n_constant_time_equals.
There are two parts that weren't change, which are in s2n_cipher_suits.c (line 1110) and s2n_config.c (line 323). Both of those parts use the return integer from memcmp which s2n_constant_time_equals can't do.
Regression Tests Results
We recently added new regression tests that lets us know the instruction count differences over some common codepaths.
This shows us that there is about a 1,707 instruction count penalty for shifting to s2n_constant_time_equals. The main differences are the additional 2,190 instruction from s2n_constant_time_equals and the 384 instruction count reduction from memcmp. Relative to the 79,292,794 instructions for a complete RSA handshake, this is a 0.00215 % regression in handshake performance.
Note: The hashbrown instruction count differences are due to randomness between runs.
Handshake Benchmarks
We also have benchmarks covering handshake and throughput performance. Comparing these benchmarks between this PR and mainline we see the following differences.
As expected, we do not see any significant changes. While there are some slight differences reported, these are below the accuracy threshold of the benchmarks.
s2n_constant_time_equals vs memcmp Benchmarks
We ran criterion benchmarks comparing the performance of
memcmp
againsts2n_constant_time_equals
.These cases give numbers for the common case of comparing small pieces of data.
This is a representation of a pathological case possible in places like s2n_protocol_preferences.c. Generally there would be only a small number of protocols, but it is possible for a client to send up to 16 kB of preferences, where each protocol has a maximum size of 255 bytes. This case gives us an understanding of the performance impact in this worst case scenario.
We see that while s2n_constant_time_equals is significantly slower than memcmp, it remains a very fast function, and its cost is small relative to the cost of an entire handshake (120ns / 1_000_000ns => .0000012% ).
For pathological cases (comparing an entire 16kb extension) the s2n_const_time_equals would represent a noticeable portion of the handshake (30us / 1_000us => 3%). Therefore we avoid the usage of s2n_constant_time_equals in scenarios where largeish data (> 1kB) would be compared.
Call-outs:
Needs to make a note to those two parts that aren't changed.
Testing:
This change is a refactor change. The code ran and passed all test cases via cmake.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.