fuzz: H1 capture fuzz test performance improvements. #10281
Conversation
The main contribution in this patch is a "persistent" mode for h1_capture_[direct_response_]fuzz_test. Based on profiling observations, we were spending 30-40% of the time rebuilding the Envoy server on each run. This is avoided by having fuzzer variants that make the integration test proxy static.

There is a downside to this approach: different fuzz invocations may interfere with each other. Ideally we would snapshot/fork for each fuzz invocation, but Envoy doesn't like forking once events/dispatchers are up. So, for now we have two builds of the fuzzer, trading fuzz engine efficacy for fuzz target performance. Some form of VM snapshotting would be ideal.

The persistent mode takes the H1 replay tests from O(1 exec/s) to O(10 exec/s). This is still not great. Some perf analysis suggests we're spending the bulk of the time in ASAN. Running the fuzzers without ASAN gives O(100 exec/s), which seems reasonable for an LPM-style integration test. Why ASAN is so expensive is future work; ASAN advertises itself as generally a 2x slowdown. There is also some secondary effect from the cost of mocks used in the integration test TCP client (mock watermark buffer); this speaks to our general mocking performance problem in fuzzing.

In addition to the above, this patch has an optimization for the direct response fuzzer (don't initiate upstream connections) and a --config=plain-fuzz mode for performance work without confounding ASAN.

Risk level: Low
Testing: Manual bazel runs of the fuzzers, observing exec/s.

Signed-off-by: Harvey Tuch <htuch@google.com>
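For illustration, a minimal sketch of what a persistent-mode LibFuzzer target can look like. All names here (`ProxyFixture`, `replay`) are stand-ins, not Envoy's actual test classes:

```cpp
// Sketch of a persistent-mode fuzz target: the expensive proxy/server
// bring-up happens once and is reused across fuzz invocations instead of
// being rebuilt per run. Names are illustrative only.
#include <cstddef>
#include <cstdint>

struct ProxyFixture {
  ProxyFixture() {
    // Expensive one-time setup (in the real test: starting the Envoy
    // server and integration test proxy).
  }
  void replay(const uint8_t* data, size_t size) {
    // Feed the captured H1 trace through the persistent proxy.
    (void)data;
    (void)size;
  }
};

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
  // Constructed lazily on first use and intentionally never destroyed, so
  // invocations share state -- the efficacy/performance trade-off the
  // description notes.
  static ProxyFixture* fixture = new ProxyFixture();
  fixture->replay(data, size);
  return 0;
}
```

The `--config=plain-fuzz` mode mentioned above complements this at the build level: it drops ASAN so fuzz-target cost can be measured without sanitizer overhead confounding the numbers.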
Neat. LGTM but will defer to @asraa who has more knowledge about this code.
Thank you! I'm curious how this will play out.
Just FYI, OSS-Fuzz tries to "cross-pollinate" corpora from different fuzz targets every day. Hoping to get the existing corpus for h1 cross-pollinated into this one ASAP. I'm not sure if it tries corpus cross-pollination for all combinations of fuzz targets every day, but eventually it will (?)
@@ -31,6 +36,9 @@ void H1FuzzIntegrationTest::replay(const test::integration::CaptureFuzzTestCase&
    // TODO(htuch): Should we wait for some data?
    break;
  case test::integration::Event::kUpstreamSendBytes:
    if (ignore_response) {
nice.
Force merging as coverage failures are unrelated.
This test seems very flaky, particularly on ARM. I noticed this block:
which means sometimes we are statically defining an object-by-value, which is generally evil. But specifically it means that it gets destructed after main(). In ARM CI I noticed all the test methods would pass, and then the binary would crash after all the tests were complete. Should we make this a lazy-init pointer rather than a static? These test flakes are a real drag.
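For concreteness, a sketch of the two patterns being contrasted. `FuzzFixture` is a stand-in for the real persistent test object, not the actual class:

```cpp
// Stand-in for the real fixture; the constructor/destructor represent the
// expensive Envoy setup and the mock cleanup, respectively.
struct FuzzFixture {
  FuzzFixture() { /* expensive setup */ }
  ~FuzzFixture() { /* mock cleanup -- must not run post-main() */ }
};

// Problematic: a static object-by-value. Its destructor runs after main()
// returns, in an order relative to other statics the program does not
// control, which can crash the binary after every test has passed.
static FuzzFixture by_value_fixture;

// Lazy-init pointer alternative: construction still happens once, on first
// use, but there is no post-main() destructor to race with other statics.
// The object is leaked unless a cleanup hook deletes it explicitly.
FuzzFixture& lazyFixture() {
  static FuzzFixture* fixture = new FuzzFixture();
  return *fixture;
}
```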
Oh I see. I thought I had a fix for this, but tests fail if they don't clean up mocks.
#29522 should resolve this.
…han static pods (#29522)
#10281 introduced persistent fuzz state to avoid rebuilding complex test infrastructure between test methods and to improve fuzzing performance. However, it introduced a static-init fiasco by statically initializing a non-POD, which has non-deterministic destruction order relative to other static-inits. This causes flaky tests, particularly on ARM. This PR adds a new mechanism to the fuzz-runner infrastructure that allows cleanup hooks to be established, which will be run after all test methods but before main() returns, in a deterministic, platform-independent order.
Additional Description: n/a
Risk Level: low -- test only
Testing: //test/integration/...
Signed-off-by: Joshua Marantz <jmarantz@google.com>
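The mechanism described reads like a registry of deterministic teardown callbacks. A minimal sketch under that assumption; the class name and API here are hypothetical, not the actual fuzz-runner interface:

```cpp
#include <functional>
#include <utility>
#include <vector>

// Sketch: hooks registered during fuzzing are drained explicitly before
// main() returns, rather than during post-main() static destruction.
class CleanupHooks {
public:
  static void add(std::function<void()> hook) {
    hooks().push_back(std::move(hook));
  }

  // Called at the end of the fuzzer's main(), after all test methods have
  // run, so teardown happens in a deterministic, platform-independent
  // (LIFO) order.
  static void runAll() {
    for (auto it = hooks().rbegin(); it != hooks().rend(); ++it) {
      (*it)();
    }
    hooks().clear();
  }

private:
  static std::vector<std::function<void()>>& hooks() {
    static auto* v = new std::vector<std::function<void()>>();
    return *v;
  }
};
```

A persistent fixture could then register its own deletion on first construction, e.g. `CleanupHooks::add([p] { delete p; });`, so mocks are verified and torn down before main() returns instead of after it.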