fuzz: H1 capture fuzz test performance improvements. #10281

Merged 1 commit into envoyproxy:master on Mar 8, 2020

Conversation

@htuch (Member) commented Mar 6, 2020

The main contribution in this patch is a "persistent" mode for
h1_capture_[direct_response_]fuzz_test. Based on profiling observations, we were spending 30-40% of
the time rebuilding the Envoy server on each run. This is avoided by having fuzzer variants that
make the integration test proxy static.

There is a downside to this approach: different fuzz invocations may interfere with each other.
Ideally we would snapshot/fork for each fuzz invocation, but Envoy doesn't like forking once
events/dispatchers are up. So, for now we have two builds of the fuzzer, trading fuzz engine
efficacy for fuzz target performance. Some form of VM snapshotting would be ideal.
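As a minimal sketch of the persistent entry point (the test class constructor arguments and the replay signature are simplified here for illustration):

#ifdef PERSISTENT_FUZZER
#define PERSISTENT_FUZZ_VAR static
#else
#define PERSISTENT_FUZZ_VAR
#endif

DEFINE_PROTO_FUZZER(const test::integration::CaptureFuzzTestCase& input) {
  // In the persistent build, the integration test (and the Envoy server it
  // brings up) is constructed once and reused across fuzz invocations; in
  // the default build it is rebuilt on every run.
  PERSISTENT_FUZZ_VAR H1FuzzIntegrationTest h1_fuzz_integration_test;
  h1_fuzz_integration_test.replay(input);
}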

The persistent mode takes the H1 replay tests from O(1 exec/s) to O(10 exec/s). This is still not
great. Some perf analysis suggests we're spending the bulk of the time in ASAN: running the
fuzzers without ASAN gives O(100 exec/s), which seems reasonable for an LPM-style integration test.
Why ASAN is so expensive here is future work; ASAN advertises itself as generally a 2x slowdown.
There is also a secondary effect from the cost of mocks used in the integration test TCP client
(mock watermark buffer), which speaks to our general mocking performance problem in fuzzing.

In addition to the above, this patch has an optimization for the direct response fuzzer (don't
initiate upstream connections) and a --config=plain-fuzz mode for performance work without
confounding ASAN.

Risk level: Low
Testing: Manual bazel runs of the fuzzers, observing exec/s.
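For example, a plain-fuzz run might look like the following (the exact target label is illustrative):

bazel run --config=plain-fuzz //test/integration:h1_capture_fuzz_test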

Signed-off-by: Harvey Tuch <htuch@google.com>

@mattklein123 (Member) left a comment

Neat. LGTM but will defer to @asraa who has more knowledge about this code.

@asraa (Contributor) left a comment

Thank you! I'm curious how this will play out.

Just FYI, OSS-Fuzz tries to "cross-pollinate" corpora from different fuzz targets every day. Hoping to get the existing corpus for h1 cross-pollinated into this one ASAP. I'm not sure if it tries corpus cross-pollination for all combinations of fuzz targets every day, but eventually it will (?)

@@ -31,6 +36,9 @@ void H1FuzzIntegrationTest::replay(const test::integration::CaptureFuzzTestCase&
// TODO(htuch): Should we wait for some data?
break;
case test::integration::Event::kUpstreamSendBytes:
if (ignore_response) {
nice.
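(For context, an abbreviated sketch of the replay handling this refers to; names are extrapolated from the hunk above and surrounding details are assumed. In direct-response mode the upstream event is a no-op, so no upstream connection is ever initiated:)

void H1FuzzIntegrationTest::replay(const test::integration::CaptureFuzzTestCase& input,
                                   bool ignore_response) {
  for (const auto& event : input.events()) {
    switch (event.event_selector_case()) {
    case test::integration::Event::kUpstreamSendBytes:
      if (ignore_response) {
        // Direct-response test cases never reach an upstream; skip the
        // interaction entirely instead of paying for a connection.
        break;
      }
      // ... otherwise wait for the fake upstream connection and write bytes ...
      break;
    // ... other events: downstream send/close, upstream close, etc. ...
    default:
      break;
    }
  }
}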

@htuch (Member, Author) commented Mar 8, 2020

Force merging as coverage failures are unrelated.

@htuch merged commit 226a603 into envoyproxy:master on Mar 8, 2020
@htuch deleted the persistent-fuzz branch on March 8, 2020 at 20:35
@jmarantz (Contributor) commented Sep 8, 2023

This test seems very flaky, particularly on ARM. I noticed this block:

#ifdef PERSISTENT_FUZZER
#define PERSISTENT_FUZZ_VAR static
#else
#define PERSISTENT_FUZZ_VAR
#endif

which means sometimes we are statically defining an object by value, which is generally evil. But specifically it means that it gets destructed after main(). In ARM CI I noticed all the test methods would pass, and then the binary would crash after all the tests were complete.

Should we make this a lazy-init pointer rather than a static? These test-flakes are a real drag.

@yanavlasov @krajshiva FYI
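(A sketch of the lazy-init alternative being suggested; the accessor below is hypothetical. The object is heap-allocated on first use and deliberately never deleted, so only a trivially destructible pointer is static and nothing runs after main():)

#ifdef PERSISTENT_FUZZER
H1FuzzIntegrationTest& persistentTest() {
  // Leaky singleton: constructed on first call, intentionally never destroyed,
  // so no destructor fires during post-main() static teardown.
  static H1FuzzIntegrationTest* test = new H1FuzzIntegrationTest();
  return *test;
}
#endif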

@jmarantz commented Sep 8, 2023

Oh I see. I thought I had a fix for this, but tests fail if they don't clean up mocks.

@jmarantz commented Sep 8, 2023

#29522 should resolve this.

htuch pushed a commit that referenced this pull request Sep 11, 2023
…han static pods (#29522)

#10281 introduced persistent fuzz state to avoid rebuilding complex test infrastructure between test methods and to improve fuzzing performance. However, it introduced a static-init fiasco by statically initializing a non-POD, which has non-deterministic destruction order relative to other static-inits.

This causes flaky tests, particularly on ARM.

This PR adds a new mechanism to the fuzz-runner infrastructure that allows cleanup hooks to be established; they run after all test methods but before main() returns, in a deterministic, platform-independent order.

Additional Description: n/a
Risk Level: low -- test only
Testing: //test/integration/...

Signed-off-by: Joshua Marantz <jmarantz@google.com>
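(A minimal sketch of what such a cleanup-hook mechanism could look like; the names below are illustrative rather than Envoy's actual fuzz-runner API:)

#include <functional>
#include <vector>

namespace fuzz_runner {

std::vector<std::function<void()>>& cleanupHooks() {
  // Leaky singleton so the hook list itself never destructs after main().
  static auto* hooks = new std::vector<std::function<void()>>();
  return *hooks;
}

// Targets register teardown work (e.g. destroying a persistent test object
// so its mocks are verified and cleaned up).
void addCleanupHook(std::function<void()> hook) {
  cleanupHooks().push_back(std::move(hook));
}

// The runner calls this once, after the last fuzz invocation but before
// main() returns, giving a deterministic, platform-independent teardown order.
void runCleanupHooks() {
  for (auto& hook : cleanupHooks()) {
    hook();
  }
  cleanupHooks().clear();
}

} // namespace fuzz_runner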