Filter Duplicate Input Execution #2771
* Rules * more * aa
* fixing empty multipart name * fixing clippy * improve flexibility of DumpToDiskStage * adding note to MIGRATION.md
Updates the requirements on [bindgen](https://github.com/rust-lang/rust-bindgen) to permit the latest version. - [Release notes](https://github.com/rust-lang/rust-bindgen/releases) - [Changelog](https://github.com/rust-lang/rust-bindgen/blob/main/CHANGELOG.md) - [Commits](rust-lang/rust-bindgen@v0.70.1...v0.71.1) --- updated-dependencies: - dependency-name: bindgen dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* no from stage * fixer * doc fix * how was this working???? * more fixes * delete more * rq * cargo-fuzz * m * aa
* go * fixing stuf * hello from windows * more * lolg * lolf * fix * a --------- Co-authored-by: Your Name <you@example.com>
* Maybe fix CI * does this help? * Very dirty 'fix'
* fixing empty multipart name * fixing clippy * New rules for the contributing (AFLplusplus#2752) * Rules * more * aa * Improve Flexibility of DumpToDiskStage (AFLplusplus#2753) * fixing empty multipart name * fixing clippy * improve flexibility of DumpToDiskStage * adding note to MIGRATION.md * Introduce WrappingMutator * introducing mutators for int types * fixing no_std * random fixes * Add hash derivation for WrappingInput * Revert fixes that broke things * Derive Default on WrappingInput * Add unit tests * Fixes according to code review * introduce mappable ValueInputs * remove unnecessary comments * Elide more lifetimes * remove dead code * simplify hashing * improve docs * improve randomization * rename method to align with standard library * add typedefs for int types for ValueMutRefInput * rename test * add safety notice to trait function * improve randomize performance for i128/u128 * rename macro * improve comment * actually check return values in test * make 128 bit int randomize even more efficient * shifting signed values --------- Co-authored-by: Dongjia "toka" Zhang <tokazerkje@outlook.com> Co-authored-by: Dominik Maier <domenukk@gmail.com>
Or even
As I stated in the discussion thread, I think a method for rejecting inputs that were already tried would be more useful (but I don't know your use case, so..)
I'm targeting the TCP/IP stack of an OS, so each execution takes on the order of 1 s, although most of that is spent in wait states (hence previous work like overcommit). Even so, the added runtime of this would be nothing compared to the execution, so this felt like an easy win.
Something like this would definitely further improve the situation. Do you suggest creating a wrapping executor that returns either Tracing this back it seems most appropriate in the stage? But that seems not that generic. So maybe in I'm also not sure if there's an opportunity here to combine this somehow with
I think it could simply wrap an executor, yeah. And have an extra observation that's "skipped": if it's true, the testcase isn't interesting. Should be easy enough to do. We can still merge this PR as well, but the feedback should be renamed IMHO.
How about something like this?
I'll do some performance comparisons later today. Initial runs suggest that adding even a 10µs sleep to the harness reduces the performance penalty to <5%. I might also see how many duplicate inputs actually appear. But for now I feel like for slow targets this very well might be worth using.
Alright, some performance tests. Running against the
All these numbers obviously depend on the exact fuzzers:
Overall, I feel like this may be worth having in the library. Btw: There is no easy way of adding metadata to the state such that it is printed by monitors, right? Otherwise, calculating the number/rate of duplicates may be an interesting addition.
There is an easy way, using
Alright, I've implemented some of that. I have changed the following stages to use the new
Notable unchanged stages (please check them yourself, I know only very little about the many different stages):
/// This is achieved by hashing each input and using a bloom filter to differentiate inputs.
///
/// Use this implementation if hashing each input is very fast compared to executing potential duplicate inputs.
pub fn with_bloom_input_filter(
I would add a with_input_filter method and then you just provide a BloomInputFilter there (or we keep this one as extra constructor)
I've added a generic version and changed both new and with_bloom_input_filter to use this one.
Now it looks very good! Just left some minor nitpicks. Thank you and merry christmas!
merry christmas 🎅
}

#[cfg(feature = "std")]
impl<I: Hash> InputFilter<I> for BloomInputFilter {
Is this I: Hash necessary? If not, can you delete it?
(always keep the constraints minimal)
Yes, the bloom filter checks presence based on the hash value of the input.
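To illustrate why the `Hash` bound is needed, here is a minimal std-only sketch of a bloom-filter-based duplicate check. It is not the code from this PR: `SketchBloomFilter`, its fields, the double-hashing scheme, and the `should_execute` method name are all assumptions made for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a bloom-filter-based input filter (not LibAFL's type).
pub struct SketchBloomFilter {
    bits: Vec<u64>,
    num_bits: u64,
    num_hashes: u64,
}

impl SketchBloomFilter {
    /// `num_bits` is rounded up to a multiple of 64; at least one hash is used.
    pub fn new(num_bits: usize, num_hashes: u64) -> Self {
        let words = ((num_bits + 63) / 64).max(1);
        Self {
            bits: vec![0; words],
            num_bits: (words as u64) * 64,
            num_hashes: num_hashes.max(1),
        }
    }

    /// Returns `true` if the input was definitely not seen before, and records it.
    /// Returns `false` if it was probably seen (false positives are possible).
    pub fn should_execute<I: Hash>(&mut self, input: &I) -> bool {
        let (h1, h2) = Self::hash_pair(input);
        let mut seen = true;
        for i in 0..self.num_hashes {
            // Double hashing: derive the i-th bit index from two base hashes.
            let idx = h1.wrapping_add(i.wrapping_mul(h2)) % self.num_bits;
            let word = (idx / 64) as usize;
            let mask = 1u64 << (idx % 64);
            if self.bits[word] & mask == 0 {
                seen = false;
                self.bits[word] |= mask;
            }
        }
        !seen
    }

    /// Only the hash of the input is ever used, which is why `I: Hash` is required.
    fn hash_pair<I: Hash>(input: &I) -> (u64, u64) {
        let mut a = DefaultHasher::new();
        input.hash(&mut a);
        let h1 = a.finish();
        let mut b = DefaultHasher::new();
        h1.hash(&mut b);
        input.hash(&mut b);
        (h1, b.finish() | 1)
    }
}
```

The filter never stores the input itself, only bits derived from its hash, so memory use stays fixed at the cost of occasionally skipping an input that was never actually executed.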
Thank you for the quick responses! I hope you had relaxing holidays anyways :D The remaining CI issues seem unrelated(?)
See #2759.
Some mutators report MutationResult::Mutated, even if nothing actually changes about the input. HashMutator is a wrapper around other mutators that hashes inputs pre- and post-mutation to ensure MutationResult::Mutated is only reported if something actually changed.
This may be worth using on slow targets, where the hashing is quicker than the unnecessary additional executions of the target for previously tried inputs.
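The wrapping idea described above can be sketched roughly as follows. This is a simplified illustration, not the code from this PR: the `SketchMutator` trait, `HashCheckedMutator`, and the toy `SetFirstByte` mutator are hypothetical names, and the trait omits the state argument and error handling a real mutator interface would have.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum MutationResult {
    Mutated,
    Skipped,
}

/// Hypothetical, simplified mutator trait (assumption: no state, no errors).
pub trait SketchMutator<I> {
    fn mutate(&mut self, input: &mut I) -> MutationResult;
}

/// Wrapper that downgrades `Mutated` to `Skipped` when the input hash is unchanged.
pub struct HashCheckedMutator<M> {
    inner: M,
}

impl<M> HashCheckedMutator<M> {
    pub fn new(inner: M) -> Self {
        Self { inner }
    }
}

fn hash_of<I: Hash>(input: &I) -> u64 {
    let mut hasher = DefaultHasher::new();
    input.hash(&mut hasher);
    hasher.finish()
}

impl<I: Hash, M: SketchMutator<I>> SketchMutator<I> for HashCheckedMutator<M> {
    fn mutate(&mut self, input: &mut I) -> MutationResult {
        let before = hash_of(input);
        match self.inner.mutate(input) {
            // The inner mutator claimed a change, but the hash says otherwise.
            MutationResult::Mutated if hash_of(input) == before => MutationResult::Skipped,
            other => other,
        }
    }
}

/// Toy mutator that always reports `Mutated`, even when it rewrites the same byte.
pub struct SetFirstByte(pub u8);

impl SketchMutator<Vec<u8>> for SetFirstByte {
    fn mutate(&mut self, input: &mut Vec<u8>) -> MutationResult {
        if let Some(first) = input.first_mut() {
            *first = self.0;
        }
        MutationResult::Mutated
    }
}
```

Wrapping `SetFirstByte` shows the effect: the first call changes the input and is reported as mutated, while a second call that writes the same byte is reported as skipped, so the target does not need to run again.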