Dearbitrary #187
Apologies for how long it's taken me to respond to this PR. I'm concerned about the complexity that this feature brings. I'm not convinced that it is worth it just to be able to seed a corpus. If seeding a corpus is an important use case, it can be done today by changing how the structure-aware fuzzing is done, from a generative paradigm to a mutation paradigm:
(But backing up: it should also be noted that randomly generating a corpus before you start fuzzing isn't expected to do any better than incrementally building the corpus from nothing. Seeding the corpus is only useful if you already have inputs that you know are interesting for other reasons, for example because they've triggered bugs or crashes before, or because they are real-world snippets of your programming language.)
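To make the contrast concrete, here is a minimal sketch of what a mutation-paradigm target might look like: corpus bytes are decoded into a structure, one small structure-aware mutation is applied, and the result is re-encoded. Everything in it (the `Op` type, `decode`, `encode`, `mutate`, `mutate_bytes`, and the xorshift `Rng`) is hypothetical and only stands in for whatever serialization format and custom-mutator hook a real fuzz target would use.

```rust
/// Hypothetical structured input: a tiny stack-machine program.
#[derive(Debug, Clone, PartialEq)]
enum Op {
    Push(u8),
    Add,
    Mul,
}

/// Decode corpus bytes into a program. Unknown or truncated tags are
/// skipped, so every byte string decodes to *some* program.
fn decode(bytes: &[u8]) -> Vec<Op> {
    let mut ops = Vec::new();
    let mut i = 0;
    while i < bytes.len() {
        match bytes[i] {
            0 if i + 1 < bytes.len() => {
                ops.push(Op::Push(bytes[i + 1]));
                i += 2;
            }
            1 => {
                ops.push(Op::Add);
                i += 1;
            }
            2 => {
                ops.push(Op::Mul);
                i += 1;
            }
            _ => i += 1,
        }
    }
    ops
}

/// Encode a program back into the byte format that `decode` understands.
fn encode(ops: &[Op]) -> Vec<u8> {
    let mut out = Vec::new();
    for op in ops {
        match op {
            Op::Push(n) => {
                out.push(0);
                out.push(*n);
            }
            Op::Add => out.push(1),
            Op::Mul => out.push(2),
        }
    }
    out
}

/// Minimal xorshift generator so the sketch needs no external crates.
struct Rng(u64);

impl Rng {
    fn next(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }

    fn below(&mut self, n: usize) -> usize {
        (self.next() % n as u64) as usize
    }
}

/// Apply one structure-aware mutation: remove, insert, or replace an op.
fn mutate(ops: &mut Vec<Op>, rng: &mut Rng) {
    match rng.below(3) {
        0 => {
            if !ops.is_empty() {
                let i = rng.below(ops.len());
                ops.remove(i);
            }
        }
        1 => {
            let i = rng.below(ops.len() + 1);
            ops.insert(i, Op::Push(rng.next() as u8));
        }
        _ => {
            if !ops.is_empty() {
                let i = rng.below(ops.len());
                ops[i] = if rng.below(2) == 0 { Op::Add } else { Op::Mul };
            }
        }
    }
}

/// What a custom-mutator hook would do: corpus bytes in, mutated bytes out.
fn mutate_bytes(input: &[u8], seed: u64) -> Vec<u8> {
    let mut ops = decode(input);
    let mut rng = Rng(seed | 1); // xorshift state must be non-zero
    mutate(&mut ops, &mut rng);
    encode(&ops)
}
```

Because the corpus then holds encoded structures rather than raw entropy, seeding it only requires running `encode` on known-interesting values; no reverse mapping from values to generator input is needed.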
In general, the mutation paradigm has another benefit over the generative paradigm as well: if you add knobs to only do shrinking/simplifying mutations, then it is trivial to build a test case minimizer on top of that which performs better than
let input = the_fuzzer_input();
let result = run(&input);
let alt_input = input.mutate_preserving_semantics();
let alt_result = run(&alt_input);
assert_eq!(
    result,
    alt_result,
    "running two semantically-equivalent programs should produce the same results",
);
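The snippet above leaves `the_fuzzer_input`, `run`, and `mutate_preserving_semantics` undefined. A self-contained sketch of the same check follows, assuming a toy arithmetic `Expr` as the "program". The `Expr` type, the evaluator, and the two rewrites (commuting operands and appending `+ 0`) are illustrative assumptions, not anything from the `arbitrary` crate.

```rust
/// Hypothetical "program": a tiny arithmetic expression tree.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Num(i64),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

impl Expr {
    /// Rewrite the program without changing what it computes:
    /// swap the operands of every commutative operator, then wrap
    /// the whole expression in `e + 0`.
    fn mutate_preserving_semantics(&self) -> Expr {
        fn commute(e: &Expr) -> Expr {
            match e {
                Expr::Num(n) => Expr::Num(*n),
                Expr::Add(a, b) => Expr::Add(Box::new(commute(b)), Box::new(commute(a))),
                Expr::Mul(a, b) => Expr::Mul(Box::new(commute(b)), Box::new(commute(a))),
            }
        }
        Expr::Add(Box::new(commute(self)), Box::new(Expr::Num(0)))
    }
}

/// The system under test: an evaluator using wrapping arithmetic.
fn run(e: &Expr) -> i64 {
    match e {
        Expr::Num(n) => *n,
        Expr::Add(a, b) => run(a).wrapping_add(run(b)),
        Expr::Mul(a, b) => run(a).wrapping_mul(run(b)),
    }
}

/// Stand-in for the structured input a fuzzer would hand to the target.
fn the_fuzzer_input() -> Expr {
    // (2 + 3) * 7
    Expr::Mul(
        Box::new(Expr::Add(Box::new(Expr::Num(2)), Box::new(Expr::Num(3)))),
        Box::new(Expr::Num(7)),
    )
}

/// The differential check from the comment, made concrete.
fn check_semantics_preserved() {
    let input = the_fuzzer_input();
    let result = run(&input);
    let alt_input = input.mutate_preserving_semantics();
    let alt_result = run(&alt_input);
    assert_eq!(
        result, alt_result,
        "running two semantically-equivalent programs should produce the same results",
    );
}
```

If `run` had a bug that depended on operand order or on tree shape, the two results would differ even though no independent oracle for the correct answer exists.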
(we should probably have a crate that is the mutation-paradigm-equivalent of the
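The shrinking-only knob mentioned earlier in this comment can be sketched as a greedy loop: enumerate the inputs reachable by one shrinking mutation, keep the first one that is still interesting, and repeat until none is. The functions `shrinking_mutations` and `minimize`, and the choice of a byte vector as the input type, are assumptions made for illustration; this makes no claim about how such a minimizer compares with existing tools.

```rust
/// All inputs reachable by one shrinking mutation: drop one element,
/// or halve one non-zero element.
fn shrinking_mutations(input: &[u8]) -> Vec<Vec<u8>> {
    let mut out = Vec::new();
    for i in 0..input.len() {
        let mut dropped = input.to_vec();
        dropped.remove(i);
        out.push(dropped);
        if input[i] > 0 {
            let mut smaller = input.to_vec();
            smaller[i] /= 2;
            out.push(smaller);
        }
    }
    out
}

/// Greedy minimizer: keep taking the first shrinking mutation that is
/// still interesting (for example, still triggers the crash).
/// Terminates because every mutation shortens the input or lowers a byte.
fn minimize(mut input: Vec<u8>, still_interesting: impl Fn(&[u8]) -> bool) -> Vec<u8> {
    loop {
        let next = shrinking_mutations(&input)
            .into_iter()
            .find(|candidate| still_interesting(candidate.as_slice()));
        match next {
            Some(smaller) => input = smaller,
            None => return input,
        }
    }
}
```

For instance, with a predicate that accepts any input containing a byte of at least `0x80`, `minimize(vec![3, 200, 7, 150], ...)` settles on `[150]`: every further drop or halving makes the input uninteresting.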
By the way, my use of "seeding" was incorrect. I meant it more for large arbitrary-derived structures, as developing an initial corpus for them is painstaking. Still, without one, the initial fuzzing warmup takes longer, and the inconvenience also means that some libraries go without a corpus for those types of structures. While I have only a limited understanding of fuzzers, aren't most fuzzers mutative instead of generative already? Wouldn't methods like
Resolves #44.
I would greatly appreciate feedback on naming the structs & associated methods with `UnstructuredBuilder`.
`.dearbitrary` intentionally builds `Unstructured` in reverse to counter the problems encountered in "WIP: Add `dearbitrary` functionality to turn an instance into its arbitrary byte sequence" #94 (the slice length optimizations) by ensuring that size is built in reverse.
`kani` is used to quickly verify primitive dearbitrary/rearbitrary cases. I am still proof-checking the other primitives and will write fuzzing tests for the non-verifiable implementations.
Todo:
- `UnstructuredBuilder` Documentation
- `Dearbitrary` Documentation
- `u16` range test - currently takes 10 minutes to verify, so I've commented it out for the time being to save on my testing time.
Notes:
- Should `dearbitrary` be hidden behind a feature?
- `Int` is currently a breaking API change. Can this be fixed without manually implementing `from`/`to` `le` bytes?
- `f32`/`f64` representation isn't representable cross-platform if `MIPS` is used as the architecture set.
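To illustrate why sizes need to be built in reverse, here is a toy model, not the PR's `UnstructuredBuilder`. It assumes a reader that takes ordinary bytes from the front of the buffer and collection lengths from the back, which is how `Unstructured` is understood to read lengths. All names (`ToyUnstructured`, `ToyBuilder`, `Msg`, `toy_arbitrary`, `toy_dearbitrary`) are hypothetical, and lengths are assumed to fit in one byte.

```rust
/// Toy reader: plain bytes come from the front, lengths from the back.
struct ToyUnstructured<'a> {
    data: &'a [u8],
}

impl<'a> ToyUnstructured<'a> {
    fn take_front(&mut self) -> u8 {
        match self.data.split_first() {
            Some((&b, rest)) => {
                self.data = rest;
                b
            }
            None => 0,
        }
    }

    fn take_len_from_back(&mut self) -> usize {
        match self.data.split_last() {
            Some((&b, rest)) => {
                self.data = rest;
                b as usize
            }
            None => 0,
        }
    }
}

/// Toy builder: records front bytes and lengths in reading order.
#[derive(Default)]
struct ToyBuilder {
    front: Vec<u8>,
    back: Vec<u8>,
}

impl ToyBuilder {
    fn push_front(&mut self, b: u8) {
        self.front.push(b);
    }

    fn push_len(&mut self, len: usize) {
        self.back.push(len as u8); // assumption: len < 256 in this toy
    }

    fn finish(mut self) -> Vec<u8> {
        // The reader pops lengths from the end, so the first length
        // written must end up as the very last byte.
        self.back.reverse();
        self.front.extend(self.back);
        self.front
    }
}

#[derive(Debug, PartialEq)]
struct Msg {
    tag: u8,
    a: Vec<u8>,
    b: Vec<u8>,
}

/// Bytes to value, in the style of an `arbitrary` implementation.
fn toy_arbitrary(u: &mut ToyUnstructured<'_>) -> Msg {
    let tag = u.take_front();
    let len_a = u.take_len_from_back();
    let a = (0..len_a).map(|_| u.take_front()).collect();
    let len_b = u.take_len_from_back();
    let b = (0..len_b).map(|_| u.take_front()).collect();
    Msg { tag, a, b }
}

/// Value to bytes: mirrors `toy_arbitrary` step by step.
fn toy_dearbitrary(m: &Msg) -> Vec<u8> {
    let mut builder = ToyBuilder::default();
    builder.push_front(m.tag);
    builder.push_len(m.a.len());
    for &x in &m.a {
        builder.push_front(x);
    }
    builder.push_len(m.b.len());
    for &x in &m.b {
        builder.push_front(x);
    }
    builder.finish()
}
```

If `finish` appended the lengths without reversing them, the reader would pick up `b`'s length when it expects `a`'s, and the round trip would fail for any value whose two vectors differ in length.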