Filter Duplicate Input Execution #2771

Merged
merged 44 commits into from
Dec 28, 2024
aefb8e3
fixing empty multipart name
riesentoaster Dec 6, 2024
a98c981
fixing clippy
riesentoaster Dec 6, 2024
7acf5a3
Merge branch 'main' into main
riesentoaster Dec 6, 2024
2da6dc5
New rules for the contributing (#2752)
tokatoka Dec 6, 2024
1e571a0
Improve Flexibility of DumpToDiskStage (#2753)
riesentoaster Dec 8, 2024
d020b9e
Update bindgen requirement from 0.70.1 to 0.71.1 (#2756)
dependabot[bot] Dec 11, 2024
e1d0b92
No Use* from stages (#2745)
tokatoka Dec 12, 2024
c842eda
Update CONTRIBUTING.md MIGRATION.md (#2762)
tokatoka Dec 12, 2024
31d9b56
No Uses* from `fuzzer` (#2761)
tokatoka Dec 12, 2024
c9eb2a8
Remove useless cfgs (#2764)
tokatoka Dec 12, 2024
93b64f9
Link libresolv on all Apple OSs (#2767)
mineo333 Dec 14, 2024
294d2f1
Somewhat ugly CI fix... (#2768)
domenukk Dec 15, 2024
c170986
Add Input Types and Mutators for Numeric Types (#2760)
riesentoaster Dec 15, 2024
bab9890
Add HashMutator
riesentoaster Dec 15, 2024
71fc1c6
Fix docs
riesentoaster Dec 15, 2024
a2fa10c
Merge branch 'main' into add-label-mutationresult
riesentoaster Dec 15, 2024
30e1db4
Fix docs again
riesentoaster Dec 15, 2024
025a56a
introducing bloom filter
riesentoaster Dec 17, 2024
63b9ac9
fix tests
riesentoaster Dec 17, 2024
92c3f08
Merge branch 'main' into add-label-mutationresult
riesentoaster Dec 17, 2024
6395df9
Merge branch 'main' into add-label-mutationresult
riesentoaster Dec 18, 2024
17c63fe
Merge branch 'main' into add-label-mutationresult
riesentoaster Dec 19, 2024
61120bf
Merge branch 'main' into add-label-mutationresult
riesentoaster Dec 19, 2024
8757a33
Merge branch 'main' into add-label-mutationresult
riesentoaster Dec 20, 2024
6d1090d
Merge branch 'main' into add-label-mutationresult
riesentoaster Dec 23, 2024
ff0a25b
Implement evaluate_filtered
riesentoaster Dec 24, 2024
d193c06
Add macros to libafl_bolts tuples for mapping and merging types (#2788)
riesentoaster Dec 23, 2024
d16ede3
libafl_cc: Automatically find llvm_ar path (#2790)
s1341 Dec 24, 2024
c6f5646
imemory_ondisk: Don't fail write under any circumstances if locking i…
s1341 Dec 24, 2024
4275e0c
Merge remote-tracking branch 'upstream/main' into add-label-mutationr…
riesentoaster Dec 24, 2024
28b5c4a
Revert changes to global Cargo.toml
riesentoaster Dec 24, 2024
60e188f
Hide std-dependent dependency behind std feature
riesentoaster Dec 24, 2024
68041b9
Fix example fuzzer
riesentoaster Dec 24, 2024
e3f530e
Rename constructor for filtered fuzzer
riesentoaster Dec 24, 2024
db994a4
Reorder generics alphabetically
riesentoaster Dec 24, 2024
d2dc266
Rename HashingMutator, add note to MutationResult about filtered fuzzers
riesentoaster Dec 24, 2024
bdade95
Improve StdFuzzer according to feedback
riesentoaster Dec 27, 2024
d2fda6b
rename hashing mutator
riesentoaster Dec 27, 2024
01c91fe
Fix english in comment
riesentoaster Dec 27, 2024
0e99d38
Merge branch 'main' into add-label-mutationresult
riesentoaster Dec 27, 2024
40aa07c
Cleanup of old PRs that break the CI
riesentoaster Dec 27, 2024
3499e64
Fix more CI bugs
riesentoaster Dec 27, 2024
06499cd
Code cleanup
riesentoaster Dec 27, 2024
22f98eb
Remove unnecessary comments
riesentoaster Dec 28, 2024
3 changes: 2 additions & 1 deletion fuzzers/baby/baby_fuzzer_custom_executor/Cargo.toml
@@ -8,8 +8,9 @@ authors = [
edition = "2021"

[features]
default = ["std"]
default = ["std", "bloom_input_filter"]
tui = ["libafl/tui_monitor"]
bloom_input_filter = ["std"]
std = []

[profile.dev]
4 changes: 4 additions & 0 deletions fuzzers/baby/baby_fuzzer_custom_executor/src/main.rs
@@ -134,7 +134,11 @@ pub fn main() {
let scheduler = QueueScheduler::new();

// A fuzzer with feedbacks and a corpus scheduler
#[cfg(not(feature = "bloom_input_filter"))]
let mut fuzzer = StdFuzzer::new(scheduler, feedback, objective);
#[cfg(feature = "bloom_input_filter")]
let mut fuzzer =
StdFuzzer::with_bloom_input_filter(scheduler, feedback, objective, 10_000_000, 0.001);

// Create the executor for an in-process function with just one observer
let executor = CustomExecutor::new(&state);
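
To get a rough sense of what `10_000_000` expected items at a `0.001` false-positive rate costs in memory, the textbook bloom filter sizing formulas can be evaluated directly. This is a sketch that assumes an ideal bloom filter; `fastbloom` may size its filter differently, and the function names below are hypothetical.

```rust
use std::f64::consts::LN_2;

/// Optimal number of bits for an ideal bloom filter holding `n` items
/// at false-positive probability `p`: m = -n * ln(p) / (ln 2)^2.
fn optimal_bits(n: usize, p: f64) -> f64 {
    -(n as f64) * p.ln() / LN_2.powi(2)
}

/// Optimal number of hash functions for `bits` bits and `n` items:
/// k = (m / n) * ln 2.
fn optimal_hashes(n: usize, bits: f64) -> f64 {
    bits / n as f64 * LN_2
}

fn main() {
    // The values passed to `with_bloom_input_filter` in the example fuzzer.
    let (n, p) = (10_000_000_usize, 0.001_f64);
    let bits = optimal_bits(n, p);
    let mib = bits / 8.0 / (1024.0 * 1024.0);
    println!(
        "bits per item: {:.2}, total: {:.1} MiB, hash functions: {:.1}",
        bits / n as f64,
        mib,
        optimal_hashes(n, bits)
    );
}
```

Under these assumptions the filter needs roughly 14.4 bits per item, or about 17 MiB in total, which is the trade-off to weigh against the cost of re-executing duplicate inputs.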
2 changes: 2 additions & 0 deletions libafl/Cargo.toml
@@ -58,6 +58,7 @@ std = [
"serial_test",
"libafl_bolts/std",
"typed-builder",
"fastbloom",
]

## Tracks the Feedbacks and the Objectives that were interesting for a Testcase
@@ -291,6 +292,7 @@ document-features = { workspace = true, optional = true }
clap = { workspace = true, optional = true }
num_enum = { workspace = true, optional = true }
libipt = { workspace = true, optional = true }
fastbloom = { version = "0.8.0", optional = true }

[lints]
workspace = true
2 changes: 1 addition & 1 deletion libafl/src/executors/inprocess/mod.rs
@@ -557,7 +557,7 @@ mod tests {
let mut mgr = NopEventManager::new();
let mut state =
StdState::new(rand, corpus, solutions, &mut feedback, &mut objective).unwrap();
let mut fuzzer = StdFuzzer::<_, _, _>::new(sche, feedback, objective);
let mut fuzzer = StdFuzzer::new(sche, feedback, objective);

let mut in_process_executor = InProcessExecutor::new(
&mut harness,
190 changes: 175 additions & 15 deletions libafl/src/fuzzer/mod.rs
@@ -2,7 +2,11 @@

use alloc::{string::ToString, vec::Vec};
use core::{fmt::Debug, time::Duration};
#[cfg(feature = "std")]
use std::hash::Hash;

#[cfg(feature = "std")]
use fastbloom::BloomFilter;
use libafl_bolts::{current_time, tuples::MatchName};
use serde::Serialize;

@@ -138,6 +142,16 @@ pub trait EvaluatorObservers<E, EM, I, S> {

/// Evaluate an input modifying the state of the fuzzer
pub trait Evaluator<E, EM, I, S> {
/// Runs the input if it was (likely) not run before, triggers observers and feedback, and adds the input to the list of previously executed inputs.
/// Returns whether the input is interesting and (optionally) the index of the new [`crate::corpus::Testcase`] in the corpus.
fn evaluate_filtered(
&mut self,
state: &mut S,
executor: &mut E,
manager: &mut EM,
input: I,
) -> Result<(ExecuteInputResult, Option<CorpusId>), Error>;

/// Runs the input and triggers observers and feedback.
/// Returns whether the input is interesting and (optionally) the index of the new [`crate::corpus::Testcase`] in the corpus.
fn evaluate_input(
@@ -242,13 +256,14 @@ pub enum ExecuteInputResult {

/// Your default fuzzer instance, for everyday use.
#[derive(Debug)]
pub struct StdFuzzer<CS, F, OF> {
pub struct StdFuzzer<CS, F, IF, OF> {
scheduler: CS,
feedback: F,
objective: OF,
input_filter: IF,
}

impl<CS, F, OF, S> HasScheduler<<S::Corpus as Corpus>::Input, S> for StdFuzzer<CS, F, OF>
impl<CS, F, IF, OF, S> HasScheduler<<S::Corpus as Corpus>::Input, S> for StdFuzzer<CS, F, IF, OF>
where
S: HasCorpus,
CS: Scheduler<<S::Corpus as Corpus>::Input, S>,
@@ -264,7 +279,7 @@ where
}
}

impl<CS, F, OF> HasFeedback for StdFuzzer<CS, F, OF> {
impl<CS, F, IF, OF> HasFeedback for StdFuzzer<CS, F, IF, OF> {
type Feedback = F;

fn feedback(&self) -> &Self::Feedback {
@@ -276,7 +291,7 @@ impl<CS, F, OF> HasFeedback for StdFuzzer<CS, F, OF> {
}
}

impl<CS, F, OF> HasObjective for StdFuzzer<CS, F, OF> {
impl<CS, F, IF, OF> HasObjective for StdFuzzer<CS, F, IF, OF> {
type Objective = OF;

fn objective(&self) -> &OF {
@@ -288,8 +303,8 @@ impl<CS, F, OF> HasObjective for StdFuzzer<CS, F, OF> {
}
}

impl<CS, EM, F, OF, OT, S> ExecutionProcessor<EM, <S::Corpus as Corpus>::Input, OT, S>
for StdFuzzer<CS, F, OF>
impl<CS, EM, F, IF, OF, OT, S> ExecutionProcessor<EM, <S::Corpus as Corpus>::Input, OT, S>
for StdFuzzer<CS, F, IF, OF>
where
CS: Scheduler<<S::Corpus as Corpus>::Input, S>,
EM: EventFirer<State = S>,
@@ -494,8 +509,8 @@ where
}
}

impl<CS, E, EM, F, OF, S> EvaluatorObservers<E, EM, <S::Corpus as Corpus>::Input, S>
for StdFuzzer<CS, F, OF>
impl<CS, E, EM, F, IF, OF, S> EvaluatorObservers<E, EM, <S::Corpus as Corpus>::Input, S>
for StdFuzzer<CS, F, IF, OF>
where
CS: Scheduler<<S::Corpus as Corpus>::Input, S>,
E: HasObservers + Executor<EM, Self, State = S>,
@@ -532,7 +547,48 @@ where
}
}

impl<CS, E, EM, F, OF, S> Evaluator<E, EM, <S::Corpus as Corpus>::Input, S> for StdFuzzer<CS, F, OF>
trait InputFilter<I> {
fn should_execute(&mut self, input: &I) -> bool;
}

/// A pseudo-filter that will execute each input.
#[derive(Debug)]
pub struct NopInputFilter;
impl<I> InputFilter<I> for NopInputFilter {
#[inline]
#[must_use]
fn should_execute(&mut self, _input: &I) -> bool {
true
}
}

/// A filter that probabilistically prevents duplicate execution of the same input based on a bloom filter.
#[cfg(feature = "std")]
#[derive(Debug)]
pub struct BloomInputFilter {
bloom: BloomFilter,
}

#[cfg(feature = "std")]
impl BloomInputFilter {
#[must_use]
fn new(items_count: usize, fp_p: f64) -> Self {
let bloom = BloomFilter::with_false_pos(fp_p).expected_items(items_count);
Self { bloom }
}
}

#[cfg(feature = "std")]
impl<I: Hash> InputFilter<I> for BloomInputFilter {
Review comment (@tokatoka, Member, Dec 25, 2024):
Is this `I: Hash` bound necessary? If not, can you delete it? (Always keep the constraints minimal.)

Reply (Contributor Author):
Yes, the bloom filter checks presence based on the hash value of the input.

#[inline]
#[must_use]
fn should_execute(&mut self, input: &I) -> bool {
!self.bloom.insert(input)
}
}
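
The idea behind `BloomInputFilter` can be illustrated without the `fastbloom` dependency. The following is a minimal sketch, not the crate's implementation: `TinyBloomFilter` is a hypothetical hand-rolled bloom filter using double hashing on top of `DefaultHasher`, and the trait only mirrors the shape of `InputFilter` from the diff. As in the diff, `insert` reports whether the item was (probably) already present, and `should_execute` negates that.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Same shape as the `InputFilter` trait in the diff.
trait InputFilter<I> {
    fn should_execute(&mut self, input: &I) -> bool;
}

/// Hypothetical hand-rolled bloom filter (an assumption, not `fastbloom`).
struct TinyBloomFilter {
    bits: Vec<u64>,
    num_bits: u64,
    num_hashes: u64,
}

impl TinyBloomFilter {
    fn new(num_bits: u64, num_hashes: u64) -> Self {
        assert!(num_bits > 0 && num_hashes > 0);
        let words = ((num_bits + 63) / 64) as usize;
        Self { bits: vec![0; words], num_bits, num_hashes }
    }

    /// Sets all bits for `item`. Returns `true` if every bit was already
    /// set, i.e. the item was probably inserted before.
    fn insert<T: Hash>(&mut self, item: &T) -> bool {
        // Derive two base hashes, then combine them as h1 + i * h2.
        let mut hasher = DefaultHasher::new();
        item.hash(&mut hasher);
        let h1 = hasher.finish();
        hasher.write_u8(0xff);
        let h2 = hasher.finish() | 1;

        let mut seen = true;
        for i in 0..self.num_hashes {
            let bit = h1.wrapping_add(i.wrapping_mul(h2)) % self.num_bits;
            let word = (bit / 64) as usize;
            let mask = 1u64 << (bit % 64);
            seen &= (self.bits[word] & mask) != 0;
            self.bits[word] |= mask;
        }
        seen
    }
}

impl<I: Hash> InputFilter<I> for TinyBloomFilter {
    fn should_execute(&mut self, input: &I) -> bool {
        // Execute only if the input was not (probably) seen before.
        !self.insert(input)
    }
}

fn main() {
    let mut filter = TinyBloomFilter::new(1 << 16, 7);
    let input = vec![1u8, 2, 3];
    assert!(filter.should_execute(&input)); // first sighting: run it
    assert!(!filter.should_execute(&input)); // duplicate: skip it
}
```

The sketch also shows why the `I: Hash` bound discussed above is needed: membership is decided purely from the hash of the input. A false positive means a never-executed input is skipped; a duplicate is never executed twice through the filtered path.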

impl<CS, E, EM, F, IF, OF, S> Evaluator<E, EM, <S::Corpus as Corpus>::Input, S>
for StdFuzzer<CS, F, IF, OF>
where
CS: Scheduler<<S::Corpus as Corpus>::Input, S>,
E: HasObservers + Executor<EM, Self, State = S>,
@@ -549,7 +605,22 @@ where
+ UsesInput<Input = <S::Corpus as Corpus>::Input>,
<S::Corpus as Corpus>::Input: Input,
S::Solutions: Corpus<Input = <S::Corpus as Corpus>::Input>,
IF: InputFilter<<S::Corpus as Corpus>::Input>,
{
fn evaluate_filtered(
&mut self,
state: &mut S,
executor: &mut E,
manager: &mut EM,
input: <S::Corpus as Corpus>::Input,
) -> Result<(ExecuteInputResult, Option<CorpusId>), Error> {
if self.input_filter.should_execute(&input) {
self.evaluate_input(state, executor, manager, input)
} else {
Ok((ExecuteInputResult::None, None))
}
}

/// Process one input, adding to the respective corpora if needed and firing the right events
#[inline]
fn evaluate_input_events(
@@ -562,6 +633,7 @@ where
) -> Result<(ExecuteInputResult, Option<CorpusId>), Error> {
self.evaluate_input_with_observers(state, executor, manager, input, send_events)
}

fn add_disabled_input(
&mut self,
state: &mut S,
@@ -573,6 +645,7 @@ where
let id = state.corpus_mut().add_disabled(testcase)?;
Ok(id)
}

/// Adds an input, even if it's not considered `interesting` by any of the executors
fn add_input(
&mut self,
@@ -672,7 +745,7 @@ where
}
}

impl<CS, E, EM, F, OF, S, ST> Fuzzer<E, EM, S, ST> for StdFuzzer<CS, F, OF>
impl<CS, E, EM, F, IF, OF, S, ST> Fuzzer<E, EM, S, ST> for StdFuzzer<CS, F, IF, OF>
where
CS: Scheduler<S::Input, S>,
E: UsesState<State = S>,
@@ -796,17 +869,44 @@ where
}
}

impl<CS, F, OF> StdFuzzer<CS, F, OF> {
/// Create a new `StdFuzzer` with standard behavior.
pub fn new(scheduler: CS, feedback: F, objective: OF) -> Self {
impl<CS, F, IF, OF> StdFuzzer<CS, F, IF, OF> {
/// Create a new [`StdFuzzer`] with standard behavior and the provided duplicate input execution filter.
pub fn with_input_filter(scheduler: CS, feedback: F, objective: OF, input_filter: IF) -> Self {
Self {
scheduler,
feedback,
objective,
input_filter,
}
}
}

impl<CS, F, OF> StdFuzzer<CS, F, NopInputFilter, OF> {
/// Create a new [`StdFuzzer`] with standard behavior and no duplicate input execution filtering.
pub fn new(scheduler: CS, feedback: F, objective: OF) -> Self {
Self::with_input_filter(scheduler, feedback, objective, NopInputFilter)
}
}

#[cfg(feature = "std")] // hashing requires std
impl<CS, F, OF> StdFuzzer<CS, F, BloomInputFilter, OF> {
/// Create a new [`StdFuzzer`] that, with high probability, executes each input only once.
///
/// This is achieved by hashing each input and using a bloom filter to differentiate inputs.
///
/// Use this implementation if hashing each input is very fast compared to executing potential duplicate inputs.
pub fn with_bloom_input_filter(
Review comment (Member):
I would add a `with_input_filter` method, and then you just provide a `BloomInputFilter` there (or we keep this one as an extra constructor).

Reply (Contributor Author):
I've added a generic version and changed both `new` and `with_bloom_input_filter` to use it.

scheduler: CS,
feedback: F,
objective: OF,
items_count: usize,
fp_p: f64,
) -> Self {
let input_filter = BloomInputFilter::new(items_count, fp_p);
Self::with_input_filter(scheduler, feedback, objective, input_filter)
}
}
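
The constructor layout above (one generic `with_input_filter`, plus convenience constructors pinned to a concrete filter type) can be sketched in isolation. Everything below is a simplified stand-in: `ToyFuzzer` and `ExactInputFilter` are hypothetical names, the scheduler, feedback, objective, and executor are left out, and the exact `HashSet`-based filter is swapped in for the bloom filter to keep the example deterministic.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Same shape as the `InputFilter` trait in the diff.
trait InputFilter<I> {
    fn should_execute(&mut self, input: &I) -> bool;
}

/// Executes everything, like `NopInputFilter` in the diff.
struct NopInputFilter;
impl<I> InputFilter<I> for NopInputFilter {
    fn should_execute(&mut self, _input: &I) -> bool {
        true
    }
}

/// Hypothetical exact filter: no false positives, but unbounded memory.
struct ExactInputFilter<I> {
    seen: HashSet<I>,
}
impl<I> ExactInputFilter<I> {
    fn new() -> Self {
        Self { seen: HashSet::new() }
    }
}
impl<I: Hash + Eq + Clone> InputFilter<I> for ExactInputFilter<I> {
    fn should_execute(&mut self, input: &I) -> bool {
        // `HashSet::insert` returns true if the value was not present yet.
        self.seen.insert(input.clone())
    }
}

/// Toy stand-in for `StdFuzzer`; it only counts executions.
struct ToyFuzzer<IF> {
    input_filter: IF,
    executions: usize,
}

impl<IF> ToyFuzzer<IF> {
    /// Generic constructor: any filter can be plugged in.
    fn with_input_filter(input_filter: IF) -> Self {
        Self { input_filter, executions: 0 }
    }

    /// Always "executes" the input and never consults the filter.
    fn evaluate_input<I>(&mut self, _input: &I) -> bool {
        self.executions += 1;
        true
    }

    /// Executes the input only if the filter allows it.
    fn evaluate_filtered<I>(&mut self, input: &I) -> bool
    where
        IF: InputFilter<I>,
    {
        if self.input_filter.should_execute(input) {
            self.evaluate_input(input)
        } else {
            false
        }
    }
}

impl ToyFuzzer<NopInputFilter> {
    /// Convenience constructor without filtering.
    fn new() -> Self {
        Self::with_input_filter(NopInputFilter)
    }
}

fn main() {
    let input = vec![1u8, 2, 3];

    let mut fuzzer = ToyFuzzer::with_input_filter(ExactInputFilter::<Vec<u8>>::new());
    fuzzer.evaluate_input(&input); // unfiltered: runs, not recorded
    fuzzer.evaluate_filtered(&input); // first filtered run: runs, recorded
    fuzzer.evaluate_filtered(&input); // duplicate: skipped
    fuzzer.evaluate_input(&input); // unfiltered: always runs
    assert_eq!(fuzzer.executions, 3);

    let mut plain = ToyFuzzer::new();
    plain.evaluate_filtered(&input);
    plain.evaluate_filtered(&input);
    assert_eq!(plain.executions, 2); // the no-op filter never skips
}
```

Pinning `new` to the no-op filter type means call sites that do not care about filtering can keep calling `new` without naming the extra type parameter.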

/// Structs with this trait will execute an input
pub trait ExecutesInput<E, EM, I, S> {
/// Runs the input and triggers observers and feedback
@@ -819,8 +919,8 @@ pub trait ExecutesInput<E, EM, I, S> {
) -> Result<ExitKind, Error>;
}

impl<CS, E, EM, F, OF, S> ExecutesInput<E, EM, <S::Corpus as Corpus>::Input, S>
for StdFuzzer<CS, F, OF>
impl<CS, E, EM, F, IF, OF, S> ExecutesInput<E, EM, <S::Corpus as Corpus>::Input, S>
for StdFuzzer<CS, F, IF, OF>
where
CS: Scheduler<<S::Corpus as Corpus>::Input, S>,
E: Executor<EM, Self, State = S> + HasObservers,
@@ -913,3 +1013,63 @@ where
unimplemented!("NopFuzzer cannot fuzz");
}
}

#[cfg(all(test, feature = "std"))]
mod tests {
use core::cell::RefCell;

use libafl_bolts::rands::StdRand;

use super::{Evaluator, StdFuzzer};
use crate::{
corpus::InMemoryCorpus,
events::NopEventManager,
executors::{ExitKind, InProcessExecutor},
inputs::BytesInput,
schedulers::StdScheduler,
state::StdState,
};

#[test]
fn filtered_execution() {
let execution_count = RefCell::new(0);
let scheduler = StdScheduler::new();
let mut fuzzer = StdFuzzer::with_bloom_input_filter(scheduler, (), (), 100, 1e-4);
let mut state = StdState::new(
StdRand::new(),
InMemoryCorpus::new(),
InMemoryCorpus::new(),
&mut (),
&mut (),
)
.unwrap();
let mut manager = NopEventManager::new();
let mut harness = |_input: &BytesInput| {
*execution_count.borrow_mut() += 1;
ExitKind::Ok
};
let mut executor =
InProcessExecutor::new(&mut harness, (), &mut fuzzer, &mut state, &mut manager)
.unwrap();
let input = BytesInput::new(vec![1, 2, 3]);
assert!(fuzzer
.evaluate_input(&mut state, &mut executor, &mut manager, input.clone())
.is_ok());
assert_eq!(1, *execution_count.borrow()); // evaluate_input does not add it to the filter

assert!(fuzzer
.evaluate_filtered(&mut state, &mut executor, &mut manager, input.clone())
.is_ok());
assert_eq!(2, *execution_count.borrow()); // evaluate_filtered runs the input and adds it to the filter

assert!(fuzzer
.evaluate_filtered(&mut state, &mut executor, &mut manager, input.clone())
.is_ok());
assert_eq!(2, *execution_count.borrow()); // the harness is not called

assert!(fuzzer
.evaluate_input(&mut state, &mut executor, &mut manager, input.clone())
.is_ok());
assert_eq!(3, *execution_count.borrow()); // evaluate_input ignores filters
}
}