This project contains the source code used for the evaluation of the paper Systematic Assessment of Fuzzers using Mutation Analysis; see the BibTeX entry at the end of this document if you want to cite it. Note that this framework currently supports only Linux systems.
The goal of this project is to provide a framework to compare fuzzers using mutation analysis. Fuzzers are tools that implement fuzzing / fuzz testing to find bugs in software. Fuzzing can be likened to generating lots of random inputs to run on the program under test.
Mutation testing is an approach to judge how well testing methods perform in finding bugs. The basic idea is to add (small) modifications, called mutations, to a program and check whether each modification is detected by the testing method. A detected modification is said to be killed; the modifications that are not killed can be used to judge the performance of the testing method.
Both fuzzing and mutation testing are known for their high computation requirements, so this framework implements two methods to reduce them. The first method reduces the number of mutations that need to be evaluated by combining mutations that do not interfere with each other into one "supermutant". This allows evaluation of all mutations contained in a supermutant in one fuzzing run (Stage 1). The second method reduces the time that fuzzers need to produce results by splitting fuzzing into two parts: one longer session runs on the unmutated executable to give the fuzzer as much of a head start as it can get, which then allows shorter sessions for each supermutant that needs to be evaluated (Stage 2).
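To make the supermutant idea more concrete, the following is a minimal sketch of greedy grouping. It assumes that interference between mutations is already known as a symmetric conflict relation; the function name `build_supermutants` and the example data are hypothetical, and the framework's actual grouping strategy may differ.

```python
from typing import Dict, Hashable, List, Set


def build_supermutants(
    mutations: List[Hashable],
    conflicts: Dict[Hashable, Set[Hashable]],
) -> List[List[Hashable]]:
    """Greedily pack mutations into groups that contain no conflicting pair."""
    groups: List[List[Hashable]] = []
    for mutation in mutations:
        blocked = conflicts.get(mutation, set())
        for group in groups:
            # A mutation may join a group only if it conflicts with no member.
            if not any(member in blocked for member in group):
                group.append(mutation)
                break
        else:
            # No existing group is compatible, so open a new one.
            groups.append([mutation])
    return groups


# Hypothetical example: mutations 1 and 2 interfere, as do 3 and 4.
conflicts = {1: {2}, 2: {1}, 3: {4}, 4: {3}}
supermutants = build_supermutants([1, 2, 3, 4, 5], conflicts)
print(supermutants)
```

In this example five mutations are covered by two fuzzing runs instead of five, which is the kind of saving the supermutant approach aims for.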
It is recommended to use a current Linux distribution that is supported by the two main requirements, docker and hatch.
Hardware requirements depend on the number of CPU cores that are available and should be used. The RAM requirements mostly depend on the subject; as a rule of thumb, we recommend around 1GB per instance.
The required hard drive space also depends on the subjects and fuzzers. Most space will be used for the docker images, seed corpus, and resulting databases. We recommend around 100GB of available space for evaluations with multiple subjects and fuzzers.
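As a rough illustration of the rule of thumb above, the sketch below estimates how many parallel instances a machine can host. The helper `max_instances` is hypothetical and only encodes the 1GB per instance guideline; actual memory use depends on the subject.

```python
import os


def max_instances(
    available_ram_gb: float,
    cpu_cores: int,
    ram_per_instance_gb: float = 1.0,
) -> int:
    """Return how many parallel instances fit into the given cores and RAM."""
    by_ram = int(available_ram_gb // ram_per_instance_gb)
    return max(0, min(cpu_cores, by_ram))


# Assumes 16GB of free RAM; the core count is read from the current machine.
cores = os.cpu_count() or 1
print(max_instances(available_ram_gb=16, cpu_cores=cores))
```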
To run this framework two dependencies are required.
First, the required Python packages are managed using hatch. See Preparing the Environment on how to use it. The configuration can be found in pyproject.toml, but we discourage manual setups.
The other requirement is docker and that the user is a member of the docker group.
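The two requirements can be checked before starting an evaluation. The following is a small sketch of such a check using only the Python standard library; the function names `in_group` and `preflight` are hypothetical and not part of the framework.

```python
import grp
import os
import shutil


def in_group(group_name: str) -> bool:
    """Return True if the current process belongs to the named group."""
    try:
        gid = grp.getgrnam(group_name).gr_gid
    except KeyError:
        # The group does not exist on this system.
        return False
    return gid in os.getgroups() or gid == os.getegid()


def preflight() -> dict:
    """Collect the prerequisites named above into a small report."""
    return {
        "docker_on_path": shutil.which("docker") is not None,
        "hatch_on_path": shutil.which("hatch") is not None,
        "in_docker_group": in_group("docker"),
    }


print(preflight())
```

Note that group membership is read from the running process, so a user who was just added to the docker group typically needs to log in again before the check reports True.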
The entry point for all commands is src/mua_fuzzer_benchmark/eval.py and the working directory is expected to be the project root directory. All commands and their arguments are documented; for help use:
src/mua_fuzzer_benchmark/eval.py -h
After preparing the environment, the usual steps for an evaluation are to first run coverage fuzzing (Stage 1) and then evaluate on supermutants (Stage 2).
Fuzzing requires some specific system configuration; the recommended configuration can be found in the set_proc_sys_parameters.sh file. Please check the file to see if those changes are acceptable and modify it if not.
Additionally, the framework requires a few Python packages. The Python environment is managed through hatch. All that needs to be installed is hatch; then hatch shell will set up the packages. We strongly recommend this over a manual setup.
source ./set_proc_sys_parameters.sh # source the script to allow ulimit to work
hatch shell # enter the python environment
Executing the second command opens a shell in the virtual environment that has all dependencies installed. Afterwards, the following commands can be run in this shell session.
The first step for the comparison is to create a seed corpus covering as much of the program under test as the fuzzer can achieve. Once this seed corpus has been established, it is reasonable to require shorter runs to evaluate on the mutants.
To start a coverage fuzzing run, use the coverage_fuzzing command. See below for an example and src/mua_fuzzer_benchmark/eval.py coverage_fuzzing -h for help. For a description of the environment variables see the file src/mua_fuzzer_benchmark/constants.py.
Note that an initial set of seeds is required. The seeds are expected to be placed in a folder named identically to the prog variant name, so for the following command the seed files would be placed under tmp/seeds/seeds_minimal/woff2_new.
Also note that fuzz-time is given in minutes. Of the instances, the median run based on covered mutations will be moved into a separate directory.
src/mua_fuzzer_benchmark/eval.py coverage_fuzzing \
--fuzzers libfuzzer aflpp honggfuzz \
--progs woff2_new \
--fuzz-time $((60 * 48)) \
--seed-dir tmp/seeds/minimal \
--result-dir tmp/coverage \
--instances 13
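The selection of the median run can be pictured with the following sketch. It assumes that each instance reports a single count of covered mutations; the function `pick_median_run` and the run names are hypothetical, and the framework's own bookkeeping may differ. An odd number of instances, such as the 13 used above, makes the median a single well-defined run.

```python
from typing import Dict


def pick_median_run(covered: Dict[str, int]) -> str:
    """Return the run whose covered-mutation count is the median."""
    ranked = sorted(covered, key=lambda run: (covered[run], run))
    # With an odd number of runs this is the exact median; with an even
    # number it is the lower of the two middle runs.
    return ranked[(len(ranked) - 1) // 2]


# Hypothetical coverage counts for five instances of one fuzzer.
covered = {"run_0": 120, "run_1": 95, "run_2": 130, "run_3": 110, "run_4": 101}
fuzz_time_minutes = 60 * 48  # --fuzz-time is given in minutes, so this is 48 hours
print(pick_median_run(covered), fuzz_time_minutes)
```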
The second step for the comparison is evaluating how many mutants are killed by a fuzzer. This can be done using the eval command; see src/mua_fuzzer_benchmark/eval.py eval -h for help.
In the remaining part of this section, the commands for the main experiments of the paper are repeated to show some example usages.
As said, the eval command provides the entry point to evaluate mutations. See below for an example. Note that the seed-dir now points to the median run created by the coverage_fuzzing command.
src/mua_fuzzer_benchmark/eval.py eval \
--fuzzers libfuzzer aflpp honggfuzz \
--progs woff2_new \
--fuzz-time $((60 * 1)) \
--seed-dir tmp/coverage/median_runs/ \
--result-path data/basic/stats_all.db
After the evaluation, the result database will be copied to result-path.
Based on the results of the initial evaluation, the bitcode files and supermutants can be reused by using the rerun arguments to eval.
If only a subset of mutations should be retried, this requires a file specifying mutations, which can be created using the generate_rerun_file command. See below for an example:
src/mua_fuzzer_benchmark/eval.py generate_rerun_file \
--db data/basic/stats_all.db \
--out-file rerun_file_for_asan.json \
--untried no \
--covered yes \
--skip-timeout yes \
--skip-killed no \
--skip-crashed yes \
--mode keep
The following command shows how to start a rerun eval, this time building the program with ASan. (To be clear: MUT_BUILD_ASAN can also be used without --rerun. Also, --rerun ensures that the same supermutants that were generated for the database given as --rerun <db> will be started again.)
MUT_BUILD_ASAN=1 src/mua_fuzzer_benchmark/eval.py eval \
--fuzzers libfuzzer aflpp honggfuzz \
--progs woff2_new \
--fuzz-time $((60 * 1)) \
--seed-dir tmp/coverage/median_runs/ \
--rerun data/basic/stats_all.db \
--rerun-mutations rerun_file_for_asan.json \
--result-path data/asan/stats_all.db
Just to complete the experiments of the paper, this example shows how to run the mutations still not killed after the ASan experiment again (for 24 hours).
src/mua_fuzzer_benchmark/eval.py generate_rerun_file \
--db data/asan/stats_all.db \
--out-file rerun_file_24_hours.json \
--untried no \
--covered yes \
--skip-timeout yes \
--skip-killed yes \
--skip-crashed yes \
--mode single
MUT_BUILD_ASAN=1 src/mua_fuzzer_benchmark/eval.py eval \
--fuzzers libfuzzer aflpp honggfuzz \
--progs woff2_new \
--fuzz-time $((60 * 24)) \
--seed-dir tmp/coverage/median_runs/ \
--rerun data/asan/stats_all.db \
--rerun-mutations rerun_file_24_hours.json \
--result-path data/24_hours/stats_all.db
The plots as shown in the paper can be reproduced using the Makefile.
This requires that the databases are prepared for plotting first using:
src/mua_fuzzer_benchmark/eval.py prepare_db --db <db>
This needs to be done for all three databases previously created:
src/mua_fuzzer_benchmark/eval.py prepare_db --db data/basic/stats_all.db
src/mua_fuzzer_benchmark/eval.py prepare_db --db data/asan/stats_all.db
src/mua_fuzzer_benchmark/eval.py prepare_db --db data/24_hours/stats_all.db
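If the three calls should be scripted, a small driver along the following lines could be used. This is only a sketch: the helper names `prepare_db_commands` and `run_all` are hypothetical, while the command, its `--db` argument, and the database paths are taken from the commands above.

```python
import subprocess
from typing import List

EVAL = "src/mua_fuzzer_benchmark/eval.py"
DATABASES = [
    "data/basic/stats_all.db",
    "data/asan/stats_all.db",
    "data/24_hours/stats_all.db",
]


def prepare_db_commands(databases: List[str]) -> List[List[str]]:
    """Build one prepare_db command line per database."""
    return [[EVAL, "prepare_db", "--db", db] for db in databases]


def run_all(databases: List[str]) -> None:
    """Run the commands in order and stop at the first failure."""
    for command in prepare_db_commands(databases):
        subprocess.run(command, check=True)


# Print the commands; call run_all(DATABASES) from the project root
# inside the hatch shell to actually execute them.
for command in prepare_db_commands(DATABASES):
    print(" ".join(command))
```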
Note that the Makefile expects all the databases at exactly those locations and the seeds to be under tmp/coverage, just as the previous commands set up.
Finally, to generate the plots just run make from the project root.
This part of the usage will likely be changed after artifact evaluation.
There are three expected ways to extend the current state of the framework: adding new fuzzers, adding new programs, and adding or changing mutations.
All fuzzers are located in the directory dockerfiles/fuzzers/, each fuzzer is located in a separate directory, and the directory name is used to identify the fuzzer. It is recommended to look at existing fuzzer configurations and follow their setup when adding a new fuzzer.
Note that dockerfiles/fuzzers/system/ is used as the base image.
Four files need to be provided to use a fuzzer (under dockerfiles/fuzzers/<fuzzer name>/):
- Dockerfile: The docker build command is run from the project root dir. This Dockerfile is used to compile the fuzzer and copy the following two files into the created image. The image will be used to create the containers used for fuzzing.
- eval.sh: Run inside the docker container for coverage fuzzing and evaluating on mutants. This script builds the instrumented binary and executes the fuzzing run. Arguments are the path to the bitcode file in the docker container, compile arguments, and the path to the seed corpus directory. Additionally, environment variables are passed: DICT_PATH contains the path to the dictionary for the program if available. Also, MUT_WITH_ASAN or MUT_WITH_MSAN have the value "1" if ASan or MSan should be used.
- minimize.sh: Similar to eval.sh, this script is run inside the docker container to use the fuzzer's minimization algorithm. Arguments are: the path to the bitcode file, compile arguments, the path to the directory containing the inputs that should be minimized, and the path to the output directory.
- config.json: Additional information about the fuzzer, used to decide which queue inputs and crash inputs the fuzzer generates and should be used during evaluation.
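Before building a new fuzzer image, it can help to verify that the four files listed above are in place. The following is a minimal sketch of such a check; the function `missing_fuzzer_files` and the fuzzer name `myfuzzer` are hypothetical.

```python
from pathlib import Path
from typing import List

REQUIRED_FUZZER_FILES = ("Dockerfile", "eval.sh", "minimize.sh", "config.json")


def missing_fuzzer_files(fuzzer_dir: Path) -> List[str]:
    """Return the required files that are absent from a fuzzer directory."""
    return [
        name for name in REQUIRED_FUZZER_FILES
        if not (fuzzer_dir / name).is_file()
    ]


# "myfuzzer" is a hypothetical fuzzer name used only for illustration.
missing = missing_fuzzer_files(Path("dockerfiles/fuzzers/myfuzzer"))
print(missing or "all required files present")
```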
All programs are located in the directory dockerfiles/programs/, each program is located in a separate directory, and the directory name is part of the identifier for a program variant. A program variant is one configuration for a program; see the description of the config.json file for details.
Two files are required to add a program (additional files are allowed, see Dockerfile for details):
- Dockerfile: In the Dockerfile the program is built for all variants. Note that, unlike for the fuzzers, the docker build command is run in the program folder. The contents of the /home/mutator/sample/ dir inside the docker image are copied to tmp/programs/<program name> and shared with the containers using docker's volume flag. Two files are required to be generated during the build process. The first is a bitcode file of the program without a main function (or with a weakly linked main function, though this is untested). The other is the bitcode file compiled to an executable binary including a main function; this file is used as the unmutated original binary (it could also be generated by the framework using the bitcode file and a common main function, though this is not implemented yet).
- config.json: This file contains the configurations for the different variants of a program. It contains a dictionary with the variant name mapping to a dictionary of configuration parameters. The program variant name is created as follows: <folder name> + "_" + <variant name>. The following configuration parameters are expected:
  - bc_compile_args: Compiler arguments that should be used for compilation for all artifacts. Each argument is a dictionary containing val, the string of the argument, and action, which is none or "prefix_workdir" if the path to the workdir should be added as a prefix to the val.
  - bin_compile_args: Compiler arguments that should be used for compilation of the binary. Each argument follows the same rules as for bc_compile_args.
  - is_cpp: Whether the compilation should be done with a C++ compiler (clang++).
  - dict: Path to a dictionary file that is passed to the fuzzers, none if no dictionary should be provided. The path is relative to the /home/mutator/sample/ dir.
  - orig_bin: Path to the original unmutated binary. The path is relative to the /home/mutator/sample/ dir.
  - orig_bc: Path to the bitcode file. The path is relative to the /home/mutator/sample/ dir.
  - omit_functions: Names of functions that should be excluded from being mutated, usually ["LLVMFuzzerTestOneInput"].
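The structure described above can be sketched as a Python dictionary that is serialized to JSON. All values below are hypothetical and not taken from the repository: the folder name `woff2`, the variant name `new`, the paths, and the compiler arguments are assumptions for illustration, as is the mapping of "none" to JSON `null`. Existing config.json files in dockerfiles/programs/ are the authoritative reference.

```python
import json
from typing import Dict, List, Set

# Hypothetical content of dockerfiles/programs/woff2/config.json.
config: Dict[str, dict] = {
    "new": {
        "bc_compile_args": [
            {"val": "-I", "action": None},
            {"val": "include", "action": "prefix_workdir"},
        ],
        "bin_compile_args": [],
        "is_cpp": True,
        "dict": None,
        "orig_bin": "out/fuzz_target",
        "orig_bc": "out/fuzz_target.bc",
        "omit_functions": ["LLVMFuzzerTestOneInput"],
    }
}

EXPECTED_KEYS = {
    "bc_compile_args", "bin_compile_args", "is_cpp", "dict",
    "orig_bin", "orig_bc", "omit_functions",
}


def variant_names(folder_name: str, program_config: Dict[str, dict]) -> List[str]:
    """Build prog variant names as <folder name> + "_" + <variant name>."""
    return [f"{folder_name}_{variant}" for variant in program_config]


def missing_keys(variant_config: dict) -> Set[str]:
    """Return the expected parameters that a variant does not define."""
    return EXPECTED_KEYS - set(variant_config)


print(json.dumps(config, indent=2))
print(variant_names("woff2", config))
```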
The file mutation_doc.json contains documentation about the different mutation types implemented in this project as a machine-readable JSON file.
Changing mutations requires additional setup shown below.
# use the python script mutator-docker-wrapper.py
# first build the docker containers
python3 mutator-docker-wrapper.py -b
# then connect to the container
python3 mutator-docker-wrapper.py -a
# [optional] consult the script's help output for rebuilding and similar
# (will be faster than the initial build)
python3 mutator-docker-wrapper.py -h
# run from root directory
# llvm installation on Mac:
brew install llvm@11
# llvm installation on linux:
echo deb http://apt.llvm.org/focal/ llvm-toolchain-focal-11 main >> /etc/apt/sources.list && \
wget -O - https://apt.llvm.org/llvm-snapshot.gpg.key | apt-key add -
apt-get update && apt-get upgrade -y
apt-get install -y gcc-10 g++-10 gcc-10-plugin-dev gcc-10-multilib \
libc++-10-dev gdb lcov
apt-get install -y clang-11 clang-tools-11 libc++1-11 libc++-11-dev \
libc++abi1-11 libc++abi-11-dev libclang1-11 libclang-11-dev \
libclang-common-11-dev libclang-cpp11 libclang-cpp11-dev liblld-11 \
liblld-11-dev liblldb-11 liblldb-11-dev libllvm11 libomp-11-dev \
libomp5-11 lld-11 lldb-11 llvm-11 llvm-11-dev llvm-11-runtime llvm-11-tools
update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-10 0
update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-10 0
rm -rf /var/cache/apt/archives/*
ln /bin/clang-11 /bin/clang
ln /bin/clang++-11 /bin/clang++
ln /bin/opt-11 /bin/opt
ln /bin/llvm-config-11 /bin/llvm-config
ln /bin/llvm-ar-11 /bin/llvm-ar
# compiles the project and installs it under build/install
export LD_LIBRARY_PATH=${mutation_tool_home}/build/install/LLVM_Mutation_Tool/lib/
gradle build
General invoke:
# invokes the mutation pipeline; see the explanation of the arguments below for details
./run_mutation.py [-bc] [-ll] [-bn] [-cpp] [-m <int>] [-ml <int>*] <path-to-subject.c> (e.g. ./samples/simple_malloc/malloc.c)
Explanation of optional arguments (at least one must be chosen, otherwise the output would be empty):
-bc: Keeps the mutated bitcode files.
-ll: Keeps the mutated bitcode files in human-readable form.
-bn: Generates runnable binaries if possible.
-m <int>: Defines the mutation ID that should be applied, -1 for all mutations, -2 or left out to just generate the mutationlocations file and a binary that prints covered mutations to the defined folder.
-ml <int>*: Defines a list of mutation IDs that should be applied at once, left out to just generate the mutationlocations file and a binary that prints covered mutations to the defined folder.
-cpp: Uses clang++ for compilation instead of clang.
If a new mutant is generated, all already generated mutants are deleted!
The mutated programs have code injected which prints if the mutation was triggered. It is possible to influence the output by defining environment variables:
export TRIGGERED_OUTPUT="Some output." # can contain any string which will be printed to the command line
# and written to the file TRIGGERED_FILE if it is defined and could be created; a default value is printed if
# this environment variable is not defined
export TRIGGERED_FOLDER="some/file/to/report/mutation/trigger_signal" # a path to a folder which will for each mutation UID
# contain a file which indicates that the mutation was triggered. If not defined the files will be written to ./triggered_signal
Also, if one wants to test which mutations could be covered, a binary is generated when producing the mutants which will print all covered mutation locations. Invoke the binary *.opt_mutate (e.g. samples/simple_malloc/malloc.c.opt_mutate) in the respective subject folder; it will print all locations found during the run as explained above.
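After a run, the trigger folder can be inspected to see which mutations were reached. The sketch below lists the files found there; it assumes that each file name identifies one mutation UID, which is not confirmed by the description above, and the function `triggered_mutations` is hypothetical.

```python
import os
from pathlib import Path
from typing import List


def triggered_mutations(folder: Path) -> List[str]:
    """List the files in the trigger folder, one per triggered mutation."""
    if not folder.is_dir():
        # Nothing was triggered yet or the folder was never created.
        return []
    return sorted(entry.name for entry in folder.iterdir() if entry.is_file())


# Falls back to the default location named above when the variable is unset.
folder = Path(os.environ.get("TRIGGERED_FOLDER", "./triggered_signal"))
print(triggered_mutations(folder))
```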
# run in root directory to delete all build products
# can also be used if the build does not correctly update or another strange behavior is observed
gradle clean
# run in root directory to delete all generated mutation files in samples
sh clean_mutations_in_samples.sh
If you want to cite this project, we would recommend the following bibtex entry:
@inproceedings{goerz2023muafuzzerbenchmark,
location = {Anaheim, {CA}, {USA}},
title = {Systematic Assessment of Fuzzers using Mutation Analysis},
url = {https://www.usenix.org/conference/usenixsecurity23/presentation/gorz},
booktitle = usenix-security,
author = {Philipp Görz and Björn Mathis and Keno Hassler and Emre Güler and Thorsten Holz and Andreas Zeller and Rahul Gopinath},
date = {2023-08}
}
mua-fuzzer-benchmark is distributed under the terms of the MIT license.