mishegos

A differential fuzzer for x86 decoders.

Read more about mishegos in its accompanying blog post and academic publication (paper | recording | slides).

@InProceedings{woodruff21differential,
  author       = "William Woodruff and Niki Carroll and Sebastiaan Peters",
  title        = "Differential analysis of x86-64 instruction decoders",
  booktitle    = "Proceedings of the Seventh Language-Theoretic Security Workshop~({LangSec}) at the {IEEE} Symposium on Security and Privacy",
  year         = "2021",
  month        = "May"
}

Usage

Start with a clone, including submodules:

git clone --recurse-submodules https://github.com/trailofbits/mishegos

Building

mishegos is most easily built within Docker:

docker build -t mishegos .

Alternatively, you can try building it directly.

Make sure you have binutils-dev (or however your system provides libopcodes) installed, then run:

make
# or
make debug

Build specific workers by passing a space-delimited list as the WORKERS variable:

WORKERS="bfd capstone" make worker

Running

Run the fuzzer for a bit:

./src/mishegos/mishegos ./workers.spec > /tmp/mishegos

mishegos checks for the following environment variables:

V=1 enables verbose output on stderr
D=1 enables the "dummy" mutation mode for debugging purposes
M=1 enables the "manual" mutation mode (i.e., read from stdin)
MODE=mode can be used to configure the mutation mode in the absence of D and M
- Valid mutation modes are sliding (default), havoc, and structured

Convert mishegos's raw output into JSONL suitable for analysis:

./src/mish2jsonl/mish2jsonl /tmp/mishegos > /tmp/mishegos.jsonl

mish2jsonl checks for V=1 to enable verbose output on stderr.
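Before running an analysis pass, it can help to sanity-check the converted file. The following is a minimal Python sketch, not part of mishegos: it assumes only that each non-empty line of the file is one JSON value, and it reports whichever top-level keys appear instead of assuming any field names. The helper name `summarize_jsonl` is hypothetical.

```python
import json
from collections import Counter


def summarize_jsonl(lines):
    """Return (record_count, Counter of top-level keys) for JSONL lines.

    Hypothetical helper: it makes no assumption about mishegos's schema and
    only counts the keys that actually appear in object records.
    """
    count = 0
    keys = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        count += 1
        if isinstance(record, dict):
            keys.update(record.keys())
    return count, keys


# Example usage against the file produced above:
#
#     with open("/tmp/mishegos.jsonl") as fh:
#         count, keys = summarize_jsonl(fh)
#     print(count, dict(keys))
```

A record count of zero, or keys that vary from line to line, suggests the fuzzer run or the conversion did not finish cleanly.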

Run an analysis/filter pass group on the results:

./src/analysis/analysis -p same-size-different-decodings < /tmp/mishegos.jsonl > /tmp/mishegos.interesting

Generate an ~~ugly~~ pretty visualization of the filtered results:

./src/mishmat/mishmat < /tmp/mishegos.interesting > /tmp/mishegos.html
open /tmp/mishegos.html

Tip: The HTML file that mishmat generates can be hundreds of megabytes in size, which will likely make it slow or unusable in a browser. Using the split tool, you can create multiple smaller HTML files with a fixed number of entries per file (10,000 in the following example) and load each of them separately:

mkdir /tmp/mishegos-html
split -d --lines=10000 - /tmp/mishegos-html/mishegos_ \
    --additional-suffix='.html' --filter='./src/mishmat/mishmat > $FILE' \
    < /tmp/mishegos.interesting
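The split invocation above relies on GNU split options (--filter, --additional-suffix). Where those are unavailable, the same chunking can be sketched in Python. This is an illustrative sketch, not part of mishegos: the function names `chunked` and `render_chunks` are hypothetical, and it assumes only what the commands above show, namely that mishmat reads entries on stdin, one per line, and writes HTML to stdout.

```python
import itertools
import subprocess
from pathlib import Path


def chunked(lines, size):
    """Yield lists of at most `size` items from an iterable, in order."""
    iterator = iter(lines)
    while True:
        chunk = list(itertools.islice(iterator, size))
        if not chunk:
            return
        yield chunk


def render_chunks(input_path, out_dir, size=10_000,
                  mishmat="./src/mishmat/mishmat"):
    """Feed each chunk of `input_path` to mishmat, one HTML file per chunk.

    Output names mirror the split example: mishegos_00.html, mishegos_01.html, ...
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with open(input_path, "rb") as fh:
        for index, chunk in enumerate(chunked(fh, size)):
            target = out / f"mishegos_{index:02d}.html"
            with open(target, "wb") as html:
                # mishmat is assumed to read stdin and write stdout, as above.
                subprocess.run([mishmat], input=b"".join(chunk),
                               stdout=html, check=True)


# Example usage:
#
#     render_chunks("/tmp/mishegos.interesting", "/tmp/mishegos-html")
```

Each chunk is held in memory while mishmat runs, so a smaller `size` also bounds memory use.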

Contributing

We welcome contributors to mishegos!

A guide for adding new disassembler workers can be found here.

Performance notes

All numbers below correspond to the following run:

V=1 timeout 60s ./src/mishegos/mishegos ./workers.spec > /tmp/mishegos

Outside Docker:

On a Linux desktop (Ubuntu 20.04, Ryzen 5 3600, 32GB DDR4):
- Commit d80063a
- 8 workers (no udis86) + 1 mishegos fuzzer process
- 8.7M outputs/minute
- 9 cores pinned

TODO

Performance improvements
- Break cohort collection out into a separate process (requires re-addition of semaphores)
- Maybe use a better data structure for input/output/cohort slots
Add a scaling factor for workers, e.g. spawn N of each worker
Pre-analysis normalization (whitespace, immediate representation, prefixes)
Analysis strategies:
- Filter by length, decode status discrepancies
- Easy: lexical comparison
- Easy: reassembly + effects modeling (maybe with microx?)
Scoring ideas:
- Low value: Flag/prefix discrepancies
- Medium value: Decode success/failure/crash discrepancies
- High value: Decode discrepancies with differing control flow, operands, maybe some immediates
Visualization ideas:
- Basic but not really basic: some kind of mouse-over differential visualization

License

mishegos is licensed and distributed under the Apache v2.0 license. Contact us if you’re looking for an exception to the terms.
