Original README of go-fuzz has been renamed to README.go-fuzz.md

Ti-fuzz: Fuzz TiDB!

Usage

Prerequisites

  1. Install deps
    # mysqld is required, version >= 8
    apt install mysql-server
  2. Get source trees
    # To allow `go get` to access private repos
    # WARNING: access permission to PingCAP-QE is required
    go env -w GOPRIVATE=github.com/pragmatwice/go-squirrel
    git config --global url.git@github.com:.insteadOf https://github.com/
    # Clone modified TiDB that supports fuzzing
    git clone git@github.com:oraluben/tidb.git
    # Clone Ti Fuzz
    git clone git@github.com:oraluben/go-fuzz.git

Build

  1. Build Ti Fuzz tools
    cd <oraluben/go-fuzz>
    go build -o ti-fuzz-build ./go-fuzz-build
    go build -o ti-fuzz-main ./go-fuzz
    
  2. Build the instrumented TiDB
    cd <oraluben/tidb>/tidb-server/fuzz
    <oraluben/go-fuzz/ti-fuzz-build> -o tidb-fuzz.zip
    

Config

There are two sql files required for fuzzing TiDB:

  1. Initial SQL file: (example)
    This SQL file initializes the fuzzing run and serves as the first seed of the mutation procedure. It usually consists of DDL statements such as CREATE TABLE followed by DML statements such as INSERT. The last statement of the file must be a SELECT or a set operation statement such as UNION.
    (Currently only SELECT and set operation statements can be mutated; WITH clauses are also supported.)
  2. Mutation Library SQL file: (example)
    This SQL file controls what mutations can produce: any node of a statement in this file can be selected to replace an existing node in a seed.

More examples can be found in every test directory of ti-fuzz-corpus.
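
The rule that the initial SQL file must end with a SELECT or set operation statement can be checked before starting a long run. The following is a minimal sketch, not part of Ti Fuzz: the helper names lastStatement and endsWithQuery and the sample schema are hypothetical, and the script is split naively on ';', which assumes no semicolons inside string literals or comments.

```go
package main

import (
	"fmt"
	"strings"
)

// lastStatement returns the final non-empty statement of a SQL script.
// It splits naively on ';' (an assumption of this sketch).
func lastStatement(script string) string {
	parts := strings.Split(script, ";")
	for i := len(parts) - 1; i >= 0; i-- {
		if s := strings.TrimSpace(parts[i]); s != "" {
			return s
		}
	}
	return ""
}

// endsWithQuery reports whether the last statement starts like a SELECT,
// a WITH clause, or a parenthesized query (as used in set operations).
func endsWithQuery(script string) bool {
	s := strings.ToUpper(lastStatement(script))
	return strings.HasPrefix(s, "SELECT") ||
		strings.HasPrefix(s, "WITH") ||
		strings.HasPrefix(s, "(")
}

func main() {
	// Hypothetical initial SQL: DDL, then DML, then the query to mutate.
	initSQL := `
CREATE TABLE t (a INT, b VARCHAR(10));
INSERT INTO t VALUES (1, 'x'), (2, 'y');
SELECT a FROM t UNION SELECT a + 1 FROM t;
`
	fmt.Println(endsWithQuery(initSQL))
}
```

A prefix check like this only catches the most obvious mistake (a script that ends with an INSERT or a DDL statement); it does not validate the SQL itself.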

Run

Usage of ti-fuzz-main:
  -bin string
        test binary built with go-fuzz-build
  -connectiontimeout duration
        time limit for worker to try to connect coordinator (default 1m0s)
  -coordinator string
        coordinator mode (value is coordinator address)
  -covercounters
        use coverage hit counters (default true)
  -dumpcover
        dump coverage profile into workdir
  -dup
        collect duplicate crashers
  -func string
        function to fuzz
  -http string
        HTTP server listen address (coordinator mode only)
  -init-path string
        initial SQL file path, include ddl (create) and dml (insert, select) to initialize seed
  -lib-path string
        library SQL file path, include select statement to be mutated to
  -minimize duration
        time limit for input minimization (default 1m0s)
  -procs int
        parallelism level (default 1)
  -rm
        remove TiDB and MySQL data dir after shutdown instance
  -sonar
        use sonar hints
  -timeout int
        test timeout, in seconds (default 10)
  -v int
        verbosity level
  -workdir string
        dir with persistent work data (default ".")
  -worker string
        worker mode (value is coordinator address)

For basic usage, run ti-fuzz-main -init-path init.sql -lib-path lib.sql -rm -bin tidb-fuzz.zip in a directory that contains the instrumented TiDB archive generated by ti-fuzz-build (tidb-fuzz.zip), the initial SQL file and the mutation library SQL file.

Once started, ti-fuzz-main first creates a TiDB instance and a MySQL instance, then executes the initial SQL on both. It then repeatedly mutates the SELECT statement into new statements under coverage guidance, compares the results returned by the two instances and reports any differences. Every few seconds, Ti Fuzz prints a log line to stderr of the form:

2021/06/28 10:00:00 workers: 1, corpus: 233 (1min ago), crashers: 3,
     restarts: 1/8027, execs: 120919 (214/sec), cover: 182746, uptime: 10m39s

Here, workers is the number of tests running in parallel (set with the -procs flag). corpus is the current number of interesting inputs the fuzzer has discovered; the time in brackets says when the last interesting input was discovered. crashers is the number of discovered bugs (check out the ./crashers dir). restarts is the rate at which the fuzzer restarts test processes. execs is the total number of query executions, and the number in brackets is the average execution speed. cover is the number of bits set in a hashed coverage bitmap; if this number grows, the fuzzer is uncovering new lines of code. Finally, uptime is the uptime of the process.
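
To track these counters over a long run, the status line can be parsed from the captured stderr. The following is a minimal sketch that assumes the log format shown above stays unchanged; the function name parseStatus is hypothetical and only the plain integer counters are extracted.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// counterRe matches the integer counters of a status line by name.
var counterRe = regexp.MustCompile(`(workers|corpus|crashers|execs|cover): (\d+)`)

// parseStatus extracts the named integer counters from one status line.
// The line may be wrapped, since each counter is matched independently.
func parseStatus(line string) map[string]int {
	stats := make(map[string]int)
	for _, m := range counterRe.FindAllStringSubmatch(line, -1) {
		n, err := strconv.Atoi(m[2])
		if err != nil {
			continue // skip values that do not fit in an int
		}
		stats[m[1]] = n
	}
	return stats
}

func main() {
	line := "2021/06/28 10:00:00 workers: 1, corpus: 233 (1min ago), crashers: 3,\n" +
		"     restarts: 1/8027, execs: 120919 (214/sec), cover: 182746, uptime: 10m39s"
	s := parseStatus(line)
	fmt.Println(s["corpus"], s["crashers"], s["execs"], s["cover"])
}
```

Comparing cover between successive lines gives a rough signal of whether the run is still finding new code paths.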

Ti Fuzz runs until its process receives SIGINT or another terminating signal. A run of about 12 hours is generally sufficient for most cases.

Analyze results

Some directories will be generated during execution of Ti Fuzz:

  • corpus: a.k.a. seeds (one per file) generated by Ti Fuzz
  • crashers: generated when TiDB crashes or when query results differ between TiDB and MySQL
    • <hash>: the input SQL of this crasher
    • <hash>.output: the output (panic message of TiDB) of this crasher
  • suppressions: used to filter out duplicate errors

Ti Fuzz reports several types of results (found in crashers/*.output):

  • Both Error: TiDB and MySQL both return an error for the same input, but the errors are not the same.
    panic: [both err] tidb: ..., mysql: ...
  • One Side Error: Either TiDB or MySQL returns an error, while the other instance returns a (non-error) result.
    panic: [one side err] tidb: ..., mysql: ...
  • Metainfo Difference: TiDB and MySQL both return a non-error result, but the two results differ in column types (n=1), column names (n=2), number of rows (n=3) or number of columns (n=4).
    panic: [metainfo diff (n)] tidb: ..., mysql: ...
  • Data Difference: TiDB and MySQL both return a non-error result with no metainfo difference, but specific values in the results differ.
    panic: [data diff]: tidb: ..., mysql: ...
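
When a run produces many crashers, it can help to group the .output files by the tag in the panic line. The following is a minimal sketch based only on the four message prefixes listed above; the function name classify and the category labels are hypothetical, and any output without a recognized tag (for example a plain TiDB panic) is reported as "other".

```go
package main

import (
	"fmt"
	"strings"
)

// classify maps the first "panic: [...]" tag found in a crasher output
// to one of the result types described above.
func classify(output string) string {
	const marker = "panic: ["
	i := strings.Index(output, marker)
	if i < 0 {
		return "other"
	}
	rest := output[i+len(marker):]
	j := strings.Index(rest, "]")
	if j < 0 {
		return "other"
	}
	tag := rest[:j]
	switch {
	case tag == "both err":
		return "both error"
	case tag == "one side err":
		return "one side error"
	case strings.HasPrefix(tag, "metainfo diff"):
		return "metainfo difference"
	case tag == "data diff":
		return "data difference"
	}
	return "other"
}

func main() {
	// Message bodies are elided here, as in the list above.
	samples := []string{
		"panic: [both err] tidb: ..., mysql: ...",
		"panic: [metainfo diff (3)] tidb: ..., mysql: ...",
		"panic: [data diff]: tidb: ..., mysql: ...",
	}
	for _, s := range samples {
		fmt.Println(classify(s))
	}
}
```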

The SQL statements generated by Ti Fuzz can be complex, containing many subqueries and join clauses. You can use our reduce tool to simplify these statements when reproducing bugs or analyzing root causes.

Dump coverage & Visualization

Running go-fuzz ... -dumpcover generates a coverprofile file in the workdir, which can be rendered as HTML with go tool cover -html=coverprofile.

Note: go-fuzz does not always generate a coverage file that go tool cover accepts. You might need to run sed -i '' -e '/0.0,1.1/d' coverprofile (macOS) or sed -i '/0.0,1.1/d' coverprofile (Linux) before generating the HTML coverage report.
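
Where sed is not available, the same cleanup can be done in Go. The following is a minimal sketch that mirrors the sed pattern above (including the fact that '.' matches any character); the function name dropInvalid and the sample profile lines are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// badRange is the same pattern passed to sed in the note above.
var badRange = regexp.MustCompile(`0.0,1.1`)

// dropInvalid removes every line of a cover profile that matches badRange
// and leaves all other lines, including a trailing newline, untouched.
func dropInvalid(profile string) string {
	var kept []string
	for _, line := range strings.Split(profile, "\n") {
		if !badRange.MatchString(line) {
			kept = append(kept, line)
		}
	}
	return strings.Join(kept, "\n")
}

func main() {
	// Hypothetical profile content for illustration only.
	profile := "mode: set\n" +
		"a.go:0.0,1.1 1 0\n" +
		"b.go:3.1,4.2 1 1\n"
	fmt.Print(dropInvalid(profile))
}
```

In practice the profile would be read with os.ReadFile and the result written back with os.WriteFile before running go tool cover.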

Q&A

  1. How to run faster?
    Run in multi-process mode with -procs N, where N is the number of processes.

  2. Can I temporarily stop Ti Fuzz and resume it after a while?
    Yes! All generated seeds are saved in ./corpus, so the next execution quickly recovers the previous coverage.

  3. Why does Ti Fuzz report testee crashed on ddl?
    Check your initial SQL file: the crash occurred while executing one of its statements.

  4. Why does Ti Fuzz report error while mutating ...?
    Something is wrong in the mutator. Please report this error to us!