Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fuzzy tests #61

Closed
matklad opened this issue Sep 8, 2018 · 24 comments
Closed

Fuzzy tests #61

matklad opened this issue Sep 8, 2018 · 24 comments
Labels
E-medium fun A technically challenging issue with high impact

Comments

@matklad
Copy link
Member

matklad commented Sep 8, 2018

There are several things that parser self-checks for during parsing:

  • it might get stuck during error recovery code
  • Marker might be dropped without completing or abandoning code
  • It might produce a tree with invalid block structure code

We need to fuzz/property test the parser to verify that this does not happen in practice.

To that end, we need to implement:

  • fuzzing with arbitrary &str inputs: will be most useful for lexer probably?
  • fuzzing with arbitrary (invalid) Rust code. Arb Rust code might be acquired by taking a sample of real Rust code and applying random edits to it.
@matklad matklad added help wanted E-medium fun A technically challenging issue with high impact labels Sep 8, 2018
killercup added a commit to killercup/rust-analyzer that referenced this issue Sep 8, 2018
As described in rust-lang#61, fuzz testing some parts of this would be ~~fun~~
helpful. So, I started with the most trivial fuzzer I could think of:
Put random stuff into File::parse and see what happens.

To speed things up, I also did

    cp src/**/*.rs fuzz/corpus/parser/

in the `crates/libsyntax2/` directory (running the fuzzer once will
generate the necessary directories).
@killercup
Copy link
Member

Made a small PR with the most trivial fuzzer: #63. With the rs files from the libsyntax2 crate as corpus it quickly goes up to a coverage of 20199 (probably code paths but I don't know libfuzzer well enough; it's a high number compared to other fuzz targets, though).

Aaaaan it even found a crash! https://gist.github.com/killercup/a0be6701333a635de9dec6cffc89aedb

@killercup
Copy link
Member

Two other interesting approaches: vegard/prog-fuzz@c80b1a7 and creating edits from commits

@matklad
Copy link
Member Author

matklad commented Sep 8, 2018

Aaaaan it even found a crash!

That's a good sign! Today I've made sure that parser deals with rust-lang/rust, with fail-tests and all, so this failure must be non-trivial.

I'll write a fix shortly with a description of how to fix these things: I expect this won't be the sole fuzzing error :)

With the rs files from the libsyntax2 crate as corpus it quickly goes up to a coverage of 20199

As a bonus objective, I think it makes sense to add cargo gen-fuzz-corpus path-to-dir-with-sources command to tools, so that we can easily generate a corpus on CI using libsynax sources.

@killercup
Copy link
Member

Oh yes it most definitely is a non-trivial failure, it took the fuzzer almost 3 minutes to find – compared to most other projects where the first failures show up after 3 seconds! ;)

This is just the very first stab at things! I think I'll have a few hours to kill waiting at an airport tomorrow and might just integrate this with you tools crate and also throw other fuzzers at it. I've been meaning to make fuzzer definitions more generic anyway. And maybe after I've understood how to use libsyntax2 I can also use it to write a tool to auto-generate fuzzers for all functions that take a &[u8] or &str as input!

@matklad
Copy link
Member Author

matklad commented Sep 8, 2018

@killercup is it running the parser with debug-assertions on? One check ({} block validity) is predicated on them. Ideally I think it should use something like --release + debug_assertion ?

And maybe after I've understood how to use libsyntax2 I can also use it to write a tool to auto-generate fuzzers for all functions that take a &[u8] or &str as input!

Yep, I think libsyntax2, even in its current state, might be a good tool for the job (although right now syn might be better though?) Like, in theory you could even make an intention out of it, so that you place the cursor over the function in the editor, press ctrl+., and the fuzzer is auto-generated :) That's probably too special case to be an intention, but it's a fun possibility, and, super long term, custom intentions project-specific intentions are definitely on the roadmap

bors bot added a commit that referenced this issue Sep 8, 2018
63: Add trivial fuzzer for parser r=matklad a=killercup

As described in #61, fuzz testing some parts of this would be ~~fun~~
helpful. So, I started with the most trivial fuzzer I could think of:
Put random stuff into File::parse and see what happens.

To speed things up, I also did

    cp src/**/*.rs fuzz/corpus/parser/

in the `crates/libsyntax2/` directory (running the fuzzer once will
generate the necessary directories).

Co-authored-by: Pascal Hertleif <killercup@gmail.com>
@matklad
Copy link
Member Author

matklad commented Sep 8, 2018

Oh, one more useful thing to do would be to add a short note to readme, explaning how to actually run this thing :)

@matklad
Copy link
Member Author

matklad commented Sep 8, 2018

Ah, the delicious taste of dog food: I've copy-pasted your example as a test, and of course the plugin is now half-broken, because the indexing thread panics :D

@matklad
Copy link
Member Author

matklad commented Sep 8, 2018

Added a fix with a long explanation of what actually is going on: #64 (second commit msg).

If your fuzzers find more errors like these, you now know how to fix 'em!

@killercup
Copy link
Member

Wow, a5c333c has a beautiful commit message!

I have indeed found another crash (output). I'll see if I can fix it later this afternoon!

@killercup
Copy link
Member

killercup commented Sep 9, 2018

is it running the parser with debug-assertions on?

Totally forgot to answer this one! Yes:

     Running `fuzz/target/x86_64-apple-darwin/debug/parser -artifact_prefix=/Users/pascal/Projekte/tools/libsyntax2/crates/libsyntax2/fuzz/artifacts/parser/ fuzz/artifacts/parser/`

And good point. I'll also run it in release to see if there is a unknown dependency on this somewhere (or a things that fails differently without overflow checks)!

Edit: Yes, it also crashes with --release (also atom_pat -> err_recover -> at -> nth). (For added fun, the fuzzer changed a whole bunch of bytes to \x00, so I rcat --quoted the example input to be usable as a rust binary string.)

@matklad
Copy link
Member Author

matklad commented Sep 9, 2018

. I'll also run it in release to see if there is a unknown dependency on this somewhere (or a things that fails differently without overflow checks)!

I've refactored code a bit, so that for fuzzing we always do that block check, even in release mode:
ba4a697

@killercup
Copy link
Member

killercup commented Sep 9, 2018 via email

@matklad
Copy link
Member Author

matklad commented Sep 9, 2018

I have indeed found another crash

BTW, I am seeing the same match_arm_list in the trace (https://gist.github.com/killercup/254c98a31d921972ecc88e2c3c35ecad#file-gistfile1-txt-L28), so I suspect it might be the same crash.

@YaLTeR
Copy link

YaLTeR commented Sep 12, 2018

Ran the fuzzer and found this: https://gist.github.com/YaLTeR/2d0c734a0cca830e136a59af3891f0b2

EDIT: and another one, although this contains match_arm_list so it might be the same as the other crash in this thread: https://gist.github.com/YaLTeR/16bc702d06a76f8192bcb2e17d9cd0ae

@matklad
Copy link
Member Author

matklad commented Sep 12, 2018

Thanks! First one fixed in b6f8037. Letme take a look at the second one...

EDIT: the second one is actually the same. It's not an match_arm_list, it's a slice_pat though. Refactored code a bit to also have pat_list in that case.

@DJMcNab
Copy link
Contributor

DJMcNab commented Dec 31, 2018

I think the current action on this is just to run these every few weeks for a little bit to see if there's any code which can trigger a panic. I wonder if there's any neat way to remind us of this automatically - any ideas?

@matklad
Copy link
Member Author

matklad commented Dec 31, 2018

I wonder if there's any neat way to remind us of this automatically - any ideas?

A travis cron job?

@matklad
Copy link
Member Author

matklad commented Dec 31, 2018

So yeah, there are two bits I want to add to the current setup to close the issue

  • add a cargo fuzz-tests task (which should trick me to finally run this on my machine :) )
  • add an automated way to run fuzzing on CI. A daily cron job to run fuzzing for half an hour seems ideal.

A further improvement would be a high-level fuzzing, on the ra_analysis level: creating an (in-memory) rust project and running random diagnostics, completions and edits on it, but that is definitelly a separate issue.

bors bot added a commit that referenced this issue Dec 31, 2018
393: Add a fuzzing subcommand r=matklad a=DJMcNab

Part of #61 (comment).

Co-authored-by: DJMcNab <36049421+djmcnab@users.noreply.github.com>
@matklad
Copy link
Member Author

matklad commented Jan 1, 2019

Only cron job is left. Here are the docs: https://docs.travis-ci.com/user/cron-jobs/.

@memoryruins
Copy link
Contributor

Left it running overnight; no crashes yet :)

#6248625        REDUCE cov: 28690 ft: 163494 corp: 4204/91Kb lim: 110 exec/s: 234 rss: 605Mb L: 9/108 MS: 1 EraseBytes-
#6251047        REDUCE cov: 28690 ft: 163494 corp: 4204/91Kb lim: 110 exec/s: 234 rss: 605Mb L: 67/108 MS: 2 ShuffleBytes-EraseBytes-
#6251189        REDUCE cov: 28690 ft: 163494 corp: 4204/91Kb lim: 110 exec/s: 234 rss: 605Mb L: 37/108 MS: 2 ChangeBinInt-EraseBytes-
#6251406        NEW    cov: 28690 ft: 163503 corp: 4205/91Kb lim: 110 exec/s: 234 rss: 605Mb L: 109/109 MS: 2 InsertByte-CopyPart-
#6253052        REDUCE cov: 28690 ft: 163503 corp: 4205/91Kb lim: 110 exec/s: 234 rss: 605Mb L: 28/109 MS: 1 EraseBytes- 

@skade
Copy link
Contributor

skade commented Apr 14, 2020

Hm, travis cron jobs are not really meant for long-running tasks. Is there a maximum time we want to run this?

I would recommend a service such as oss-fuzz:
https://google.github.io/oss-fuzz/
https://google.github.io/oss-fuzz/getting-started/accepting-new-projects/

We'd need to apply, though.

@matklad
Copy link
Member Author

matklad commented Apr 15, 2020

Is there a maximum time we want to run this?

Yeah, that's what I had in mind: limit fuzzing time by, say, five minutes every nigh. We've recently added some extra time consuming checks on nightly.

We'd need to apply, though.

My gut feeling is that, at this stage, fuzzing would add relatively little benefit. So I'd rather avoid integrating with services besides the ones we already have.

@matklad
Copy link
Member Author

matklad commented Jul 15, 2020

we have some fuzzing tests set up, we don't have automated fuzzing on CI, but, given the number of known bugs, I don't think it's worthwhile to invest time into this at this point.

@matthiaskrgr
Copy link
Member

Just for the record, I have fed rust-lang/glacier and rust-lang/rust + around 30% of random files contained in the history of rust-lang/rust into rust-analyzer highlight, and filtered out a couple of crashes:

#16288
#16287
#16286
#16284
#16283
#16282
#16281
#16280
#16278

I can do mutation-based fuzzing based on randomly mutated (using tree-splicer) rust files using https://github.com/matthiaskrgr/icemaker/ if anyone is interested :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
E-medium fun A technically challenging issue with high impact
Projects
None yet
Development

No branches or pull requests

7 participants