Tracking issue for fuzz tests #3174

Open
10 of 16 tasks
WenyXu opened this issue Jan 16, 2024 · 1 comment
@WenyXu
Member

WenyXu commented Jan 16, 2024

What problem does the new feature solve?

Introduce fuzz tests and related utilities.

What does the feature do?

The utilities focus on randomly generating SQL input and verifying the database's output. They should integrate easily with other integration tests (e.g., after the fuzz tests run, region migration should still work; or after the DB recovers from chaos-induced failures, the fuzz tests should still pass).
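To make the idea concrete, here is a minimal sketch of what a seeded, grammar-based random SQL generator could look like. Everything here (the toy schema, `gen_insert`, `gen_select`, `generate`) is hypothetical illustration, not GreptimeDB's actual fuzz utilities:

```python
import random

# Hypothetical toy schema; a real utility would derive this from the
# database's actual table definitions.
TABLES = {"metrics": ["host", "cpu", "memory"]}

def gen_value(rng: random.Random) -> str:
    return str(rng.randint(0, 100))

def gen_insert(rng: random.Random) -> str:
    table = rng.choice(list(TABLES))
    cols = TABLES[table]
    vals = ", ".join(gen_value(rng) for _ in cols)
    return f"INSERT INTO {table} ({', '.join(cols)}) VALUES ({vals});"

def gen_select(rng: random.Random) -> str:
    table = rng.choice(list(TABLES))
    col = rng.choice(TABLES[table])
    return f"SELECT {col} FROM {table} WHERE {col} > {gen_value(rng)};"

def generate(seed: int, n: int) -> list[str]:
    """Seeded generation keeps any failing input reproducible."""
    rng = random.Random(seed)
    return [rng.choice([gen_insert, gen_select])(rng) for _ in range(n)]
```

Seeding the RNG is the key design choice: when a generated statement triggers a failure, the same seed regenerates the exact input for debugging.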

Other engines/scenarios

Implementation challenges

No response

@PragmaTwice

PragmaTwice commented Apr 21, 2024

Having seen GreptimeDB uncover code issues through fuzzing, I am intrigued. With some prior experience in fuzzing, I have been considering integrating fuzz tests into Kvrocks. Here are my questions:

  • Fuzzing is essentially a vast search problem. Many fuzzers find code issues by randomly generating inputs that satisfy certain constraints, as csmith and sqlsmith do. But not all randomly generated inputs are equally meaningful; some are worth exploring further than others. Consequently, some fuzzing tools guide input generation with metrics such as changes in code coverage: the idea is that if a newly generated input reaches previously unexecuted code, it is likely more valuable. This is commonly called coverage-guided mutation-based fuzzing, exemplified by AFL++ and libFuzzer. Will GreptimeDB adopt this approach?

  • Another focus of fuzzing is finding reliable test oracles. Many fuzzers use differential testing: for example, feeding the same inputs to MySQL and MariaDB and comparing their outputs. An expected challenge, however, is that implementation differences across DBMSs can produce many false positives (even between MySQL and databases that claim MySQL compatibility). Addressing this may require substantial engineering effort, or novel workarounds such as the one sqlancer takes. What strategy does GreptimeDB intend to pursue here?

  • A fuzzer can surface a huge number of errors, but distinguishing genuine bugs from false positives in a large volume of fuzzer output is hard. Moreover, many of the reported errors may share a single root cause; manually sifting through tens of thousands of outputs to pinpoint dozens of distinct bugs would be arduous. Here we may need additional tools for deduplicating reports and simplifying the long SQL inputs these utilities produce. Does GreptimeDB have any particular approach to this problem?
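The coverage-guided mutation-based fuzzing mentioned in the first question can be sketched in a few lines. This is a toy model of the AFL++/libFuzzer feedback loop, not a real instrumented harness: `run_target` is a stand-in for an instrumented system under test that reports which branches an input exercised, and a mutated input is kept in the corpus only when it reaches coverage not seen before.

```python
import random

def run_target(data: bytes) -> set:
    """Hypothetical instrumented target: returns the set of branches hit."""
    cov = {"start"}
    if data.startswith(b"SEL"):
        cov.add("kw_select")
        if b";" in data:
            cov.add("terminated")
    return cov

def mutate(data: bytes, rng: random.Random) -> bytes:
    """Simplest mutation: flip one byte (real fuzzers use many strategies)."""
    if not data:
        return bytes([rng.randrange(256)])
    i = rng.randrange(len(data))
    return data[:i] + bytes([rng.randrange(256)]) + data[i + 1:]

def fuzz(seed_inputs, iterations=2000, seed=0):
    rng = random.Random(seed)
    corpus = list(seed_inputs)
    seen = set()
    for inp in corpus:
        seen |= run_target(inp)
    for _ in range(iterations):
        cand = mutate(rng.choice(corpus), rng)
        cov = run_target(cand)
        if cov - seen:          # new coverage: keep this input for mutation
            seen |= cov
            corpus.append(cand)
    return corpus, seen
```

The essential point is the `cov - seen` check: coverage feedback turns blind random search into a guided one, which is what distinguishes AFL++/libFuzzer-style fuzzing from purely generational tools like sqlsmith.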
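For the simplification problem raised in the last question, delta-debugging-style reduction is the standard technique. A minimal sketch, assuming a hypothetical `still_fails` oracle that replays a candidate statement list against the database and reports whether the original failure reproduces:

```python
def reduce_input(lines, still_fails):
    """Greedily drop chunks of statements while the failure still reproduces.

    `lines` is a list of SQL statements; `still_fails` is a caller-supplied
    predicate (hypothetical here) that replays them and checks the failure.
    """
    chunk = max(1, len(lines) // 2)
    while chunk >= 1:
        i = 0
        while i < len(lines):
            candidate = lines[:i] + lines[i + chunk:]
            if candidate and still_fails(candidate):
                lines = candidate      # this chunk was irrelevant; drop it
            else:
                i += chunk             # this chunk is needed; keep it
        chunk //= 2
    return lines
```

A reducer like this, combined with bucketing failures by stack trace or error message, is how large fuzzer outputs are typically boiled down to a handful of distinct, minimal reproducers.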
