Tracking Issue: Port sqlsmith to rust for sql fuzzy testing #2571

Closed
33 of 34 tasks
neverchanje opened this issue May 16, 2022 · 6 comments
@neverchanje
Contributor

neverchanje commented May 16, 2022

SQLsmith is a random SQL query generator. It has found many bugs across the database industry.

The original SQLsmith is licensed under GPLv2, which makes it unfriendly for us to use: https://github.com/anse1/sqlsmith
CockroachDB has implemented a Go version of SQLsmith:

The remaining work of Sqlsmith:

Test suite:

RisingWave bugs reported:

Tracking issue for workarounds:

Other Notes:

  • Please run the e2e tests locally when submitting PRs. They are not yet integrated into CI because they are still unstable.
  • Please prioritize stabilizing SqlSmith so we can include it in our e2e test suite. We would appreciate closing out bugs in SqlSmith and RisingWave first!

Thank you for your contributions!

@liurenjie1024
Contributor

cc @sumittal

@liurenjie1024 liurenjie1024 added the component/test Test related issue. label May 16, 2022
@neverchanje neverchanje changed the title from "Port sqlsmith to rust for sql fuzzy testing" to "Tracking Issue: Port sqlsmith to rust for sql fuzzy testing" Jun 21, 2022
@kwannoel
Contributor

kwannoel commented Jun 27, 2022

Wondering if something like Quickcheck's shrink would be useful here. Do generated queries get very complex and hard to debug?

Perhaps it could also be used for other fuzzers we eventually adopt.

If a generated test case fails, QuickCheck's shrink takes the input and shrinks it as much as possible to make it easier to debug.
In our case, we would take the generated query and shrink it, producing a simplified query.
We then re-run the simplified query to try to reproduce the crash.
If it still crashes, we can use the simplified query to debug.

Some ways I can think of shrinking, off the top of my head:

  • Reducing the number of terms selected, projected, etc.
  • Extracting subqueries.
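
To make the idea concrete, here is a minimal sketch of a greedy shrink loop in Rust. Everything in it is hypothetical and for illustration only: the `Query` and `Source` types, `shrink_candidates`, and the `still_fails` callback are not part of sqlsmith, RisingWave, or the quickcheck crate. It assumes `still_fails` re-runs a candidate and reports whether the original crash reproduces.

```rust
/// Toy query model (an assumption for this sketch): a list of select items
/// and a source that is either a table or a nested subquery.
#[derive(Clone, Debug, PartialEq)]
struct Query {
    select_items: Vec<String>,
    from: Source,
}

#[derive(Clone, Debug, PartialEq)]
enum Source {
    Table(String),
    Subquery(Box<Query>),
}

impl Query {
    /// Render the query as SQL text.
    fn to_sql(&self) -> String {
        let from = match &self.from {
            Source::Table(name) => name.clone(),
            Source::Subquery(q) => format!("({}) AS sq", q.to_sql()),
        };
        format!("SELECT {} FROM {}", self.select_items.join(", "), from)
    }

    /// Candidates that are strictly smaller than `self`.
    /// Some candidates may be invalid SQL (for example, an outer query that
    /// references a dropped column); those simply will not reproduce the crash.
    fn shrink_candidates(&self) -> Vec<Query> {
        let mut out = Vec::new();
        // Extract the subquery and try it on its own.
        if let Source::Subquery(inner) = &self.from {
            out.push((**inner).clone());
        }
        // Drop one select item at a time, always keeping at least one.
        if self.select_items.len() > 1 {
            for i in 0..self.select_items.len() {
                let mut q = self.clone();
                q.select_items.remove(i);
                out.push(q);
            }
        }
        // Shrink inside the subquery while keeping the outer query.
        if let Source::Subquery(inner) = &self.from {
            for candidate in inner.shrink_candidates() {
                let mut q = self.clone();
                q.from = Source::Subquery(Box::new(candidate));
                out.push(q);
            }
        }
        out
    }
}

/// Greedy shrink: keep moving to the first smaller candidate that still
/// fails, and stop when no candidate fails. Terminates because every
/// candidate has fewer select items or less nesting than its parent.
fn shrink(query: Query, still_fails: &dyn Fn(&Query) -> bool) -> Query {
    let mut current = query;
    loop {
        let next = current
            .shrink_candidates()
            .into_iter()
            .find(|candidate| still_fails(candidate));
        match next {
            Some(smaller) => current = smaller,
            None => return current,
        }
    }
}
```

In a real setup, `still_fails` would submit `candidate.to_sql()` to the database and check for the same failure. The greedy strategy finds a local minimum rather than the smallest possible query, which is usually enough for debugging.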

@liurenjie1024
Contributor

As part of fuzz testing, we should also generate invalid SQL. Can sqlsmith support that?

@neverchanje
Contributor Author

As part of fuzz testing, we should also generate invalid SQL. Can sqlsmith support that?

Currently, no. Invalid SQL is too random. Given unlimited time, you could surely do that.

@neverchanje
Contributor Author

Do generated queries get very complex and hard to debug?

No, actually. The CockroachDB version has parameters that control the overall complexity of the generated SQL. They limit the recursion depth (subqueries), the number of select items, and the number of tables in a join. So generally, we can tune the generated SQL to our needs.
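
As a rough illustration of that kind of control, here is a minimal Rust sketch of a generator bounded by three limits. The `Limits` struct, its field names, the `Lcg` helper, and `gen_query` are all hypothetical, chosen for this example; they are not the actual option names of the CockroachDB generator or of any RisingWave code. The output only shows how the limits shape a query and is not guaranteed to be semantically valid SQL.

```rust
/// Hypothetical complexity knobs (names are assumptions for this sketch).
#[derive(Clone, Copy, Debug)]
struct Limits {
    max_depth: usize,        // maximum subquery nesting
    max_select_items: usize, // maximum items in a select list (>= 1)
    max_join_tables: usize,  // maximum sources joined in FROM (>= 1)
}

/// Tiny deterministic PRNG (a linear congruential generator) so the sketch
/// needs no external crate. A real generator would likely use `rand`.
struct Lcg(u64);

impl Lcg {
    fn next_u64(&mut self) -> u64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        self.0 >> 33
    }

    /// Integer in 1..=max; `max` must be at least 1.
    fn one_to(&mut self, max: usize) -> usize {
        1 + (self.next_u64() as usize) % max
    }
}

/// Generate a query whose size is bounded by `limits`.
/// `depth` is the current nesting level; callers start at 0.
fn gen_query(rng: &mut Lcg, limits: Limits, depth: usize) -> String {
    // Bounded select list: c0, c1, ...
    let n_items = rng.one_to(limits.max_select_items);
    let mut items = Vec::new();
    for i in 0..n_items {
        items.push(format!("c{}", i));
    }

    // Bounded number of joined sources; each is a table or a subquery.
    let n_sources = rng.one_to(limits.max_join_tables);
    let mut sources = Vec::new();
    for i in 0..n_sources {
        // Recurse only while below the depth limit.
        if depth < limits.max_depth && rng.next_u64() % 2 == 0 {
            let sub = gen_query(rng, limits, depth + 1);
            sources.push(format!("({}) AS s{}_{}", sub, depth, i));
        } else {
            sources.push(format!("t{}", i));
        }
    }

    format!(
        "SELECT {} FROM {}",
        items.join(", "),
        sources.join(" CROSS JOIN ")
    )
}
```

With `max_depth` set to 0, the sketch never emits a subquery, so tightening the limits is a cheap way to get small queries before reaching for a shrinker.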

@kwannoel
Contributor

kwannoel commented Aug 16, 2022

Remaining refinements, workarounds, and bugs are tracked in #3896. Closing this issue.

4 participants