Improve speed of datafusion::fuzz fuzz_cases::pruning::test_fuzz_utf8 test #13946

Closed
alamb opened this issue Dec 30, 2024 · 3 comments · Fixed by #13947

Comments

@alamb
Contributor

alamb commented Dec 30, 2024

After #12978, the test_fuzz_utf8 test takes over a minute to run on my machine.

This ticket tracks improving the speed.

> Thanks @adriangb -- I think this PR is ready to go

One thing I noticed is that the fuzz test takes over a minute on my machine:

        SLOW [> 60.000s] datafusion::fuzz fuzz_cases::pruning::test_fuzz_utf8
        PASS [  65.772s] datafusion::fuzz fuzz_cases::pruning::test_fuzz_utf8
------------
     Summary [  72.749s] 47 tests run: 47 passed (1 slow), 0 skipped
andrewlamb@Mac:~/Software/datafusion$

Is there some way to make it faster? Maybe with multiple threads, or by cranking down the number of things to test?

Yeah this is what I was hinting at in #12978 (comment).

I'm happy to throw threads at it for a start. Restricting the search space might be necessary, but I think it requires a more careful eye to minimize how much valuable testing is discarded. The other thing I think we can do is speed up the tests themselves, in particular by minimizing unnecessary round trips to Parquet, but I'm not sure where the right places to hook in would be so that we still get a realistic test without re-parsing the same data over and over again.

Originally posted by @adriangb in #12978 (comment)
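
To make the "throw threads at it" idea concrete, here is a minimal sketch using only the Rust standard library. It is not the change made in #13947; `run_case` and `run_cases_parallel` are hypothetical names, and the body of `run_case` is a trivial stand-in for a real pruning check, which would instead compare query results with and without pruning.

```rust
use std::thread;

/// Hypothetical stand-in for a single fuzz case. A real case would run the
/// same predicate with and without pruning and check the results agree.
fn run_case(value: &str, threshold: &str) -> bool {
    let with_pruning = value >= threshold;
    let without_pruning = value.cmp(threshold) != std::cmp::Ordering::Less;
    with_pruning == without_pruning
}

/// Splits `cases` into roughly equal chunks, runs each chunk on its own
/// scoped thread, and returns how many cases passed.
fn run_cases_parallel(cases: &[(String, String)], num_threads: usize) -> usize {
    let n = num_threads.max(1);
    // Round up so every case lands in a chunk; `chunks` panics on size 0.
    let chunk_size = ((cases.len() + n - 1) / n).max(1);
    thread::scope(|s| {
        let handles: Vec<_> = cases
            .chunks(chunk_size)
            .map(|chunk| {
                s.spawn(move || {
                    chunk
                        .iter()
                        .filter(|case| run_case(&case.0, &case.1))
                        .count()
                })
            })
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .sum::<usize>()
    })
}
```

Scoped threads let the workers borrow the shared case list directly, so the input data is built once and never cloned per thread. Whether this helps in practice depends on how much of the per-case cost is independent work rather than shared setup.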

@alamb
Contributor Author

alamb commented Dec 30, 2024

I am going to give this a try.

@adriangb
Contributor

Amazing, thanks Andrew!

@alamb
Contributor Author

alamb commented Dec 30, 2024
