[SPARK-3831] [SQL] Filter rule Improvement and bool expression optimization. #2692
Added optimization rule related to bool expression.
|
QA tests have started for PR 2692 at commit
|
|
QA tests have finished for PR 2692 at commit
|
|
Test FAILed. |
|
@sarutak LGTM. Can you take a look at the failing test? Seems we need to update the test suite since with your change, we can handle this predicate when doing batch pruning for cached tables. Also, it will be good to add another case involving |
|
@yhuai Thanks for picking this PR up and for your comment! |
Added an unsupported NOT predicate test case.
|
QA tests have started for PR 2692 at commit
|
How about
checkBatchPruning(s"NOT (i in (${(1 to 30).mkString(",")}))", 31 to 100, 5, 10)
For this case, we will read 4 partitions including 7 batches when we can support it.
|
QA tests have finished for PR 2692 at commit
|
|
Test PASSed. |
|
@yhuai Thanks, it makes sense. |
|
QA tests have started for PR 2692 at commit
|
|
QA tests have finished for PR 2692 at commit
|
|
Test PASSed. |
|
LGTM cc @marmbrus. |
|
Test FAILed. |
|
retest this please. |
|
QA tests have started for PR 2692 at commit
|
|
QA tests have finished for PR 2692 at commit
|
|
Test PASSed. |
|
Thanks! I've merged to master. |
If we write a filter that is always FALSE, 200 tasks will run. I think 1 task is enough.
Also, the current optimizer cannot simplify a predicate in which NOT is duplicated.
Such a filter should be simplified.
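To illustrate the kind of rewrite the description asks for, here is a minimal sketch of boolean expression simplification over a toy expression tree. It is an assumption-laden illustration, not the code merged in this PR: the types `Expr`, `Attr`, `Lit`, `Not`, `And`, `Or` and the object `Simplify` are hypothetical stand-ins and are not Catalyst's classes. It shows how a duplicated NOT can collapse and how a predicate can fold down to a constant FALSE, which is what would let a planner avoid scheduling unnecessary tasks.

```scala
// Toy expression tree. These types are hypothetical and are NOT Catalyst's classes.
sealed trait Expr
case class Attr(name: String) extends Expr
case class Lit(value: Boolean) extends Expr
case class Not(child: Expr) extends Expr
case class And(left: Expr, right: Expr) extends Expr
case class Or(left: Expr, right: Expr) extends Expr

object Simplify {
  // Bottom-up traversal: simplify the children first, then the node itself.
  def apply(e: Expr): Expr = e match {
    case Not(c)    => simplifyNode(Not(apply(c)))
    case And(l, r) => simplifyNode(And(apply(l), apply(r)))
    case Or(l, r)  => simplifyNode(Or(apply(l), apply(r)))
    case other     => other
  }

  // Local rewrite rules applied to a node whose children are already simplified.
  private def simplifyNode(e: Expr): Expr = e match {
    case Not(Not(c))                             => c          // duplicated NOT cancels out
    case Not(Lit(b))                             => Lit(!b)    // constant folding
    case And(Lit(false), _) | And(_, Lit(false)) => Lit(false) // always-FALSE conjunction
    case And(Lit(true), r)                       => r
    case And(l, Lit(true))                       => l
    case Or(Lit(true), _) | Or(_, Lit(true))     => Lit(true)  // always-TRUE disjunction
    case Or(Lit(false), r)                       => r
    case Or(l, Lit(false))                       => l
    case other                                   => other
  }
}
```

With this sketch, `Simplify(Not(Not(Attr("a"))))` reduces to `Attr("a")`, and `Simplify(And(Attr("a"), Not(Lit(true))))` reduces to `Lit(false)`. A real optimizer rule would additionally need to handle SQL NULL semantics, which this toy tree ignores.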