Skip to content

Conversation

@sarutak
Copy link
Member

@sarutak sarutak commented Oct 7, 2014

If we write the filter which is always FALSE like

SELECT * from person WHERE FALSE;

200 tasks will run. I think, 1 task is enough.

And current optimizer cannot optimize the case NOT is duplicated like

SELECT * from person WHERE NOT ( NOT (age > 30));

The filter rule above should be simplified

Added optimization rule related to bool expression.
@sarutak sarutak changed the title [SPARK-3831] Filter rule Improvement and bool expression optimization. [SPARK-3831] [SQL] Filter rule Improvement and bool expression optimization. Oct 7, 2014
@SparkQA
Copy link

SparkQA commented Oct 7, 2014

QA tests have started for PR 2692 at commit 8ea872b.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Oct 7, 2014

QA tests have finished for PR 2692 at commit 8ea872b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21370/Test FAILed.

@yhuai
Copy link
Contributor

yhuai commented Oct 7, 2014

@sarutak LGTM. Can you take a look at the failing test?
The log is

[info] - NOT (i < 88) *** FAILED ***
[info]   2 did not equal 10 Wrong number of read batches (PartitionBatchPruningSuite.scala:91)

Seems we need to update the test suite since with your change, we can handle this predicate when doing batch pruning for cached tables. Also, it will be good to add another case involving NOT into "unsupported predicate" if possible.

@sarutak
Copy link
Member Author

sarutak commented Oct 7, 2014

@yhuai Thanks picking this PR up and for your comment!
I'll try soon.

Added an unsupported NOT predicate test case.
@SparkQA
Copy link

SparkQA commented Oct 7, 2014

QA tests have started for PR 2692 at commit a11b9f3.

  • This patch merges cleanly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about

checkBatchPruning(s"NOT (i in (${(1 to 30).mkString(",")}))", 31 to 100, 5, 10)

For this case, we will read 4 partitions including 7 batches when we can support it.

@SparkQA
Copy link

SparkQA commented Oct 7, 2014

QA tests have finished for PR 2692 at commit a11b9f3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21372/Test PASSed.

@sarutak
Copy link
Member Author

sarutak commented Oct 7, 2014

@yhuai Thanks, it makes sense.

@SparkQA
Copy link

SparkQA commented Oct 7, 2014

QA tests have started for PR 2692 at commit 23c750c.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Oct 7, 2014

QA tests have finished for PR 2692 at commit 23c750c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21381/Test PASSed.

@yhuai
Copy link
Contributor

yhuai commented Oct 7, 2014

LGTM

cc @marmbrus.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21470/Test FAILed.

@sarutak
Copy link
Member Author

sarutak commented Oct 8, 2014

retest this please.

@SparkQA
Copy link

SparkQA commented Oct 8, 2014

QA tests have started for PR 2692 at commit 25f3e20.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Oct 8, 2014

QA tests have finished for PR 2692 at commit 25f3e20.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21473/Test PASSed.

@marmbrus
Copy link
Contributor

marmbrus commented Oct 9, 2014

Thanks! I've merged to master.

@asfgit asfgit closed this in a85f24a Oct 9, 2014
@sarutak sarutak deleted the SPARK-3831 branch April 11, 2015 05:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants