[SPARK-18873][SQL][TEST] New test cases for scalar subquery (part 2 of 2) - scalar subquery in predicate context #16798

nsyca · 2017-02-04T01:16:46Z

What changes were proposed in this pull request?

This PR adds new test cases for scalar subqueries in predicate context.

How was this patch tested?

The test results are compared with the results of running the same queries on another SQL engine (in this case, IBM DB2). If the results are equivalent, we assume they are correct.
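The cross-engine check described above amounts to comparing two result sets as multisets, since SQL guarantees no row order without an ORDER BY. Below is a minimal sketch in Python, assuming the rows from each engine have already been fetched as tuples. The helper `same_result` and the sample rows are hypothetical and are not part of Spark's test harness.

```python
from collections import Counter


def same_result(rows_a, rows_b):
    """Return True when both result sets contain the same rows with the
    same multiplicities, ignoring row order."""
    return Counter(map(tuple, rows_a)) == Counter(map(tuple, rows_b))


# Hypothetical rows standing in for output fetched from the two engines.
spark_rows = [("val1b",), ("val1c",)]
db2_rows = [("val1c",), ("val1b",)]

print(same_result(spark_rows, db2_rows))
```

Comparing multisets rather than sets matters here: a query that returns a duplicate row in one engine but not in the other should count as a mismatch.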

…rrect results ## What changes were proposed in this pull request? This patch fixes the incorrect results in the rule ResolveSubquery in Catalyst's Analysis phase. ## How was this patch tested? ./dev/run-tests a new unit test on the problematic pattern.

nsyca · 2017-02-04T01:30:07Z

Below are a modified version of the test cases to run on DB2 and the result from DB2, as a second source to compare to the result from Spark.
Modified test file to run on DB2
Result from DB2

nsyca · 2017-02-04T01:31:44Z

sql/core/src/test/resources/sql-tests/inputs/scalar-subquery.sql

-	       FROM   (SELECT   c1.cv, avg(c1.cv) avg
-		       FROM     c c1
-		       WHERE    c1.ck = p.pk
-                       GROUP BY c1.cv));


Merged these test cases with the new test cases and placed them under the directory "scalar-subquery".

nsyca · 2017-02-04T01:33:12Z

@dilipbiswal Could you please cross-check the results from both sources?
@gatorsmile, @hvanhovell Could you please review?

dilipbiswal · 2017-02-04T01:43:33Z

.../test/resources/sql-tests/results/subquery/scalar-subquery/scalar-subquery-predicate.sql.out

+struct<t1a:string>
+-- !query 24 output
+val1b
+val1c


I have compared the result set, and it matches the result from DB2.

SparkQA · 2017-02-04T03:29:19Z

Test build #72337 has finished for PR 16798 at commit 092f2a5.

This patch fails SparkR unit tests.
This patch merges cleanly.
This patch adds no public classes.

gatorsmile · 2017-02-04T03:57:13Z

retest this please

SparkQA · 2017-02-04T06:18:52Z

Test build #72356 has finished for PR 16798 at commit 092f2a5.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

dongjoon-hyun · 2017-02-05T02:36:02Z

...e/src/test/resources/sql-tests/inputs/subquery/scalar-subquery/scalar-subquery-predicate.sql

+               FROM     t2
+               WHERE    t2c = t1c
+               GROUP BY t2c)
+AND    t1b >= (SELECT   min(t2b)


Nit. I like this indentation.
For the other examples, AND seems to be aligned with t1b at line 190.

dongjoon-hyun · 2017-02-05T02:37:43Z

...e/src/test/resources/sql-tests/inputs/subquery/scalar-subquery/scalar-subquery-predicate.sql

+               FROM     t2
+               WHERE    t2c = t1c
+               GROUP BY t2c)
+UNION ALL


Can we have another test case for UNION here too?

@dongjoon-hyun Thank you for your comment. I have added another test case using UNION DISTINCT.

I would appreciate it if you could share your insight on how you think the UNION test case would be processed differently from the UNION ALL test case with respect to testing scalar subqueries.

Note that the correctness of the UNION DISTINCT result can be inferred by applying a "uniqueness" operator to the result of the existing UNION ALL test case.
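The relationship noted above, that the UNION DISTINCT result equals the de-duplicated UNION ALL result, can be illustrated with a small sketch. It uses Python's built-in `sqlite3` module as a stand-in engine (an assumption; it is not Spark or DB2). The column names follow the test file, but the table contents and the simplified queries are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Made-up data; only the column naming follows the PR's test tables.
conn.executescript("""
    CREATE TABLE t1 (t1a TEXT, t1b INTEGER, t1c INTEGER);
    CREATE TABLE t2 (t2a TEXT, t2b INTEGER, t2c INTEGER);
    INSERT INTO t1 VALUES ('val1a', 6, 8), ('val1b', 8, 16),
                          ('val1b', 8, 16), ('val1c', 10, 12);
    INSERT INTO t2 VALUES ('val2a', 6, 8), ('val2b', 8, 16),
                          ('val2c', 12, 12);
""")

# Two branches, each with a correlated scalar subquery in the predicate.
branch_min = """
    SELECT t1a FROM t1
    WHERE  t1b >= (SELECT min(t2b) FROM t2 WHERE t2c = t1c)
"""
branch_max = """
    SELECT t1a FROM t1
    WHERE  t1b <= (SELECT max(t2b) FROM t2 WHERE t2c = t1c)
"""

all_rows = conn.execute(branch_min + " UNION ALL " + branch_max).fetchall()
distinct_rows = conn.execute(branch_min + " UNION " + branch_max).fetchall()

# De-duplicating the UNION ALL rows should give exactly the UNION rows.
print(sorted(set(all_rows)) == sorted(distinct_rows))
```

The scalar subqueries are evaluated identically in both statements; only the final set operation differs, which is why the UNION result adds little beyond the UNION ALL case for subquery testing.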

SparkQA · 2017-02-05T23:13:51Z

Test build #72416 has finished for PR 16798 at commit 044d6a4.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

hvanhovell · 2017-02-15T16:31:12Z

LGTM - merging to master

…f 2) - scalar subquery in predicate context ## What changes were proposed in this pull request? This PR adds new test cases for scalar subqueries in predicate context. ## How was this patch tested? The test results are compared with the results of running the same queries on another SQL engine (in this case, IBM DB2). If the results are equivalent, we assume they are correct. Author: Nattavut Sutyanyong <nsy.can@gmail.com> Closes apache#16798 from nsyca/18873-2.

…ll up to Optimizer phase ## What changes were proposed in this pull request? Currently the Analyzer, as part of ResolveSubquery, pulls up the correlated predicates to their originating SubqueryExpression. The subquery plan is then transformed to remove the correlated predicates after they are moved up to the outer plan. In this PR, the task of pulling up correlated predicates is deferred to the Optimizer. This is the initial work that will allow us to support the forms of correlated subqueries that we don't support today. The design document from nsyca can be found at the following link: [DesignDoc](https://docs.google.com/document/d/1QDZ8JwU63RwGFS6KVF54Rjj9ZJyK33d49ZWbjFBaIgU/edit#) A brief description of the code changes (hopefully to aid with code review) can be found at the following link: [CodeChanges](https://docs.google.com/document/d/18mqjhL9V1An-tNta7aVE13HkALRZ5GZ24AATA-Vqqf0/edit#) ## How was this patch tested? The test case PRs were submitted earlier using. [16337](#16337) [16759](#16759) [16841](#16841) [16915](#16915) [16798](#16798) [16712](#16712) [16710](#16710) [16760](#16760) [16802](#16802) Author: Dilip Biswal <dbiswal@us.ibm.com> Closes #16954 from dilipbiswal/SPARK-18874.

nsyca added 17 commits July 29, 2016 17:43

New positive test cases

edca333

Fix unit test case failure

64184fd

blocking TABLESAMPLE

29f82b0

Fixing code styling

ac43ab4

Correcting Scala test style

631d396

One (last) attempt to correct the Scala style tests

7eb9b2d

Merge remote-tracking branch 'upstream/master'

1387cf5

Merge remote-tracking branch 'upstream/master'

3faa2d5

Merge remote-tracking branch 'upstream/master'

a308634

Merge remote-tracking branch 'upstream/master'

f1524b9

Merge remote-tracking branch 'upstream/master'

5c36dce

Merge remote-tracking branch 'upstream/master'

862b2b8

Merge remote-tracking branch 'upstream/master'

211e325

Merge remote-tracking branch 'upstream/master'

05119ef

new test file

092f2a5

gatorsmile mentioned this pull request Feb 5, 2017

[SPARK-15694] Implement ScriptTransformation in sql/core (part 1) #14702

Closed

Add UNION DISTINCT

044d6a4

asfgit closed this in 5ad10c5 Feb 15, 2017

dilipbiswal mentioned this pull request Feb 16, 2017

[SPARK-18874][SQL] First phase: Deferring the correlated predicate pull up to Optimizer phase #16954

Closed

nsyca deleted the 18873-2 branch March 14, 2017 21:08
