Skip to content

Conversation

@wangyum
Copy link
Member

@wangyum wangyum commented Sep 22, 2019

What changes were proposed in this pull request?

This PR reduce shuffle partitions from 200 to 4 in SQLQueryTestSuite to reduce testing time.

Why are the changes needed?

Reduce testing time.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Manually tested in my local:
Before:

...
[info] - subquery/in-subquery/in-joins.sql (6 minutes, 19 seconds)
[info] - subquery/in-subquery/not-in-joins.sql (2 minutes, 17 seconds)
[info] - subquery/scalar-subquery/scalar-subquery-predicate.sql (45 seconds, 763 milliseconds)
...
Run completed in 1 hour, 22 minutes.

After:

...
[info] - subquery/in-subquery/in-joins.sql (1 minute, 12 seconds)
[info] - subquery/in-subquery/not-in-joins.sql (27 seconds, 541 milliseconds)
[info] - subquery/scalar-subquery/scalar-subquery-predicate.sql (17 seconds, 360 milliseconds)
...
Run completed in 47 minutes.

@wangyum
Copy link
Member Author

wangyum commented Sep 22, 2019

cc @HyukjinKwon @dongjoon-hyun

@HyukjinKwon
Copy link
Member

cc @gatorsmile and @cloud-fan too

@SparkQA
Copy link

SparkQA commented Sep 22, 2019

Test build #111153 has finished for PR 25891 at commit 3dc0124.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member

retest this please

@HyukjinKwon
Copy link
Member

@wangyum if tests fail and are tricky to fix, let's just only fix SQLQueryTestSuite for now since that takes longest time.

@SparkQA
Copy link

SparkQA commented Sep 22, 2019

Test build #111158 has finished for PR 25891 at commit 3dc0124.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 22, 2019

Test build #111159 has finished for PR 25891 at commit 6ec9761.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

-- !query 4 output
1 10 val3b 8 NULL
1 10 val1b 8 16
1 10 val3b 8 NULL
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ur, do we really need this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

[info] - subquery/in-subquery/in-joins.sql *** FAILED *** (15 seconds, 359 milliseconds)
[info]   subquery/in-subquery/in-joins.sql
[info]   Expected "1    10      val[3b  8       NULL
[info]   1      10      val1b   8       16]
[info]   1      10      val3a   6       12
[info]   1      8...", but got "1       10      val[1b  8       16
[info]   1      10      val3b   8       NULL]
[info]   1      10      val3a   6       12
[info]   1      8..." Result did not match for query #4
[info]   SELECT    Count(DISTINCT(t1a)),
[info]             t1b,
[info]             t3a,
[info]             t3b,
[info]             t3c
[info]   FROM      t1 natural left JOIN t3
[info]   WHERE     t1a IN
[info]             (
[info]                    SELECT t2a
[info]                    FROM   t2
[info]                    WHERE t1d = t2d)
[info]   AND       t1b > t3b
[info]   GROUP BY  t1a,
[info]             t1b,
[info]             t3a,
[info]             t3b,
[info]             t3c
[info]   ORDER BY  t1a DESC, t3b DESC (SQLQueryTestSuite.scala:383)
[info]   org.scalatest.exceptions.TestFailedException:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. It seems that we are hitting the corner case because the query has a sort on a subset of columns.

    def isSorted(plan: LogicalPlan): Boolean = plan match {
      case _: Join | _: Aggregate | _: Generate | _: Sample | _: Distinct => false
      case _: DescribeCommandBase
          | _: DescribeColumnCommand
          | _: DescribeTableStatement
          | _: DescribeColumnStatement => true
      case PhysicalOperation(_, _, Sort(_, true, _)) => true
      case _ => plan.children.iterator.exists(isSorted)
    }


override def sparkConf: SparkConf = super.sparkConf
// Reduce shuffle partitions to reduce testing time.
.set(SQLConf.SHUFFLE_PARTITIONS, 5)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For Python UDF test, this seems to increase from 4 to 5. Did I understand correctly?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually why don't we try 4 @wangyum?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for 4.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I'll try to set it to 4. This is because it is set to 5 in two places:

/**
* A map used to store all confs that need to be overridden in sql/core unit tests.
*/
val overrideConfs: Map[String, String] =
Map(
// Fewer shuffle partitions to speed up testing.
SQLConf.SHUFFLE_PARTITIONS.key -> "5")

/**
* A map used to store all confs that need to be overridden in sql/hive unit tests.
*/
val overrideConfs: Map[String, String] =
Map(
// Fewer shuffle partitions to speed up testing.
SQLConf.SHUFFLE_PARTITIONS.key -> "5"
)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

=4 has another issue: --SET spark.sql.autoBroadcastJoinThreshold=10485760 will success, but --SET spark.sql.autoBroadcastJoinThreshold=-1,spark.sql.join.preferSortMergeJoin=true will failed:

22:31:31.233 ERROR org.apache.spark.sql.SQLQueryTestSuite: Error using configs: spark.sql.autoBroadcastJoinThreshold=-1,spark.sql.join.preferSortMergeJoin=true,spark.sql.codegen.wholeStage=true,spark.sql.codegen.factoryMode=CODEGEN_ONLY
[info] - subquery/in-subquery/not-in-joins.sql *** FAILED *** (32 seconds, 609 milliseconds)
[info]   subquery/in-subquery/not-in-joins.sql
[info]   Expected "1    16      12      [21
[info]   1      16      12      10]
[info]   1      10      NULL    12
[info]   1      6       8       ...", but got "1        16      12      [10
[info]   1      16      12      21]
[info]   1      10      NULL    12
[info]   1      6       8       ..." Result did not match for query #6
[info]   SELECT Count(DISTINCT( t1a )),
[info]          t1b,
[info]          t1c,
[info]          t1d
[info]   FROM   t1
[info]   WHERE  t1a NOT IN (SELECT t2a
[info]                      FROM   t2
[info]                      JOIN t1
[info]                      WHERE  t2b <> t1b)
[info]   GROUP  BY t1b,
[info]             t1c,
[info]             t1d
[info]   HAVING t1d NOT IN (SELECT t2d
[info]                      FROM   t2
[info]                      WHERE  t1d = t2d)
[info]   ORDER BY t1b DESC (SQLQueryTestSuite.scala:383)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add an ORDER BY to make the query output deterministic?

@dongjoon-hyun
Copy link
Member

Thank you for taking care of this. In general, I agree with the idea. However, I left two comments about the result changes and the side-effect on Python UDF tests. It seems that we need to revise this PR more to achieve the original idea correctly.

@dongjoon-hyun
Copy link
Member

I updated the PR description too (from 5 to 4).

@SparkQA
Copy link

SparkQA commented Sep 23, 2019

Test build #111172 has finished for PR 25891 at commit 55004b9.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor

LGTM if tests pass

@SparkQA
Copy link

SparkQA commented Sep 23, 2019

Test build #111185 has finished for PR 25891 at commit b4f2d19.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member

retest this please

Copy link
Member

@HyukjinKwon HyukjinKwon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM too if tests pass

@wangyum
Copy link
Member Author

wangyum commented Sep 23, 2019

It should be pass after this commit:

[info] Run completed in 34 minutes, 35 seconds.
[info] Total number of tests run: 211
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 211, failed 0, canceled 0, ignored 1, pending 0
[info] All tests passed.

@SparkQA
Copy link

SparkQA commented Sep 23, 2019

Test build #111202 has finished for PR 25891 at commit b4f2d19.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member

retest this please

@SparkQA
Copy link

SparkQA commented Sep 23, 2019

Test build #111207 has finished for PR 25891 at commit ad6bee7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 23, 2019

Test build #111214 has finished for PR 25891 at commit df51b69.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 23, 2019

Test build #111218 has finished for PR 25891 at commit df51b69.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum wangyum closed this in 0c40b94 Sep 23, 2019
@wangyum
Copy link
Member Author

wangyum commented Sep 23, 2019

Thank you all!

@wangyum
Copy link
Member Author

wangyum commented Sep 23, 2019

Merged to master.

@dongjoon-hyun
Copy link
Member

+1, Late LGTM. ORDER BY looks a correct way for the future. Thank you all!

@dongjoon-hyun
Copy link
Member

BTW, @wangyum . Can we have this in branch-2.4?

@maropu
Copy link
Member

maropu commented Sep 26, 2019

Reducing the time looks super nice...

wangyum added a commit that referenced this pull request Sep 26, 2019
…estSuite

### What changes were proposed in this pull request?
This PR backport #25891 to `branch-2.4`.

### Why are the changes needed?
Reduce testing time.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?

Manually tested in my local:
Before:
```
...
[info] - subquery/in-subquery/in-joins.sql (6 minutes, 19 seconds)
[info] - subquery/in-subquery/not-in-joins.sql (2 minutes, 17 seconds)
[info] - subquery/scalar-subquery/scalar-subquery-predicate.sql (45 seconds, 763 milliseconds)
...
Run completed in 1 hour, 22 minutes.
```
After:
```
...
[info] - subquery/in-subquery/in-joins.sql (1 minute, 12 seconds)
[info] - subquery/in-subquery/not-in-joins.sql (27 seconds, 541 milliseconds)
[info] - subquery/scalar-subquery/scalar-subquery-predicate.sql (17 seconds, 360 milliseconds)
...
Run completed in 47 minutes.

Closes #25938 from wangyum/SPARK-29203-branch-2.4.

Authored-by: Yuming Wang <yumwang@ebay.com>
Signed-off-by: Yuming Wang <wgyumg@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants