Conversation

@gengliangwang (Member)

What changes were proposed in this pull request?

Fix a test failure in DataFrameSuite introduced by #29404

Why are the changes needed?

Fix the test failure.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Unit test

@gengliangwang (Member, Author)

cc @maropu @cloud-fan

val e = intercept[SparkException] {
  structDf.collect
}
assert(e.getCause.getClass.equals(classOf[ArithmeticException]))
assert(e.getCause.getMessage.contains("cannot be represented as Decimal"))
Contributor


Can we add a comment to say that we have to fail on overflow even in non-ANSI mode, and link the related JIRA tickets?

@gengliangwang gengliangwang changed the title [SPARK-32018][SQL][3.0][FOLLOWUP] Fix a test failure in DataFrameSuite [Do not merge][SPARK-32018][SQL][3.0][FOLLOWUP] Fix a test failure in DataFrameSuite Aug 17, 2020
@gengliangwang (Member, Author)

I am now thinking about reverting the previous commits.

@SparkQA

SparkQA commented Aug 17, 2020

Test build #127497 has finished for PR 29448 at commit 69278b2.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan (Contributor)

cloud-fan commented Aug 17, 2020

Actually, this is trickier than I thought.

In 3.0/2.4, without the unsafe row bug fix:

  1. For hash aggregate with GROUP BY (which needs a binary hash map), the query fails as soon as the overflow happens, due to the unsafe row bug.
  2. For hash aggregate without GROUP BY, or sort aggregate, the sum value is stored in a Decimal object, which can hold the overflowed value (see the sketch after this list).
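
A minimal sketch of the two query shapes, assuming a local SparkSession named `spark`; the column names and values here are hypothetical, not taken from the failing test:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{lit, sum}

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

// decimal(38,0) is Spark's maximum precision, so sum() keeps the same
// result type and two near-max values are enough to overflow it.
val df = Seq.fill(2)("9" * 38).toDF("s")
  .select($"s".cast("decimal(38,0)").as("d"))

// (1) hash aggregate with GROUP BY: needs a binary hash map for the key.
df.withColumn("k", lit(1)).groupBy("k").agg(sum($"d")).collect()

// (2) hash aggregate without GROUP BY: the running sum is buffered in a
// Decimal object, which can hold a value beyond decimal(38,0)'s range.
df.agg(sum($"d")).collect()
```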

(2) is very tricky:

  1. If the overflow happens in the final aggregate, the final CheckOverflow operator gives us the correct result.
  2. If the overflow happens in the partial aggregate, it produces null; the final aggregate then treats that null as 0, i.e. as empty input, and returns a wrong result.

The failed test would not work even if we reverted the commit: changing the input DataFrame's partition number to 1 triggers the overflow in the partial aggregate, as illustrated below.
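
A hedged illustration of that partition trick, reusing the hypothetical `df` from the sketch above:

```scala
// All rows in one partition: the partial sum itself overflows before the
// shuffle, the partial result becomes null under non-ANSI mode, and the
// final aggregate mistakes that null for empty input.
df.repartition(1).agg(sum($"d")).collect()

// Rows spread across partitions: each partial sum stays in range and the
// overflow only happens in the final aggregate, where CheckOverflow can
// produce the correct result.
df.repartition(2).agg(sum($"d")).collect()
```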

To summarize for 3.0/2.4:

  1. For hash aggregate with GROUP BY, we always fail on overflow, even under non-ANSI mode. This is not ideal, but also not a serious bug.
  2. For hash aggregate without GROUP BY, or sort aggregate, Spark returns a wrong result if the overflow happens in the partial aggregate, but behaves correctly if it happens in the final aggregate.

That said, #29404 introduced a breaking change to (2), as it makes the query always fail on overflow. Let's revert it.

For the unsafe row bug fix #29125: it's important, as the unsafe row binary is used to check equality in many places (join keys, grouping keys, window partition keys, etc.), but it also makes (1) worse, as Spark may return a wrong result instead of failing. We can simply revert it as well, or re-apply #29404 only to hash aggregate with GROUP BY to avoid breaking changes (which can be complex if we consider distinct aggregates).

@cloud-fan (Contributor)

@maropu @viirya @dongjoon-hyun
