[SPARK-36926][SQL] Decimal average mistakenly overflow #34180

Closed
wants to merge 2 commits into from

cloud-fan
Contributor

What changes were proposed in this pull request?

This bug was introduced by #33177

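When checking the sum value for overflow in the average function, we should use `sumDataType` instead of the input decimal type. The sum buffer is wider than the input type, so checking against the input type reports an overflow for sums that are still valid.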
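
To illustrate the description above, here is a minimal sketch in plain Scala, without Spark. `DecType`, `sumTypeOf`, and `checkOverflow` are hypothetical stand-ins, not Spark APIs, and the rule of widening the sum buffer by 10 digits of precision (capped at 38) is an assumption made for the example. It shows a sum that looks like an overflow when checked against the input type but fits the wider sum type.

```scala
import java.math.{BigDecimal => JBigDecimal, RoundingMode}

// Hypothetical stand-in for a decimal type: total digits (precision)
// and digits after the decimal point (scale).
final case class DecType(precision: Int, scale: Int)

object AvgOverflowSketch {
  // Assumed upper bound on decimal precision.
  val MaxPrecision = 38

  // Assumed widening rule for the sum buffer: 10 extra digits of precision.
  def sumTypeOf(input: DecType): DecType =
    DecType(math.min(input.precision + 10, MaxPrecision), input.scale)

  // Returns None when the value needs more digits than the type allows.
  def checkOverflow(value: JBigDecimal, t: DecType): Option[JBigDecimal] = {
    val scaled = value.setScale(t.scale, RoundingMode.HALF_UP)
    if (scaled.precision() <= t.precision) Some(scaled) else None
  }

  def main(args: Array[String]): Unit = {
    val input = DecType(3, 2) // holds values up to 9.99
    val values = Seq("9.99", "9.99").map(new JBigDecimal(_))
    val sum = values.foldLeft(JBigDecimal.ZERO)(_.add(_)) // 19.98, needs 4 digits

    // Checking against the input type: the valid sum is treated as an overflow.
    println(checkOverflow(sum, input)) // None

    // Checking against the wider sum type: the sum fits.
    println(checkOverflow(sum, sumTypeOf(input))) // Some(19.98)
  }
}
```

In this sketch the average itself (9.99) fits the input type, yet the first check discards the intermediate sum, which is the kind of spurious overflow the PR title refers to.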

Why are the changes needed?

To fix a regression introduced by #33177.

Does this PR introduce any user-facing change?

Yes, the result was wrong before this PR.

How was this patch tested?

Added a new test.

@github-actions github-actions bot added the SQL label Oct 5, 2021
@cloud-fan
Contributor Author

@@ -97,7 +97,7 @@ case class Average(
case d: DecimalType =>
Contributor

Since `d` is no longer used, we can replace it with `_` here.

Member

@viirya viirya left a comment

Good catch!

Member

@gengliangwang gengliangwang left a comment

LGTM

Member

@HyukjinKwon HyukjinKwon left a comment

LGTM

@SparkQA

SparkQA commented Oct 5, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48344/

@SparkQA

SparkQA commented Oct 5, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48344/

@gengliangwang
Member

gengliangwang commented Oct 5, 2021

Merging to master

@gengliangwang
Member

@cloud-fan This PR can't be merged to branch-3.2 directly. Could you open a backport PR?

@SparkQA

SparkQA commented Oct 5, 2021

Test build #143831 has finished for PR 34180 at commit 413f668.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mridulm
Contributor

mridulm commented Oct 5, 2021

Thanks for fixing this quickly @cloud-fan !

cloud-fan added a commit to cloud-fan/spark that referenced this pull request Oct 6, 2021
This bug was introduced by apache#33177

When checking overflow of the sum value in the average function, we should use the `sumDataType` instead of the input decimal type.

fix a regression

Yes, the result was wrong before this PR.

a new test

Closes apache#34180 from cloud-fan/bug.

Lead-authored-by: Wenchen Fan <wenchen@databricks.com>
Co-authored-by: Wenchen Fan <cloud0fan@gmail.com>
Signed-off-by: Gengliang Wang <gengliang@apache.org>
gengliangwang pushed a commit that referenced this pull request Oct 6, 2021
backport #34180

### What changes were proposed in this pull request?

This bug was introduced by #33177

When checking overflow of the sum value in the average function, we should use the `sumDataType` instead of the input decimal type.

### Why are the changes needed?

fix a regression

### Does this PR introduce _any_ user-facing change?

Yes, the result was wrong before this PR.

### How was this patch tested?

a new test

Closes #34193 from cloud-fan/bug.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Gengliang Wang <gengliang@apache.org>
sunchao pushed a commit to sunchao/spark that referenced this pull request Dec 8, 2021
7 participants