Skip to content

Conversation

@gengliangwang
Copy link
Member

What changes were proposed in this pull request?

Similar with #37313, currently, when arithmetic overflow errors happen under ANSI mode, the error messages are like

[ARITHMETIC_OVERFLOW] long overflow. Use 'try_multiply' to tolerate overflow and return NULL instead. If necessary set spark.sql.ansi.enabled to "false" 

The "(except for ANSI interval type)" part is confusing. We should remove it for the numeric arithmetic operations and have a new error class for the interval division error: INTERVAL_ARITHMETIC_OVERFLOW

Why are the changes needed?

For better error messages

Does this PR introduce any user-facing change?

Yes, Use different error classes for arithmetic overflows of numeric/interval.. After changes, the error messages are simpler and more clear.

How was this patch tested?

UT

f
} catch {
case e: ArithmeticException =>
throw QueryExecutionErrors.intervalArithmeticOverflowError(e.getMessage, hint)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we create sub-error classes per every op instead of using of e.getMessage from Java exception? cc @srielau @cloud-fan

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without knowing the exact existing error output I see three options:

  • Print out the expression causing the overflow (or the expression and the values) as a parameter: "X + Y" with arguments: [100, 200]

  • Naming the operation (and values): operation "+" with values [100, 200]

  • Sublcasses:
    ARITHMETIC_OVERFLOW.ADDITION: .... values [100, 200].

    The values would be helpful, but not always be readily available without lots of rework... Also someone is bound tyo complain about security....

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this one, I think we can add query context for the error so that users can understand the cause.

@gengliangwang
Copy link
Member Author

@MaxGekk @srielau I am moving forward to merge this error message improvement. We can discuss about the further improvement later.
@cloud-fan thanks for the review

@gengliangwang
Copy link
Member Author

The master branch failed to compile after merging this PR: https://github.com/apache/spark/runs/7616787860?check_suite_focus=true
I am reverting it to unblock the development of other contributors.

@dongjoon-hyun
Copy link
Member

Thank you for swift recovering master branch, @gengliangwang .

gengliangwang added a commit to gengliangwang/spark that referenced this pull request Aug 2, 2022
…rithmetic overflow

### What changes were proposed in this pull request?

Similar with apache#37313, currently, when  arithmetic overflow errors happen under ANSI mode, the error messages are like
```
[ARITHMETIC_OVERFLOW] long overflow. Use 'try_multiply' to tolerate overflow and return NULL instead. If necessary set spark.sql.ansi.enabled to "false"
```

The "(except for ANSI interval type)" part is confusing. We should remove it for the numeric arithmetic operations and have a new error class for the interval division error: `INTERVAL_ARITHMETIC_OVERFLOW`

### Why are the changes needed?

For better error messages

### Does this PR introduce _any_ user-facing change?

Yes, Use different error classes for arithmetic overflows of numeric/interval.. After changes, the error messages are simpler and more clear.

### How was this patch tested?

UT

Closes apache#37337 from gengliangwang/improveOverflowMsg.

Authored-by: Gengliang Wang <gengliang@apache.org>
Signed-off-by: Gengliang Wang <gengliang@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants