Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Implement Spark-compatible CAST from floating-point/double to decimal #384

Merged
merged 11 commits into from
May 9, 2024

Conversation

vaibhawvipul
Copy link
Contributor

@vaibhawvipul vaibhawvipul commented May 6, 2024

Which issue does this PR close?

Closes #371

Rationale for this change

Improve compatibility with spark

What changes are included in this PR?

Add custom implementation of CAST from double to timestamp to handle eval_modes if error in casting.

How are these changes tested?

CometCastSuite test case passes.

@andygrove
Copy link
Member

This is looking good @vaibhawvipul.

@vaibhawvipul vaibhawvipul marked this pull request as ready for review May 7, 2024 06:55
@andygrove
Copy link
Member

andygrove commented May 7, 2024

@vaibhawvipul Do you plan on supporting FloatType -> DecimalType as well as DoubleType -> DecimalType in this PR?

@vaibhawvipul
Copy link
Contributor Author

@vaibhawvipul Do you plan on supporting FloatType -> DecimalType as well as DoubleType -> DecimalType in this PR?

@andygrove - This PR now supports both FloatType/DoubleType -> DecimalType.

@vaibhawvipul vaibhawvipul requested a review from andygrove May 7, 2024 18:39
@codecov-commenter
Copy link

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 34.00%. Comparing base (9ab6c75) to head (a8dcdea).
Report is 60 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main     #384      +/-   ##
============================================
+ Coverage     33.47%   34.00%   +0.53%     
- Complexity      795      857      +62     
============================================
  Files           110      116       +6     
  Lines         37533    38552    +1019     
  Branches       8215     8513     +298     
============================================
+ Hits          12563    13110     +547     
- Misses        22322    22690     +368     
- Partials       2648     2752     +104     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@vaibhawvipul vaibhawvipul changed the title feat: Implement Spark-compatible CAST from floating-point to decimal feat: Implement Spark-compatible CAST from floating-point/double to decimal May 8, 2024
Comment on lines +963 to +973
if (sparkMessage.contains("cannot be represented as")) {
assert(cometMessage.contains("cannot be represented as"))
} else {
assert(cometMessageModified == sparkMessage)
}
} else {
// for Spark 3.2 we just make sure we are seeing a similar type of error
if (sparkMessage.contains("causes overflow")) {
assert(cometMessage.contains("due to an overflow"))
} else if (sparkMessage.contains("cannot be represented as")) {
assert(cometMessage.contains("cannot be represented as"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see that the approach we have for handling error message comparison for Spark 3.2 and 3.3 needs some rethinking. I am going to make a proposal to improve this.

#402

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, that would be great.

Copy link
Member

@andygrove andygrove left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you @vaibhawvipul

@vaibhawvipul vaibhawvipul requested a review from viirya May 8, 2024 18:42
Copy link
Member

@viirya viirya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @vaibhawvipul

Comment on lines +448 to +459
if Decimal128Type::validate_decimal_precision(v, precision).is_err() {
if eval_mode == EvalMode::Ansi {
return Err(CometError::NumericValueOutOfRange {
value: input_value.to_string(),
precision,
scale,
});
} else {
cast_array.append_null();
}
}
cast_array.append_value(v);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the fix. As it is not detected by the test, maybe we can add a test for follow up.

@andygrove andygrove merged commit 56f57f4 into apache:main May 9, 2024
40 checks passed
@vaibhawvipul vaibhawvipul deleted the issue-371 branch May 11, 2024 11:25
himadripal pushed a commit to himadripal/datafusion-comet that referenced this pull request Sep 7, 2024
…ecimal (apache#384)

* support NumericValueOutOfRange error

* adding ansi checks and code refactor

* fmt fixes

* Remove redundant comment

* bug fix

* adding cast for float32 as well

* fix test case for spark 3.2 and 3.3

* return error only in ansi mode
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement Spark-compatible CAST from floating-point to decimal
5 participants