feat: Implement Spark-compatible CAST from floating-point/double to decimal #384
Conversation
This is looking good @vaibhawvipul.
@vaibhawvipul Do you plan on supporting FloatType as well?
@andygrove - This PR now supports both FloatType -> DecimalType and DoubleType -> DecimalType.
Codecov Report: All modified and coverable lines are covered by tests ✅

@@             Coverage Diff              @@
##              main      #384      +/-   ##
============================================
+ Coverage     33.47%    34.00%    +0.53%
- Complexity      795       857       +62
============================================
  Files           110       116        +6
  Lines         37533     38552     +1019
  Branches       8215      8513      +298
============================================
+ Hits          12563     13110      +547
- Misses        22322     22690      +368
- Partials       2648      2752      +104

☔ View full report in Codecov by Sentry.
  if (sparkMessage.contains("cannot be represented as")) {
    assert(cometMessage.contains("cannot be represented as"))
  } else {
    assert(cometMessageModified == sparkMessage)
  }
} else {
  // For Spark 3.2 we just make sure we are seeing a similar type of error
  if (sparkMessage.contains("causes overflow")) {
    assert(cometMessage.contains("due to an overflow"))
  } else if (sparkMessage.contains("cannot be represented as")) {
    assert(cometMessage.contains("cannot be represented as"))
I can see that the approach we have for handling error message comparison for Spark 3.2 and 3.3 needs some rethinking. I am going to make a proposal to improve this.
Okay, that would be great.
LGTM. Thank you @vaibhawvipul
Thank you @vaibhawvipul
if Decimal128Type::validate_decimal_precision(v, precision).is_err() {
    if eval_mode == EvalMode::Ansi {
        // ANSI mode: out-of-range values surface an error.
        return Err(CometError::NumericValueOutOfRange {
            value: input_value.to_string(),
            precision,
            scale,
        });
    } else {
        // Legacy mode: out-of-range values become null.
        cast_array.append_null();
    }
} else {
    // Only append the value when it passed precision validation;
    // a null (or an error) has already been produced above otherwise.
    cast_array.append_value(v);
}
Thank you for the fix. Since this case is not detected by the existing tests, maybe we can add a test as a follow-up.
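A follow-up test along those lines might look like the sketch below. The helper name cast_float64_to_decimal128, its signature, and the EvalMode enum are assumptions for illustration (a matching implementation sketch appears with the PR description further down), not Comet's actual API:

// Hedged sketch of a possible follow-up test; the helper and EvalMode are
// hypothetical stand-ins, not Comet's real API.
use arrow::array::{Array, Float64Array};

#[test]
fn cast_overflow_respects_eval_mode() {
    // 1e10 has 11 integral digits, so it cannot fit in DECIMAL(8, 2).
    let input = Float64Array::from(vec![Some(1e10), Some(1.23)]);

    // Legacy mode: the out-of-range value becomes null, the valid one survives.
    let result = cast_float64_to_decimal128(&input, 8, 2, EvalMode::Legacy).unwrap();
    assert!(result.is_null(0));
    assert!(!result.is_null(1));

    // ANSI mode: the same out-of-range value surfaces an error instead.
    assert!(cast_float64_to_decimal128(&input, 8, 2, EvalMode::Ansi).is_err());
}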
feat: Implement Spark-compatible CAST from floating-point/double to decimal (apache#384)

* support NumericValueOutOfRange error
* adding ansi checks and code refactor
* fmt fixes
* Remove redundant comment
* bug fix
* adding cast for float32 as well
* fix test case for spark 3.2 and 3.3
* return error only in ansi mode
Which issue does this PR close?
Closes #371
Rationale for this change
Improve compatibility with Spark.
What changes are included in this PR?
Add a custom implementation of CAST from floating-point/double to decimal that handles the eval modes (ANSI vs. legacy) when a value cannot be represented in the target decimal type. A rough sketch of the approach is shown below.
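As an illustration of that approach, the sketch below scales each float by 10^scale, rounds, and validates the result against the target precision, returning an error in ANSI mode and null otherwise. The function name, the CastError enum (a stand-in for CometError), and the overall structure are assumptions for illustration, not the exact Comet implementation:

use arrow::array::{Array, Decimal128Array, Decimal128Builder, Float64Array};
use arrow::datatypes::{Decimal128Type, DecimalType};

// Mirrors Spark's two evaluation modes for casts.
#[derive(PartialEq)]
enum EvalMode {
    Ansi,
    Legacy,
}

// Illustrative stand-in for CometError::NumericValueOutOfRange.
#[derive(Debug)]
enum CastError {
    NumericValueOutOfRange { value: String, precision: u8, scale: i8 },
    Arrow(String),
}

fn cast_float64_to_decimal128(
    input: &Float64Array,
    precision: u8,
    scale: i8,
    eval_mode: EvalMode,
) -> Result<Decimal128Array, CastError> {
    let mul = 10_f64.powi(scale as i32);
    let mut cast_array = Decimal128Builder::with_capacity(input.len());
    for i in 0..input.len() {
        if input.is_null(i) {
            cast_array.append_null();
            continue;
        }
        let input_value = input.value(i);
        // Scale by 10^scale and round to the nearest integer, then check
        // that the unscaled value fits in the target precision.
        let v = (input_value * mul).round() as i128;
        if Decimal128Type::validate_decimal_precision(v, precision).is_err() {
            if eval_mode == EvalMode::Ansi {
                // ANSI mode: out-of-range values raise an error.
                return Err(CastError::NumericValueOutOfRange {
                    value: input_value.to_string(),
                    precision,
                    scale,
                });
            }
            // Legacy mode: out-of-range values become null.
            cast_array.append_null();
        } else {
            cast_array.append_value(v);
        }
    }
    cast_array
        .finish()
        .with_precision_and_scale(precision, scale)
        .map_err(|e| CastError::Arrow(e.to_string()))
}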
How are these changes tested?
The CometCastSuite test case passes.