[SPARK-9069] [SPARK-9264] [SQL] remove unlimited precision support for DecimalType #7605
Conversation
Test build #38139 has finished for PR 7605 at commit
Can we remove this? Declaring a zero-arg apply method is pretty error prone.
Is it a public API?
Ah damn. Maybe deprecate it for now? Take a look to see how many places use it. If possible, let's remove its usage.
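One way to follow the suggestion above is to keep the zero-arg apply for source compatibility while steering callers away from it. A minimal, self-contained sketch (the class shape and version string are illustrative, not the actual Spark source):

```scala
// Illustrative stand-in for org.apache.spark.sql.types.DecimalType.
case class DecimalType(precision: Int, scale: Int)

object DecimalType {
  // Keep the error-prone zero-arg apply around, but deprecate it
  // so new code is pushed toward the explicit form.
  @deprecated("use DecimalType(precision, scale) instead", "1.5.0")
  def apply(): DecimalType = DecimalType(10, 0)
}
```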
this is different from our offline discussion, isn't it?
It will break some tests, so I'd like to do it separately, https://issues.apache.org/jira/browse/SPARK-9281?filter=-2.
@davies can you point out which parts of the code changes require more careful review? Most of the changes just replace Unlimited with Maximum.
Test build #38161 has finished for PR 7605 at commit
Test build #38168 has finished for PR 7605 at commit
Test build #38191 has finished for PR 7605 at commit
Conflicts:
  sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/UnsafeFixedWidthAggregationMapSuite.scala
  sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/UnsafeRowConverterSuite.scala
@rxin I think most of the changes deserve attention: Unlimited should be replaced by Default or Maximum (they are different), especially where we change the precision and scale for expressions (in HiveTypeCoercion and aggregation).
Test build #38234 has finished for PR 7605 at commit
Should we take Double as the higher type for Decimal?
Test build #38266 has finished for PR 7605 at commit
Test build #38274 has finished for PR 7605 at commit
Test build #38276 has finished for PR 7605 at commit
@rxin Could you take another round on this, especially HiveTypeCoercion?
Test build #1190 has finished for PR 7605 at commit
We should update the documentation for findTightestCommonTypeOfTwo, which still says we don't do anything w.r.t. decimal.
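For illustration, a "tightest common type" of two decimals can be found by keeping the larger scale and the larger integral range. A hedged sketch (the helper name is made up, and this ignores the precision cap):

```scala
// Sketch: combine two decimal types (p1, s1) and (p2, s2) into one
// wide enough to hold values of either, ignoring any precision cap.
def widerDecimal(p1: Int, s1: Int, p2: Int, s2: Int): (Int, Int) = {
  val scale = math.max(s1, s2)            // larger fractional part
  val range = math.max(p1 - s1, p2 - s2)  // larger integral part
  (range + scale, scale)
}

println(widerDecimal(10, 0, 38, 18)) // (38,18)
```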
how come you added this, but not for fractional type?
ok see my comment below -- i think you want to handle them here as well.
I find SYSTEM_DEFAULT vs. USER_DEFAULT confusing; it is unclear what the semantics are. I think what you had previously (Default and Max) was easier to understand.
@davies since this is a larger patch, I'm going to merge this first.
@rxin Actually (38, 18) is not the maximum one; (38, 38) may be. (10, 0) is the default when a user defines a schema; (38, 18) is the one used when we infer the type from a BigDecimal, or in other cases where we use it as the default internally, because it balances precision between range (left side) and scale (right side). The names are borrowed from Hive; I find they make more sense than DEFAULT and MAXIMUM.
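The two defaults described above can be sketched as constants. This is a simplified stand-in for the real DecimalType in org.apache.spark.sql.types; the names come from this discussion:

```scala
case class DecimalType(precision: Int, scale: Int)

object DecimalType {
  // Used internally, e.g. when inferring a type from a BigDecimal:
  // precision balanced between integral range and fractional scale.
  val SYSTEM_DEFAULT = DecimalType(38, 18)
  // Used when a user declares a decimal column without precision/scale.
  val USER_DEFAULT = DecimalType(10, 0)
}
```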
Address comments for #7605

cc rxin

Author: Davies Liu <davies@databricks.com>

Closes #7634 from davies/decimal_unlimited2 and squashes the following commits:

b2d8b0d [Davies Liu] add doc and test for DecimalType.isWiderThan
65b251c [Davies Liu] fix test
6a91f32 [Davies Liu] fix style
ca9c973 [Davies Liu] address comments
Remove Decimal.Unlimited (change to support precision up to 38, to match Hive and other databases).
In order to keep backward source compatibility, Decimal.Unlimited is still there, but it now means Decimal(38, 18).
If no precision and scale are provided, it is Decimal(10, 0), as before.
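With precision now capped at 38, result types for arithmetic must be capped as well. A hedged sketch of the Hive-style rule for addition (the helper name is illustrative; Spark's actual rules live in HiveTypeCoercion/DecimalPrecision):

```scala
val MaxPrecision = 38

// SQL-standard result type for d1 + d2: one extra digit of integral
// range beyond the wider operand, capped at the maximum precision.
def addResultType(p1: Int, s1: Int, p2: Int, s2: Int): (Int, Int) = {
  val scale = math.max(s1, s2)
  val precision = math.max(p1 - s1, p2 - s2) + scale + 1
  (math.min(precision, MaxPrecision), scale)
}

println(addResultType(10, 0, 10, 0))   // (11,0)
println(addResultType(38, 18, 38, 18)) // (38,18): capped at 38
```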