Conversation

@davies (Contributor) commented Jul 23, 2015

Remove Decimal.Unlimited (change DecimalType to support precision up to 38, to match Hive and other databases).

In order to keep backward source compatibility, Decimal.Unlimited is still there, but it is now changed to Decimal(38, 18).

If no precision and scale are provided, it is Decimal(10, 0), as before.
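
A minimal sketch of the behaviour described above (illustrative only, not the patch itself; it only assumes the public `DecimalType(precision, scale)` constructor in `org.apache.spark.sql.types`):

```scala
import org.apache.spark.sql.types.DecimalType

// Precision is now capped at 38, matching Hive and other databases.
val explicit     = DecimalType(20, 4)   // user-specified precision and scale
val userDefault  = DecimalType(10, 0)   // what a plain DECIMAL resolves to
val wasUnlimited = DecimalType(38, 18)  // what the old Decimal.Unlimited now maps to
```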

@davies davies changed the title [SQL] [WIP] remove Decimal.Unlimited [SQL] [WIP] remove unlimited precision support for DecimalType Jul 23, 2015
@SparkQA commented Jul 23, 2015

Test build #38139 has finished for PR 7605 at commit c9c7c78.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class UnresolvedFunction(
    • case class Average(child: Expression) extends AlgebraicAggregate
    • case class Count(child: Expression) extends AlgebraicAggregate
    • case class First(child: Expression) extends AlgebraicAggregate
    • case class Last(child: Expression) extends AlgebraicAggregate
    • case class Max(child: Expression) extends AlgebraicAggregate
    • case class Min(child: Expression) extends AlgebraicAggregate
    • case class Sum(child: Expression) extends AlgebraicAggregate
    • abstract class AlgebraicAggregate extends AggregateFunction2 with Serializable
    • implicit class RichAttribute(a: AttributeReference)
    • trait AggregateExpression1 extends AggregateExpression
    • trait PartialAggregate1 extends AggregateExpression1
    • case class Min(child: Expression) extends UnaryExpression with PartialAggregate1
    • case class MinFunction(expr: Expression, base: AggregateExpression1) extends AggregateFunction1
    • case class Max(child: Expression) extends UnaryExpression with PartialAggregate1
    • case class MaxFunction(expr: Expression, base: AggregateExpression1) extends AggregateFunction1
    • case class Count(child: Expression) extends UnaryExpression with PartialAggregate1
    • case class CountFunction(expr: Expression, base: AggregateExpression1) extends AggregateFunction1
    • case class CountDistinct(expressions: Seq[Expression]) extends PartialAggregate1
    • case class CollectHashSet(expressions: Seq[Expression]) extends AggregateExpression1
    • case class CombineSetsAndCount(inputSet: Expression) extends AggregateExpression1
    • case class Average(child: Expression) extends UnaryExpression with PartialAggregate1
    • case class AverageFunction(expr: Expression, base: AggregateExpression1)
    • case class Sum(child: Expression) extends UnaryExpression with PartialAggregate1
    • case class SumFunction(expr: Expression, base: AggregateExpression1) extends AggregateFunction1
    • case class CombineSum(child: Expression) extends AggregateExpression1
    • case class CombineSumFunction(expr: Expression, base: AggregateExpression1)
    • case class SumDistinct(child: Expression) extends UnaryExpression with PartialAggregate1
    • case class SumDistinctFunction(expr: Expression, base: AggregateExpression1)
    • case class CombineSetsAndSum(inputSet: Expression, base: Expression) extends AggregateExpression1
    • case class First(child: Expression) extends UnaryExpression with PartialAggregate1
    • case class FirstFunction(expr: Expression, base: AggregateExpression1) extends AggregateFunction1
    • case class Last(child: Expression) extends UnaryExpression with PartialAggregate1
    • case class LastFunction(expr: Expression, base: AggregateExpression1) extends AggregateFunction1
    • case class FormatString(children: Expression*) extends Expression with ImplicitCastInputTypes
    • case class Aggregate2Sort(
    • case class FinalAndCompleteAggregate2Sort(
    • class GroupingIterator(
    • class PartialSortAggregationIterator(
    • class PartialMergeSortAggregationIterator(
    • class FinalSortAggregationIterator(
    • class FinalAndCompleteSortAggregationIterator(
    • abstract class UserDefinedAggregateFunction extends Serializable
    • case class ScalaUDAF(

Contributor:

Can we remove this? Declaring a zero-arg apply method is pretty error prone.
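
As a self-contained illustration of why a zero-arg apply is error prone (a sketch using a made-up `DecimalLike` class, not Spark's actual `DecimalType` code):

```scala
object ZeroArgApplyPitfall {
  case class DecimalLike(precision: Int, scale: Int)

  object DecimalLike {
    // A zero-arg apply that silently picks defaults.
    def apply(): DecimalLike = DecimalLike(10, 0)
  }

  val withDefaults: DecimalLike = DecimalLike()   // an instance carrying the implicit defaults
  // val oops: DecimalLike = DecimalLike          // does not compile: this is the companion
                                                  // object, easy to write when DecimalLike()
                                                  // was intended
}
```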

Contributor Author:

Is it a public API?

Contributor:

Ah damn. Maybe deprecate it for now? Take a look to see how many places use it. If possible, let's remove its usage.
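
A hedged sketch of what "deprecate it for now" could look like; the method name, message, and version string here are placeholders, not what the merged patch actually uses:

```scala
import org.apache.spark.sql.types.DecimalType

object DecimalCompat {
  // Keeps an old zero-arg entry point compiling while steering callers
  // towards spelling out precision and scale.
  @deprecated("Use DecimalType(precision, scale) explicitly", "1.5.0")
  def defaultType(): DecimalType = DecimalType(10, 0)
}
```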

@davies davies changed the title [SQL] [WIP] remove unlimited precision support for DecimalType [SPARK-9069] [SQL] remove unlimited precision support for DecimalType Jul 23, 2015
Contributor:

this is different from our offline discussion, isn't it?

Contributor Author:

It will break some tests, so I'd like to do it separately, https://issues.apache.org/jira/browse/SPARK-9281?filter=-2.

@rxin (Contributor) commented Jul 23, 2015

@davies can you point out which parts of the code changes require more careful review? Most of the changes just replace Unlimited with Maximum.

@SparkQA commented Jul 23, 2015

Test build #38161 has finished for PR 7605 at commit 1779bde.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Jul 23, 2015

Test build #38168 has finished for PR 7605 at commit 788631c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Jul 23, 2015

Test build #38191 has finished for PR 7605 at commit 8d783cc.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Davies Liu added 2 commits July 23, 2015 08:20
Conflicts:
	sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/UnsafeFixedWidthAggregationMapSuite.scala
	sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/UnsafeRowConverterSuite.scala
@davies (Contributor Author) commented Jul 23, 2015

@rxin I think most of the changes need careful attention: Unlimited should be replaced by either Default or Maximum, and they are different.

Especially pay attention to how we change the precision and scale for expressions (in HiveTypeCoercion and aggregation).
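
As a hedged illustration of the kind of precision/scale propagation being discussed, here is the widely used Hive-style rule for decimal addition (a sketch only; the DecimalPrecision changes in the patch also cap results at precision 38 and cover the other operators):

```scala
// Result type for d1: Decimal(p1, s1) + d2: Decimal(p2, s2), following the
// common Hive/SQL Server convention; not the literal code from this patch.
def decimalAddType(p1: Int, s1: Int, p2: Int, s2: Int): (Int, Int) = {
  val scale     = math.max(s1, s2)
  val precision = math.max(p1 - s1, p2 - s2) + scale + 1
  (precision, scale)  // callers would still need to cap precision at 38
}
```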

@davies davies changed the title [SPARK-9069] [SQL] remove unlimited precision support for DecimalType [SPARK-9069] [SPARK-9264] [SQL] remove unlimited precision support for DecimalType Jul 23, 2015
@SparkQA commented Jul 23, 2015

Test build #38234 has finished for PR 7605 at commit 06727fd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor Author:

Should we take Double as the higher type for Decimal?

@SparkQA commented Jul 23, 2015

Test build #38266 has finished for PR 7605 at commit bfaae35.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class ChangePrecision(child: Expression) extends UnaryExpression
    • case class DecimalType(precision: Int, scale: Int) extends FractionalType
    • case class DecimalConversion(precision: Int, scale: Int) extends JDBCConversion

@SparkQA commented Jul 23, 2015

Test build #38274 has finished for PR 7605 at commit fb0d20d.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class ChangePrecision(child: Expression) extends UnaryExpression
    • case class DecimalType(precision: Int, scale: Int) extends FractionalType
    • case class DecimalConversion(precision: Int, scale: Int) extends JDBCConversion

@SparkQA commented Jul 23, 2015

Test build #38276 has finished for PR 7605 at commit aa3f115.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class ChangePrecision(child: Expression) extends UnaryExpression
    • case class DecimalType(precision: Int, scale: Int) extends FractionalType
    • case class DecimalConversion(precision: Int, scale: Int) extends JDBCConversion

@davies (Contributor Author) commented Jul 23, 2015

@rxin Could you take another pass at this? Especially the HiveTypeCoercion changes.

@SparkQA commented Jul 24, 2015

Test build #1190 has finished for PR 7605 at commit aa3f115.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor:

We should update the documentation for findTightestCommonTypeOfTwo, which still says we don't do anything w.r.t. decimal.

Contributor:

how come you added this, but not for fractional type?

Contributor:

OK, see my comment below -- I think you want to handle them here as well.

@rxin (Contributor) commented Jul 24, 2015

I find SYSTEM_DEFAULT vs. USER_DEFAULT confusing; it is unclear what the semantics are. I think what you had previously, Default and Max, was easier to understand.

@rxin (Contributor) commented Jul 24, 2015

@davies since this is a larger patch, I'm going to merge this first.

@asfgit asfgit closed this in 8a94eb2 Jul 24, 2015
@davies (Contributor Author) commented Jul 24, 2015

@rxin Actually, (38, 18) is not the maximum one; (38, 38) may be. (10, 0) is the default when a user defines a schema, and (38, 18) is used when we infer the type from a BigDecimal, or in other cases where we need a default internally, because it balances precision between range (digits to the left of the point) and scale (digits to the right). The names are borrowed from Hive; I find they make more sense than DEFAULT and MAXIMUM.
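
A hedged sketch of the two defaults described above (the constant names follow this comment and Hive's convention; the exact definitions in the merged patch may differ slightly):

```scala
import org.apache.spark.sql.types.DecimalType

object DecimalDefaults {
  val MAX_PRECISION  = 38                   // hard upper bound, matching Hive
  val USER_DEFAULT   = DecimalType(10, 0)   // DECIMAL declared without precision/scale
  val SYSTEM_DEFAULT = DecimalType(38, 18)  // inferred from BigDecimal / internal fallback
}
```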

asfgit pushed a commit that referenced this pull request Jul 24, 2015
Address comments for #7605

cc rxin

Author: Davies Liu <davies@databricks.com>

Closes #7634 from davies/decimal_unlimited2 and squashes the following commits:

b2d8b0d [Davies Liu] add doc and test for DecimalType.isWiderThan
65b251c [Davies Liu] fix test
6a91f32 [Davies Liu] fix style
ca9c973 [Davies Liu] address comments