Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -481,12 +481,12 @@ case class Add(
// TODO: do not reorder consecutive `Add`s with different `evalMode`
val reorderResult =
orderCommutative({ case Add(l, r, _) => Seq(l, r) }).reduce(Add(_, _, evalMode))
if (resolved && reorderResult.resolved && reorderResult.dataType == dataType) {
reorderResult
if (resolved && reorderResult.resolved && reorderResult.dataType != dataType) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for safe, can we do this for decimal type only ? e.g

left match {
  case _: DecimalType if resolved && reorderResult.resolved && reorderResult.dataType != dataType =>
    Cast(reorderResult, dataType)
  case _ => reorderResult
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ulysses-you the current code seems fine. If there is a new data type in the future, we can still avoid the issue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not worry about new data types, just want to avoid unnecessary logic for some unrelated data types like integer, float ..

// SPARK-40903: Append cast for the canonicalization of decimal Add if the result data type is
// changed. Otherwise, it may cause data checking error within ComplexTypeMergingExpression.
Cast(reorderResult, dataType)
Copy link
Contributor

@peter-toth peter-toth Nov 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be the same idea that has come up previously: #38379 (comment) but my concerns (#38379 (comment)) were wrong, so LGTM.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the cast is better after seconds thought.

} else {
// SPARK-40903: Avoid reordering decimal Add for canonicalization if the result data type is
// changed, which may cause data checking error within ComplexTypeMergingExpression.
withCanonicalizedChildren
reorderResult
}
}
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -206,4 +206,16 @@ class CanonicalizeSuite extends SparkFunSuite {
assert(!Add(Add(literal4, literal5), literal1).semanticEquals(
Add(Add(literal1, literal5), literal4)))
}

test("SPARK-40903: Append Casting if the canonicalization result type is changed") {
val d = Decimal(1.2)
val literal1 = Literal.create(d, DecimalType(2, 1))
val literal2 = Literal.create(d, DecimalType(12, 5))
val literal3 = Literal.create(d, DecimalType(12, 6))
val add1 = Add(literal1, Add(literal2, literal3)).canonicalized
val add2 = Add(Add(literal3, literal2), literal1).canonicalized
assert(add1.isInstanceOf[Cast] && add1.dataType == DecimalType(15, 6))
assert(add2.isInstanceOf[Cast] && add2.dataType == DecimalType(15, 6))
assert(add1 == add2)
}
}