@chenghao-intel
Contributor

When running a query like:

select datediff(cast(value as timestamp), cast('2002-03-21 00:00:00' as timestamp)) from src;

Spark SQL raises an exception:

[info] scala.MatchError: TimestampType (of class org.apache.spark.sql.catalyst.types.TimestampType$)
[info] at org.apache.spark.sql.catalyst.expressions.Cast.castToTimestamp(Cast.scala:77)
[info] at org.apache.spark.sql.catalyst.expressions.Cast.cast$lzycompute(Cast.scala:251)
[info] at org.apache.spark.sql.catalyst.expressions.Cast.cast(Cast.scala:247)
[info] at org.apache.spark.sql.catalyst.expressions.Cast.eval(Cast.scala:263)
[info] at org.apache.spark.sql.catalyst.optimizer.ConstantFolding$$anonfun$apply$5$$anonfun$applyOrElse$2.applyOrElse(Optimizer.scala:217)
[info] at org.apache.spark.sql.catalyst.optimizer.ConstantFolding$$anonfun$apply$5$$anonfun$applyOrElse$2.applyOrElse(Optimizer.scala:210)
[info] at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:144)
[info] at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4$$anonfun$apply$2.apply(TreeNode.scala:180)
[info] at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
[info] at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
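
The trace points at a pattern match inside `Cast.castToTimestamp` that has no branch for a `TimestampType` source. A minimal sketch of that failure mode, using simplified stand-in types rather than Catalyst's real classes (`CastSketch` and everything inside it are hypothetical names for illustration):

```scala
object CastSketch {
  // Simplified stand-ins for Catalyst data types (assumed, not Spark's classes).
  trait DataType
  case object StringType extends DataType
  case object LongType extends DataType
  case object TimestampType extends DataType

  // Picks a conversion based on the SOURCE type. There is deliberately no
  // TimestampType branch, so timestamp-to-timestamp falls through the match.
  def castToTimestamp(from: DataType): Any => Any = from match {
    case StringType => (v: Any) => java.sql.Timestamp.valueOf(v.asInstanceOf[String])
    case LongType   => (v: Any) => new java.sql.Timestamp(v.asInstanceOf[Long] * 1000L)
  }
}
```

Calling `CastSketch.castToTimestamp(CastSketch.TimestampType)` throws a `scala.MatchError` in this model, which mirrors the error reported in the trace.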

@chenghao-intel changed the title from "[SPARK3501] [SQL] Fix the bug of Hive SimpleUDF creates unnecessary type cast" to "[SPARK-3501] [SQL] Fix the bug of Hive SimpleUDF creates unnecessary type cast" on Sep 12, 2014
@chenghao-intel
Contributor Author

test this please.

@SparkQA

SparkQA commented Sep 12, 2014

QA tests have started for PR 2368 at commit b834ed4.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Sep 12, 2014

QA tests have finished for PR 2368 at commit b834ed4.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

Nit: I'd structure this as a guard to call out the two cases more clearly:

case (e, t) if (e.dataType == t) => e
case (e, t) => Cast(e, t)
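
For context, a self-contained sketch of how such a guard could be applied when pairing argument expressions with the types a UDF expects. The `GuardSketch` object, its toy `Expression` classes, and the `coerce` helper are assumptions made for illustration, not Spark's actual code:

```scala
object GuardSketch {
  // Toy stand-ins for Catalyst types and expressions (hypothetical).
  trait DataType
  case object StringType extends DataType
  case object TimestampType extends DataType

  trait Expression { def dataType: DataType }
  case class Literal(value: Any, dataType: DataType) extends Expression
  case class Cast(child: Expression, dataType: DataType) extends Expression

  // Wrap an argument in Cast only when its type differs from the expected one.
  def coerce(children: Seq[Expression], expected: Seq[DataType]): Seq[Expression] =
    children.zip(expected).map {
      case (e, t) if e.dataType == t => e          // already the right type: keep as-is
      case (e, t)                    => Cast(e, t) // otherwise insert a cast
    }
}
```

In this model, an argument that is already a `TimestampType` is passed through untouched, so no redundant `Cast` node is created for it.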

@marmbrus
Contributor

Thanks for fixing this! While I think eliminating the problem is great, it also seems like a bug to me that casting a timestamp to a timestamp throws an exception since none of the other datatypes do that. Mind adding a no-op case to the eval path as well?

@SparkQA

SparkQA commented Sep 15, 2014

QA tests have started for PR 2368 at commit 330a5c8.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Sep 15, 2014

QA tests have started for PR 2368 at commit b804abd.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Sep 15, 2014

Tests timed out after a configured wait of 120m.

@SparkQA

SparkQA commented Sep 15, 2014

Tests timed out after a configured wait of 120m.

@chenghao-intel
Contributor Author

retest this please.

@SparkQA

SparkQA commented Sep 15, 2014

QA tests have started for PR 2368 at commit b804abd.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Sep 15, 2014

Tests timed out after a configured wait of 120m.

@SparkQA

SparkQA commented Sep 16, 2014

QA tests have started for PR 2368 at commit 3f15731.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Sep 16, 2014

QA tests have finished for PR 2368 at commit 3f15731.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@chenghao-intel
Contributor Author

You're right, @marmbrus. The expression should work even without being optimized. I've added the no-op case in Cast and reverted the change in the Optimizer.

Contributor

Perhaps:

  private[this] lazy val cast: Any => Any = dataType match {
    case dt if dt == child.dataType => identity[Any] _
    case StringType => castToString
    case BinaryType => castToBinary
    case DecimalType => castToDecimal
    case TimestampType => castToTimestamp
    case BooleanType => castToBoolean
    case ByteType => castToByte
    case ShortType => castToShort
    case IntegerType => castToInt
    case FloatType => castToFloat
    case LongType => castToLong
    case DoubleType => castToDouble
  }
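
To illustrate why placing the same-type case first makes the cast a no-op, here is a small runnable model. It uses stand-in types and two hypothetical conversion helpers, not Catalyst's real `Cast` class, and handles only a couple of target types:

```scala
object IdentityCastSketch {
  // Simplified stand-ins for Catalyst data types (assumed for this sketch).
  trait DataType
  case object StringType extends DataType
  case object LongType extends DataType
  case object TimestampType extends DataType

  // Hypothetical per-target helpers; castToTimestamp has no TimestampType branch.
  def castToString(from: DataType): Any => Any = (v: Any) => v.toString
  def castToTimestamp(from: DataType): Any => Any = from match {
    case StringType => (v: Any) => java.sql.Timestamp.valueOf(v.asInstanceOf[String])
    case LongType   => (v: Any) => new java.sql.Timestamp(v.asInstanceOf[Long] * 1000L)
  }

  // Same-type check comes first, so the type-specific helpers are never
  // consulted when source and target types already agree.
  def cast(from: DataType, to: DataType): Any => Any = to match {
    case dt if dt == from => (v: Any) => v // plays the role of identity[Any] _
    case StringType       => castToString(from)
    case TimestampType    => castToTimestamp(from)
  }
}
```

Because cases are tried in order, `cast(TimestampType, TimestampType)` in this model returns the identity function instead of reaching the incomplete match in `castToTimestamp`.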

Contributor Author

Thank you, @marmbrus, that's cleaner. :)

@marmbrus
Contributor

Minor style suggestion, otherwise LGTM. Thanks!

@SparkQA

SparkQA commented Sep 17, 2014

QA tests have started for PR 2368 at commit 5c9c3a5.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Sep 17, 2014

QA tests have finished for PR 2368 at commit 5c9c3a5.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@marmbrus
Contributor

Thanks! Merged to master.

@asfgit asfgit closed this in 2c3cc76 Sep 19, 2014
@chenghao-intel chenghao-intel deleted the cast_exception branch October 9, 2014 04:51