Skip to content

Conversation

@petermaxlee
Copy link
Contributor

What changes were proposed in this pull request?

Spark currently throws exceptions for invalid casts for all other data types except date type. Somehow date type returns null. It should be consistent and throws analysis exception as well.

How was this patch tested?

Added a unit test case in CastSuite.

@SparkQA
Copy link

SparkQA commented Jul 26, 2016

Test build #62857 has finished for PR 14358 at commit 5419b85.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@petermaxlee
Copy link
Contributor Author

@cloud-fan can you take a look?

// It is never possible to compare result when hive return with exception,
// so we can return null
// NULL is more reasonable here, since the query itself obeys the grammar.
case _ => _ => null
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @yhuai @liancheng , do you remember why we have this behaviour at the beginning?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a design decision made in the original PR. See here: https://github.com/apache/spark/pull/2344/files#diff-258b71121d8d168e4d53cb5b6dc53ffeR166

I don't think we've ever discussed this case explicitly. This change seems reasonable to me.

@petermaxlee
Copy link
Contributor Author

Is this good to merge?

// to ensure we test every possible cast situation here
atomicTypes.zip(atomicTypes).foreach { case (from, to) =>
checkNullCast(from, to)
if (Cast.canCast(from, to)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this check? doesn;t from always equal to to here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not all atomicTypes can cast from each other? E.g. date.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

def canCast(from: DataType, to: DataType): Boolean = (from, to) match {
  case (fromType, toType) if fromType == toType => true
  ......

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah this is doing self casting - i read it wrong. let me remove it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

@cloud-fan
Copy link
Contributor

LGTM, pending jenkins.

@SparkQA
Copy link

SparkQA commented Jul 27, 2016

Test build #62910 has finished for PR 14358 at commit 0161896.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor

thanks, merging to master!

@asfgit asfgit closed this in ef0ccbc Jul 27, 2016
@rxin
Copy link
Contributor

rxin commented Jul 27, 2016

I would consider this a bug and put it in branch-2.0. I'm going to cherry pick.

asfgit pushed a commit that referenced this pull request Jul 27, 2016
Spark currently throws exceptions for invalid casts for all other data types except date type. Somehow date type returns null. It should be consistent and throws analysis exception as well.

Added a unit test case in CastSuite.

Author: petermaxlee <petermaxlee@gmail.com>

Closes #14358 from petermaxlee/SPARK-16729.

(cherry picked from commit ef0ccbc)
Signed-off-by: Reynold Xin <rxin@databricks.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants