Conversation

@wangyum (Member) commented Jul 26, 2019

What changes were proposed in this pull request?

spark-sql> select cast(1);
19/07/26 00:54:17 ERROR SparkSQLDriver: Failed in [select cast(1)]
java.lang.UnsupportedOperationException: empty.init
	at scala.collection.TraversableLike$class.init(TraversableLike.scala:451)
	at scala.collection.mutable.ArrayOps$ofInt.scala$collection$IndexedSeqOptimized$$super$init(ArrayOps.scala:234)
	at scala.collection.IndexedSeqOptimized$class.init(IndexedSeqOptimized.scala:135)
	at scala.collection.mutable.ArrayOps$ofInt.init(ArrayOps.scala:234)
	at org.apache.spark.sql.catalyst.analysis.FunctionRegistry$$anonfun$7$$anonfun$11.apply(FunctionRegistry.scala:565)
	at org.apache.spark.sql.catalyst.analysis.FunctionRegistry$$anonfun$7$$anonfun$11.apply(FunctionRegistry.scala:558)
	at scala.Option.getOrElse(Option.scala:121)

The reason is that we did not handle the case where validParametersCount.length == 0, because the constructor parameter types can be Expression, DataType, or Option. This PR adds handling for the case validParametersCount.length == 0.

How was this patch tested?

unit tests
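The failure and the fix described above can be sketched as follows (a minimal sketch, not the actual patch; the helper name is hypothetical):

```scala
// Why the stack trace ends in "empty.init": CAST's constructors take a
// DataType argument, so the Seq of Expression-only argument counts
// (validParametersCount) is empty, and .init on an empty Seq throws
// UnsupportedOperationException.
val crashed =
  try { Seq.empty[Int].init; false }
  catch { case _: UnsupportedOperationException => true }
assert(crashed)

// The fix adds an explicit branch for the empty case when formatting
// the expected-argument-count text (helper name hypothetical):
def expectedCounts(validParametersCount: Seq[Int]): String =
  validParametersCount.length match {
    case 0 => ""
    case 1 => validParametersCount.head.toString
    case _ =>
      validParametersCount.init.mkString("one of ", ", ", " and ") +
        validParametersCount.last
  }

assert(expectedCounts(Seq.empty) == "")
assert(expectedCounts(Seq(1, 2, 3)) == "one of 1, 2 and 3")
```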

@wangyum (Member, Author) commented Jul 26, 2019

cc @mgaido91

} else {
// Otherwise, find a constructor method that matches the number of arguments, and use that.
val params = Seq.fill(expressions.size)(classOf[Expression])
val f = constructors.find(_.getParameterTypes.toSeq == params).getOrElse {
@mgaido91 (Contributor) commented:

but here we're still filtering by expressions... am I missing something?
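For context, the lookup being questioned matches constructors by reflection against a list of Expression parameter types. A standalone sketch with toy classes (names hypothetical; the real code uses Spark's Expression hierarchy):

```scala
// Toy stand-ins for Spark's Expression classes.
class Expr
class Add(val left: Expr, val right: Expr) extends Expr

val expressions = Seq(new Expr, new Expr)
// Build the expected parameter list: one Expr class per argument.
val params = Seq.fill(expressions.size)(classOf[Expr])
// Find a constructor whose parameter types match exactly.
val f = classOf[Add].getConstructors.find(_.getParameterTypes.toSeq == params)
assert(f.isDefined) // Add has a two-Expr constructor
```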

@wangyum (Member, Author) replied:

Two reasons:

  1. It will throw an exception at CheckAnalysis if a parameter is a DataType, so we do not need to handle it here:
scala> spark.sql("select cast(int)")
org.apache.spark.sql.AnalysisException: cannot resolve '`int`' given input columns: []; line 1 pos 12;
'Project [unresolvedalias('cast('int), None)]
+- OneRowRelation

  at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$$nestedInanonfun$checkAnalysis$1$2.applyOrElse(CheckAnalysis.scala:113)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$$nestedInanonfun$checkAnalysis$1$2.applyOrElse(CheckAnalysis.scala:110)
  at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUp$2(TreeNode.scala:306)
  2. It's hard to tell whether a parameter is a DataType or an Expression here.

@mgaido91 (Contributor) replied:

So, I think the problem here is not that we are not considering the DataType args; the problem, IMHO, is that we are not handling the case validParametersCount == 0. In that case I think we should throw something like: Invalid arguments for function $name. Indeed, in the cast case, I'd argue it is not clear whether the number of arguments is 1 or 2, since it is written cast(1 as string), not cast(1, string)... So I don't think it is correct to state that the number of arguments for cast is 2.

@SparkQA commented Jul 26, 2019

Test build #108210 has finished for PR 25261 at commit 32a671c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Jul 31, 2019

Test build #108438 has finished for PR 25261 at commit 94af203.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mgaido91 (Contributor) left a comment

LGTM, just a comment on wording. Thanks.

    validParametersCount.last
val expectedErrorMsg = validParametersCount.length match {
  case 0 =>
    ""
@mgaido91 (Contributor) commented:

nit: in the 0 case, I'd prefer a more generic Invalid arguments for function $name instead of Invalid number of arguments for function $name, because, as I mentioned earlier, I think it is questionable what the number of arguments is in this case...

@wangyum (Member, Author) replied:

Done
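The wording settled on in the exchange above could be sketched like this (helper name hypothetical; a sketch of the agreed behavior, not the actual patch):

```scala
// When no Expression-only constructor exists, no meaningful argument
// count can be reported, so fall back to a generic message as suggested
// in the review; otherwise report the valid counts.
def invalidArgumentsError(name: String, validParametersCount: Seq[Int]): String =
  if (validParametersCount.isEmpty) {
    s"Invalid arguments for function $name"
  } else {
    s"Invalid number of arguments for function $name. " +
      s"Expected: ${validParametersCount.mkString(", ")}."
  }

assert(invalidArgumentsError("cast", Seq.empty) ==
  "Invalid arguments for function cast")
```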

wangyum added 2 commits August 1, 2019 10:02
# Conflicts:
#	sql/core/src/test/scala/org/apache/spark/sql/UDFSuite.scala
@SparkQA commented Aug 1, 2019

Test build #108500 has finished for PR 25261 at commit 3c46a84.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen (Member) commented Aug 1, 2019

Merged to master

@srowen srowen closed this in 4e7a4cd Aug 1, 2019
@wangyum wangyum deleted the SPARK-28521 branch August 1, 2019 23:46