@hvanhovell
Contributor

What changes were proposed in this pull request?

Currently, an unqualified getFunction(..) call returns an incorrect result: the returned function is shown as a temporary function without a database. For example:

scala> sql("create function fn1 as 'org.apache.hadoop.hive.ql.udf.generic.GenericUDFAbs'")
res0: org.apache.spark.sql.DataFrame = []

scala> spark.catalog.getFunction("fn1")
res1: org.apache.spark.sql.catalog.Function = Function[name='fn1', className='org.apache.hadoop.hive.ql.udf.generic.GenericUDFAbs', isTemporary='true']

This PR fixes this by adding database information to ExpressionInfo (which is used to store the function information).
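To make the idea concrete, here is a minimal sketch of a function-info holder that carries an optional database. It is a hypothetical, simplified stand-in and not the actual `ExpressionInfo` class: the class name `FunctionInfoSketch`, the `qualifiedName()` helper, the parameter order, and the `default` database are all assumptions made for illustration.

```java
// Hypothetical, simplified stand-in for ExpressionInfo; not the actual Spark class.
final class FunctionInfoSketch {
    private final String className;
    private final String db;   // null when the function is not registered in a database
    private final String name;

    FunctionInfoSketch(String className, String db, String name) {
        this.className = className;
        this.db = db;
        this.name = name;
    }

    String getClassName() { return className; }
    String getDb() { return db; }
    String getName() { return name; }

    // Returns "db.name" when a database is known, otherwise just "name".
    String qualifiedName() {
        return db != null ? db + "." + name : name;
    }

    public static void main(String[] args) {
        // Assumed example: a persistent function stored in a database named "default".
        FunctionInfoSketch persistent = new FunctionInfoSketch(
            "org.apache.hadoop.hive.ql.udf.generic.GenericUDFAbs", "default", "fn1");
        // Assumed example: a function with no database, e.g. a temporary one.
        FunctionInfoSketch temporary = new FunctionInfoSketch(
            "com.example.TempUdf", null, "fn2");

        System.out.println(persistent.qualifiedName()); // prints "default.fn1"
        System.out.println(temporary.qualifiedName());  // prints "fn2"
    }
}
```

Because the database travels with the function information, a caller that looked the function up by its unqualified name can still report which database it lives in.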

How was this patch tested?

Added more thorough tests to CatalogSuite.

@hvanhovell
Contributor Author

cc @srinathshankar

@SparkQA

SparkQA commented Oct 19, 2016

Test build #67161 has finished for PR 15542 at commit 6068e44.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 19, 2016

Test build #67171 has finished for PR 15542 at commit 5597d25.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

@srinathshankar left a comment

Some minor comments

return db;
}

public ExpressionInfo(String className, String name, String usage, String extended, String db) {
Contributor

Nit: Could we put the db name before the expression name in the params? Not a big deal if this requires more surgery.

Contributor Author

Done

case _ =>
try {
val info = sparkSession.sessionState.catalog.lookupFunctionInfo(functionName)
val db = if (info.getDb != null) info.getDb + "." else ""
Contributor

Could you inline the definition of val db into the definition of val name? You don't use it anywhere else.

Contributor Author

Done.

  description = null, // for now, this is always undefined
  className = metadata.getClassName,
- isTemporary = funcIdent.database.isEmpty)
+ isTemporary = metadata.getDb == null)
Contributor

Is this still the right way to do this? What about global temp tables?

Contributor Author

We do not support global temp functions, so this is the right way to do it.
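The difference between the two `isTemporary` rules in the diff can be sketched as follows. This is an illustrative model only, not Spark code: the class `TemporaryCheckSketch`, both method names, and the `default` database are hypothetical, and the sketch assumes (as stated in this thread) that there are no global temp functions, so a function without a database is temporary.

```java
import java.util.Optional;

// Hypothetical model of the two rules; not the actual Spark implementation.
final class TemporaryCheckSketch {

    // Old rule: inspects only the identifier the caller typed.
    // An unqualified name therefore always looks temporary.
    static boolean isTemporaryFromIdentifier(Optional<String> identifierDb) {
        return !identifierDb.isPresent();
    }

    // New rule: inspects the database recorded with the function metadata.
    // Assumes no global temp functions, so "no database" means temporary.
    static boolean isTemporaryFromMetadata(String metadataDb) {
        return metadataDb == null;
    }

    public static void main(String[] args) {
        // A persistent function in an assumed database "default",
        // looked up by its unqualified name.
        Optional<String> identifierDb = Optional.empty();
        String metadataDb = "default";

        System.out.println(isTemporaryFromIdentifier(identifierDb)); // prints "true" (the reported bug)
        System.out.println(isTemporaryFromMetadata(metadataDb));     // prints "false"
    }
}
```

Under the old rule the answer depends on how the caller spelled the name; under the new rule it depends only on where the function is actually registered.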

@SparkQA

SparkQA commented Nov 1, 2016

Test build #67883 has finished for PR 15542 at commit e8d1a7e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@hvanhovell
Contributor Author

Merging to master. Thanks for the review.

@asfgit closed this in f7c145d Nov 1, 2016
uzadude pushed a commit to uzadude/spark that referenced this pull request Jan 27, 2017
Author: Herman van Hovell <hvanhovell@databricks.com>

Closes apache#15542 from hvanhovell/SPARK-17996.