[SPARK-21589][SQL][DOC] Add documents about Hive UDF/UDTF/UDAF #18792

maropu · 2017-08-01T03:08:03Z

What changes were proposed in this pull request?

This PR adds documentation about the functions that are unsupported in Hive UDF/UDTF/UDAF.
It relates to #18768 and #18527.

How was this patch tested?

N/A

maropu · 2017-08-01T03:08:43Z

@gatorsmile If you get time, could you check this? Thanks!

gatorsmile · 2017-08-01T03:17:36Z

docs/sql-programming-guide.md

+Some of them are meaningless in Spark and the others are rarely used by users.
+Below is a list of major APIs we don't support in Spark SQL:
+
+* `getRequiredJars` and `getRequiredFiles` (`UDF` and `GenericUDF`) are functions to to automatically


to to -> to

gatorsmile · 2017-08-01T03:20:02Z

docs/sql-programming-guide.md

+* `initialize(StructObjectInspector)` in `GenericUDTF` is not supported yet. Spark SQL currently uses
+  a deprecated interface `initialize(ObjectInspector[])` only.
+* `configure` (`GenericUDF`, `GenericUDTF`, and `GenericUDAFEvaluator`) is a function to initialize
+  functions with `MapredContext`. But, Spark SQL does not use `MapredContext` internally.


functions with MapredContext, which is inapplicable to Spark.

gatorsmile · 2017-08-01T03:20:43Z

docs/sql-programming-guide.md

+* `configure` (`GenericUDF`, `GenericUDTF`, and `GenericUDAFEvaluator`) is a function to initialize
+  functions with `MapredContext`. But, Spark SQL does not use `MapredContext` internally.
+* `close` (`GenericUDF` and `GenericUDAFEvaluator`) is a function to release associated resources.
+  Spark SQL does not call this function when tasks finished.


finished -> finish

gatorsmile · 2017-08-01T03:22:41Z

docs/sql-programming-guide.md

+* `reset` (`GenericUDAFEvaluator`) is a function to re-initialize aggregation for reusing the same aggregation.
+  Spark SQL currently does not support the reuse of aggregation.
+* `getWindowingEvaluator` (`GenericUDAFEvaluator`) is a function to optimize aggregation by evaluating
+  an aggregate over a fixed window. Spark SQL does not support this optimization yet.


Please remove "Spark SQL does not support this optimization yet".

SparkQA · 2017-08-01T03:23:12Z

Test build #80103 has finished for PR 18792 at commit 1434bde.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

gatorsmile · 2017-08-01T03:24:02Z

docs/sql-programming-guide.md

+
+Spark SQL implements the basic functionality of the Hive UDF/UDTF/UDAF, but does not support all the APIs for users.
+Some of them are meaningless in Spark and the others are rarely used by users.
+Below is a list of major APIs we don't support in Spark SQL:


How about simplifying the whole paragraph to the following?

Not all the APIs of the Hive UDF/UDTF/UDAF are supported by Spark SQL. Below are the unsupported APIs:

gatorsmile · 2017-08-01T03:28:54Z

Thanks for working on it! Just left some minor comments.

maropu · 2017-08-01T04:25:31Z

@gatorsmile ok, fixed.

SparkQA · 2017-08-01T04:33:17Z

Test build #80106 has finished for PR 18792 at commit 7d07e6b.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2017-08-01T04:38:14Z

Test build #80107 has finished for PR 18792 at commit c703d57.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

viirya · 2017-08-01T05:41:13Z

docs/sql-programming-guide.md

+* `initialize(StructObjectInspector)` in `GenericUDTF` is not supported yet. Spark SQL currently uses
+  a deprecated interface `initialize(ObjectInspector[])` only.
+* `configure` (`GenericUDF`, `GenericUDTF`, and `GenericUDAFEvaluator`) is a function to initialize
+  functions with `MapredContext`, which is inapplicable to Spark. But, Spark SQL does not use `MapredContext` internally.


nit: "But" looks redundant here, because "inapplicable" already appears just before it. It reads like a double negative...

removed. Thanks!

gatorsmile · 2017-08-01T05:49:40Z

LGTM pending Jenkins

SparkQA · 2017-08-01T05:58:17Z

Test build #80109 has finished for PR 18792 at commit 29f1108.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

Add documents about Hive UDFs

1434bde

gatorsmile reviewed Aug 1, 2017

Apply fixes

c703d57

maropu force-pushed the HOTFIX-20170731 branch from 7d07e6b to c703d57 on August 1, 2017 04:23

viirya reviewed Aug 1, 2017

Fix

29f1108

asfgit closed this in 110695d Aug 1, 2017
