[SPARK-21589][SQL][DOC] Add documents about Hive UDF/UDTF/UDAF #18792
@gatorsmile If you get time, could you check this? Thanks!
docs/sql-programming-guide.md
Outdated
Some of them are meaningless in Spark and the others are rarely used by users.
Below is a list of major APIs we don't support in Spark SQL:

* `getRequiredJars` and `getRequiredFiles` (`UDF` and `GenericUDF`) are functions to to automatically
to to -> to
docs/sql-programming-guide.md
Outdated
* `initialize(StructObjectInspector)` in `GenericUDTF` is not supported yet. Spark SQL currently uses
a deprecated interface `initialize(ObjectInspector[])` only.
* `configure` (`GenericUDF`, `GenericUDTF`, and `GenericUDAFEvaluator`) is a function to initialize
functions with `MapredContext`. But, Spark SQL does not use `MapredContext` internally.
functions with `MapredContext`, which is inapplicable to Spark.
docs/sql-programming-guide.md
Outdated
* `configure` (`GenericUDF`, `GenericUDTF`, and `GenericUDAFEvaluator`) is a function to initialize
functions with `MapredContext`. But, Spark SQL does not use `MapredContext` internally.
* `close` (`GenericUDF` and `GenericUDAFEvaluator`) is a function to release associated resources.
Spark SQL does not call this function when tasks finished.
finished -> finish
docs/sql-programming-guide.md
Outdated
* `reset` (`GenericUDAFEvaluator`) is a function to re-initialize aggregation for reusing the same aggregation.
Spark SQL currently does not support the reuse of aggregation.
* `getWindowingEvaluator` (`GenericUDAFEvaluator`) is a function to optimize aggregation by evaluating
an aggregate over a fixed window. Spark SQL does not support this optimization yet.
Please remove "Spark SQL does not support this optimization yet".
Test build #80103 has finished for PR 18792 at commit
docs/sql-programming-guide.md
Outdated
Spark SQL implements the basic functionality of the Hive UDF/UDTF/UDAF, but does not support all the APIs for users.
Some of them are meaningless in Spark and the others are rarely used by users.
Below is a list of major APIs we don't support in Spark SQL:
How about simplifying the whole paragraph to the following?
Not all the APIs of the Hive UDF/UDTF/UDAF are supported by Spark SQL. Below are the unsupported APIs:
Thanks for working on it! Just left some minor comments.
@gatorsmile ok, fixed.
Test build #80106 has finished for PR 18792 at commit
Test build #80107 has finished for PR 18792 at commit
docs/sql-programming-guide.md
Outdated
* `initialize(StructObjectInspector)` in `GenericUDTF` is not supported yet. Spark SQL currently uses
a deprecated interface `initialize(ObjectInspector[])` only.
* `configure` (`GenericUDF`, `GenericUDTF`, and `GenericUDAFEvaluator`) is a function to initialize
functions with `MapredContext`, which is inapplicable to Spark. But, Spark SQL does not use `MapredContext` internally.
nit: "But" looks redundant here, because "inapplicable" already appears just before it. It reads like a double negative...
removed. Thanks!
LGTM pending Jenkins
Test build #80109 has finished for PR 18792 at commit
What changes were proposed in this pull request?
This PR adds documentation about the unsupported functions in Hive UDF/UDTF/UDAF.
This PR relates to #18768 and #18527.
How was this patch tested?
N/A