When will Hive user-defined functions (UDFs) be supported? #940
Comments
Hi @zwx109473, we recently added support for a Hive UDF, and it works well. |
We have drafted a doc summarizing how to support Hive UDFs in Gazelle. See #945. |
Do you mean that our columnar UrlDecoder implementation doesn't actually replace the original UDF at runtime? You can verify this by checking whether the Spark DAG for your test cases contains a columnar projection. |
Thank you for your help @PHILO-HE |
Yes, the current code does not yet handle permanent UDFs. I just found that a permanent UDF will be renamed as |
Thank you for your answer, @PHILO-HE. I managed to get it working. |
Hi @PHILO-HE,
I wrote several other functions, using the urldecoder function as a reference.
In function_registry_string.cc, I added the following methods:
The Spark plan is as follows:
However, an error is reported at runtime: “org.apache.arrow.gandiva.exceptions.GandivaException: Failed to make LLVM module due to Function bool is_empty(string) not supported yet.”
Why can't my functions be found by the FunctionRegistry? |
In function_registry_string.cc, BTW, please note that to test your changes in arrow, you should change the arrow branch to your own working branch in Line#65 of |
Hi @PHILO-HE, I made the correction you suggested, and a new error occurred.
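As background on this kind of lookup failure: Gandiva resolves a function by its full signature (name, parameter types, and return type), so a registration that differs in any one of those parts will not be found. The following is a minimal sketch of that idea only, not Gandiva's actual code. `Signature`, `ToyRegistry`, and the symbol name `is_empty_utf8` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a function signature.
// This is NOT Gandiva's real class; it only models "lookup by full signature".
struct Signature {
  std::string name;
  std::vector<std::string> param_types;
  std::string return_type;

  // Builds a canonical key such as "bool is_empty(string)".
  std::string ToKey() const {
    std::string key = return_type + " " + name + "(";
    for (std::size_t i = 0; i < param_types.size(); ++i) {
      if (i > 0) key += ", ";
      key += param_types[i];
    }
    return key + ")";
  }
};

// Hypothetical registry mapping a full signature to a native symbol name.
class ToyRegistry {
 public:
  void Register(const Signature& sig, const std::string& native_symbol) {
    assert(!native_symbol.empty());
    entries_[sig.ToKey()] = native_symbol;
  }

  // Returns the native symbol, or an empty string when no exact match exists.
  std::string Lookup(const Signature& sig) const {
    std::map<std::string, std::string>::const_iterator it =
        entries_.find(sig.ToKey());
    return it == entries_.end() ? std::string() : it->second;
  }

 private:
  std::map<std::string, std::string> entries_;
};
```

In this sketch, registering `is_empty` with one `string` parameter and then looking it up with an extra parameter, a different return type, or a misspelled name returns an empty result, which is the toy equivalent of a "not supported yet" error.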
|
In your arrow function,
|
@PHILO-HE, I've removed the unnecessary input parameters from the function, but that hasn't fixed the problem. I guess it might have something to do with the cast function. |
Could you share the links for your arrow/gazelle patch? |
Hi @PHILO-HE, this is my modification. Please take a look at the problem. |
@PHILO-HE, are there any new developments or suggestions? |
Sorry for the late reply. Please keep
With the above correction, you can resolve the reported exception. By the way, I noticed you tried to handle the null case inside your gandiva function. Actually, for null input, gandiva will directly return null since you specified |
@PHILO-HE, thank you very much for your help! I have solved the problem. |
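The null-handling point above can be illustrated with a small sketch: when the engine applies a "null in, null out" policy, the function body is never called for a null input, so it needs no null check of its own. This is an assumption-laden model, not Gandiva's implementation. `NullableString`, `NullableBool`, `IsEmptyKernel`, and `EvalNullIfNull` are hypothetical names.

```cpp
#include <cassert>
#include <string>

// Hypothetical value-plus-validity pairs, loosely modelling a validity bit.
struct NullableString {
  std::string value;
  bool is_valid;
};

struct NullableBool {
  bool value;
  bool is_valid;
};

// Hypothetical function body: it only ever sees valid (non-null) input,
// so it contains no null handling.
bool IsEmptyKernel(const std::string& value) { return value.empty(); }

// Hypothetical engine-side wrapper implementing "null in, null out".
NullableBool EvalNullIfNull(const NullableString& input,
                            bool (*kernel)(const std::string&)) {
  assert(kernel != nullptr);
  NullableBool result = {false, false};
  if (!input.is_valid) {
    return result;  // Null input: the kernel is skipped and the result is null.
  }
  result.value = kernel(input.value);
  result.is_valid = true;
  return result;
}
```

With this shape, any null check written inside the function body is dead code, because the wrapper returns a null result before the body runs.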
We are using the native-sql-engine plugin, but in real scenarios Spark's built-in functions cannot meet our business requirements.
Our SQL statements contain many user-defined functions (UDFs), which causes the operators to fall back, so performance does not improve significantly.
Is there a solution to this problem, or a plan to support it?