Skip to content

Conversation

@huaxingao
Copy link
Contributor

@huaxingao huaxingao commented Jul 11, 2019

What changes were proposed in this pull request?

This PR adds some tests converted from outer-join.sql to test UDFs. Please see contribution guide of this umbrella ticket - SPARK-27921.

Diff comparing to 'outer-join.sql'

diff --git a/sql/core/src/test/resources/sql-tests/results/outer-join.sql.out b/sql/core/src/test/resources/sql-tests/results/udf/udf-outer-join.sql.out
index 5db3bae5d0..819f786070 100644
--- a/sql/core/src/test/resources/sql-tests/results/outer-join.sql.out
+++ b/sql/core/src/test/resources/sql-tests/results/udf/udf-outer-join.sql.out
@@ -24,17 +24,17 @@ struct<>
 
 -- !query 2
 SELECT
-  (SUM(COALESCE(t1.int_col1, t2.int_col0))),
-     ((COALESCE(t1.int_col1, t2.int_col0)) * 2)
+  (udf(SUM(udf(COALESCE(t1.int_col1, t2.int_col0))))),
+     (udf(COALESCE(t1.int_col1, t2.int_col0)) * 2)
 FROM t1
 RIGHT JOIN t2
-  ON (t2.int_col0) = (t1.int_col1)
-GROUP BY GREATEST(COALESCE(t2.int_col1, 109), COALESCE(t1.int_col1, -449)),
+  ON udf(t2.int_col0) = udf(t1.int_col1)
+GROUP BY udf(GREATEST(COALESCE(udf(t2.int_col1), 109), COALESCE(t1.int_col1, udf(-449)))),
          COALESCE(t1.int_col1, t2.int_col0)
-HAVING (SUM(COALESCE(t1.int_col1, t2.int_col0)))
-            > ((COALESCE(t1.int_col1, t2.int_col0)) * 2)
+HAVING (udf(SUM(COALESCE(udf(t1.int_col1), udf(t2.int_col0)))))
+            > (udf(COALESCE(t1.int_col1, t2.int_col0)) * 2)
 -- !query 2 schema
-struct<sum(coalesce(int_col1, int_col0)):bigint,(coalesce(int_col1, int_col0) * 2):int>
+struct<CAST(udf(cast(sum(cast(cast(udf(cast(coalesce(int_col1, int_col0) as string)) as int) as bigint)) as string)) AS BIGINT):bigint,(CAST(udf(cast(coalesce(int_col1, int_col0) as string)) AS INT) * 2):int>
 -- !query 2 output
 -367   -734
 -507   -1014
@@ -70,10 +70,10 @@ spark.sql.crossJoin.enabled true
 SELECT *
 FROM (
 SELECT
-    COALESCE(t2.int_col1, t1.int_col1) AS int_col
+    udf(COALESCE(udf(t2.int_col1), udf(t1.int_col1))) AS int_col
     FROM t1
     LEFT JOIN t2 ON false
-) t where (t.int_col) is not null
+) t where (udf(t.int_col)) is not null
 -- !query 6 schema
 struct<int_col:int>
 -- !query 6 output

How was this patch tested?

Tested as guided in SPARK-27921.

@SparkQA
Copy link

SparkQA commented Jul 11, 2019

Test build #107493 has finished for PR 25103 at commit 91d6955.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jul 18, 2019

Test build #107801 has finished for PR 25103 at commit 5955d46.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member

retest this please

@HyukjinKwon
Copy link
Member

Looks good to me in general if the tests pass

@SparkQA
Copy link

SparkQA commented Jul 18, 2019

Test build #107816 has finished for PR 25103 at commit 5955d46.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@huaxingao
Copy link
Contributor Author

retest this please

@SparkQA
Copy link

SparkQA commented Jul 19, 2019

Test build #107861 has finished for PR 25103 at commit 5955d46.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member

@huaxingao, can you update the diff comparing to the original file? Looks a bit weird.

@huaxingao
Copy link
Contributor Author

diff updated. @HyukjinKwon

@HyukjinKwon
Copy link
Member

Merged to master.

@huaxingao
Copy link
Contributor Author

Thanks a lot for your help! @HyukjinKwon

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants