[SPARK-28282][SQL][PYTHON][TESTS] Convert and port 'inline-table.sql' into UDF test base #25124
Closed
5 commits
54 changes: 54 additions & 0 deletions
sql/core/src/test/resources/sql-tests/inputs/udf/udf-inline-table.sql
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,54 @@ | ||
| -- This test file was converted from inline-table.sql. | ||
| -- [SPARK-28291] UDFs cannot be evaluated within inline table definition | ||
| -- TODO: We should add UDFs in VALUES clause when [SPARK-28291] is resolved. | ||
| | ||
| -- single row, without table and column alias | ||
| select udf(col1), udf(col2) from values ("one", 1); | ||
| | ||
| -- single row, without column alias | ||
| select udf(col1), udf(udf(col2)) from values ("one", 1) as data; | ||
| | ||
| -- single row | ||
| select udf(a), b from values ("one", 1) as data(a, b); | ||
| | ||
| -- single column multiple rows | ||
| select udf(a) from values 1, 2, 3 as data(a); | ||
| | ||
| -- three rows | ||
| select udf(a), b from values ("one", 1), ("two", 2), ("three", null) as data(a, b); | ||
| | ||
| -- null type | ||
| select udf(a), b from values ("one", null), ("two", null) as data(a, b); | ||
| | ||
| -- int and long coercion | ||
| select udf(a), b from values ("one", 1), ("two", 2L) as data(a, b); | ||
| | ||
| -- foldable expressions | ||
| select udf(udf(a)), udf(b) from values ("one", 1 + 0), ("two", 1 + 3L) as data(a, b); | ||
| | ||
| -- complex types | ||
| select udf(a), b from values ("one", array(0, 1)), ("two", array(2, 3)) as data(a, b); | ||
| | ||
| -- decimal and double coercion | ||
| select udf(a), b from values ("one", 2.0), ("two", 3.0D) as data(a, b); | ||
| | ||
| -- error reporting: nondeterministic function rand | ||
| select udf(a), b from values ("one", rand(5)), ("two", 3.0D) as data(a, b); | ||
| | ||
| -- error reporting: different number of columns | ||
| select udf(a), udf(b) from values ("one", 2.0), ("two") as data(a, b); | ||
| | ||
| -- error reporting: types that are incompatible | ||
| select udf(a), udf(b) from values ("one", array(0, 1)), ("two", struct(1, 2)) as data(a, b); | ||
| | ||
| -- error reporting: number aliases different from number data values | ||
| select udf(a), udf(b) from values ("one"), ("two") as data(a, b); | ||
| | ||
| -- error reporting: unresolved expression | ||
| select udf(a), udf(b) from values ("one", random_not_exist_func(1)), ("two", 2) as data(a, b); | ||
| | ||
| -- error reporting: aggregate expression | ||
| select udf(a), udf(b) from values ("one", count(1)), ("two", 2) as data(a, b); | ||
| | ||
| -- string to timestamp | ||
| select udf(a), b from values (timestamp('1991-12-06 00:00:00.0'), array(timestamp('1991-12-06 01:00:00.0'), timestamp('1991-12-06 12:00:00.0'))) as data(a, b); | ||
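
The two column-count cases in this file ("different number of columns" and "number aliases different from number data values") expect the messages `expected 2 columns but found 1 columns in row 1` and `... in row 0` in the accompanying result file. A minimal Python sketch of a check that is consistent with those two messages follows. It is an illustration only, not Spark's analyzer code; the function name `check_row_arity` and its signature are hypothetical.

```python
from typing import Optional, Sequence


def check_row_arity(names: Sequence[str], rows: Sequence[Sequence[object]]) -> Optional[str]:
    """Return a message for the first row whose width differs from the alias list, else None.

    Hypothetical helper: it only mirrors the wording of the expected
    error messages in this PR, with zero-based row numbering.
    """
    expected = len(names)
    for index, row in enumerate(rows):
        if len(row) != expected:
            return f"expected {expected} columns but found {len(row)} columns in row {index}"
    return None


# values ("one", 2.0), ("two") as data(a, b): the second row is too short
print(check_row_arity(["a", "b"], [("one", 2.0), ("two",)]))

# values ("one"), ("two") as data(a, b): the first row is already too short
print(check_row_arity(["a", "b"], [("one",), ("two",)]))
```

Reporting only the first offending row is an assumption taken from the second case, where both rows are short but the expected message names row 0.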
153 changes: 153 additions & 0 deletions
sql/core/src/test/resources/sql-tests/results/udf/udf-inline-table.sql.out
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,153 @@ | ||
| -- Automatically generated by SQLQueryTestSuite | ||
| -- Number of queries: 17 | ||
| | ||
| | ||
| -- !query 0 | ||
| select udf(col1), udf(col2) from values ("one", 1) | ||
| -- !query 0 schema | ||
| struct<CAST(udf(cast(col1 as string)) AS STRING):string,CAST(udf(cast(col2 as string)) AS INT):int> | ||
| -- !query 0 output | ||
| one 1 | ||
| | ||
| | ||
| -- !query 1 | ||
| select udf(col1), udf(udf(col2)) from values ("one", 1) as data | ||
| -- !query 1 schema | ||
| struct<CAST(udf(cast(col1 as string)) AS STRING):string,CAST(udf(cast(cast(udf(cast(col2 as string)) as int) as string)) AS INT):int> | ||
| -- !query 1 output | ||
| one 1 | ||
| | ||
| | ||
| -- !query 2 | ||
| select udf(a), b from values ("one", 1) as data(a, b) | ||
| -- !query 2 schema | ||
| struct<CAST(udf(cast(a as string)) AS STRING):string,b:int> | ||
| -- !query 2 output | ||
| one 1 | ||
| | ||
| | ||
| -- !query 3 | ||
| select udf(a) from values 1, 2, 3 as data(a) | ||
| -- !query 3 schema | ||
| struct<CAST(udf(cast(a as string)) AS INT):int> | ||
| -- !query 3 output | ||
| 1 | ||
| 2 | ||
| 3 | ||
| | ||
| | ||
| -- !query 4 | ||
| select udf(a), b from values ("one", 1), ("two", 2), ("three", null) as data(a, b) | ||
| -- !query 4 schema | ||
| struct<CAST(udf(cast(a as string)) AS STRING):string,b:int> | ||
| -- !query 4 output | ||
| one 1 | ||
| three NULL | ||
| two 2 | ||
| | ||
| | ||
| -- !query 5 | ||
| select udf(a), b from values ("one", null), ("two", null) as data(a, b) | ||
| -- !query 5 schema | ||
| struct<CAST(udf(cast(a as string)) AS STRING):string,b:null> | ||
| -- !query 5 output | ||
| one NULL | ||
| two NULL | ||
| | ||
| | ||
| -- !query 6 | ||
| select udf(a), b from values ("one", 1), ("two", 2L) as data(a, b) | ||
| -- !query 6 schema | ||
| struct<CAST(udf(cast(a as string)) AS STRING):string,b:bigint> | ||
| -- !query 6 output | ||
| one 1 | ||
| two 2 | ||
| | ||
| | ||
| -- !query 7 | ||
| select udf(udf(a)), udf(b) from values ("one", 1 + 0), ("two", 1 + 3L) as data(a, b) | ||
| -- !query 7 schema | ||
| struct<CAST(udf(cast(cast(udf(cast(a as string)) as string) as string)) AS STRING):string,CAST(udf(cast(b as string)) AS BIGINT):bigint> | ||
| -- !query 7 output | ||
| one 1 | ||
| two 4 | ||
| | ||
| | ||
| -- !query 8 | ||
| select udf(a), b from values ("one", array(0, 1)), ("two", array(2, 3)) as data(a, b) | ||
| -- !query 8 schema | ||
| struct<CAST(udf(cast(a as string)) AS STRING):string,b:array<int>> | ||
| -- !query 8 output | ||
| one [0,1] | ||
| two [2,3] | ||
| | ||
| | ||
| -- !query 9 | ||
| select udf(a), b from values ("one", 2.0), ("two", 3.0D) as data(a, b) | ||
| -- !query 9 schema | ||
| struct<CAST(udf(cast(a as string)) AS STRING):string,b:double> | ||
| -- !query 9 output | ||
| one 2.0 | ||
| two 3.0 | ||
| | ||
| | ||
| -- !query 10 | ||
| select udf(a), b from values ("one", rand(5)), ("two", 3.0D) as data(a, b) | ||
| -- !query 10 schema | ||
| struct<> | ||
| -- !query 10 output | ||
| org.apache.spark.sql.AnalysisException | ||
| cannot evaluate expression rand(5) in inline table definition; line 1 pos 37 | ||
| | ||
| | ||
| -- !query 11 | ||
| select udf(a), udf(b) from values ("one", 2.0), ("two") as data(a, b) | ||
| -- !query 11 schema | ||
| struct<> | ||
| -- !query 11 output | ||
| org.apache.spark.sql.AnalysisException | ||
| expected 2 columns but found 1 columns in row 1; line 1 pos 27 | ||
| | ||
| | ||
| -- !query 12 | ||
| select udf(a), udf(b) from values ("one", array(0, 1)), ("two", struct(1, 2)) as data(a, b) | ||
| -- !query 12 schema | ||
| struct<> | ||
| -- !query 12 output | ||
| org.apache.spark.sql.AnalysisException | ||
| incompatible types found in column b for inline table; line 1 pos 27 | ||
| | ||
| | ||
| -- !query 13 | ||
| select udf(a), udf(b) from values ("one"), ("two") as data(a, b) | ||
| -- !query 13 schema | ||
| struct<> | ||
| -- !query 13 output | ||
| org.apache.spark.sql.AnalysisException | ||
| expected 2 columns but found 1 columns in row 0; line 1 pos 27 | ||
| | ||
| | ||
| -- !query 14 | ||
| select udf(a), udf(b) from values ("one", random_not_exist_func(1)), ("two", 2) as data(a, b) | ||
| -- !query 14 schema | ||
| struct<> | ||
| -- !query 14 output | ||
| org.apache.spark.sql.AnalysisException | ||
| Undefined function: 'random_not_exist_func'. This function is neither a registered temporary function nor a permanent function registered in the database 'default'.; line 1 pos 42 | ||
| | ||
| | ||
| -- !query 15 | ||
| select udf(a), udf(b) from values ("one", count(1)), ("two", 2) as data(a, b) | ||
| -- !query 15 schema | ||
| struct<> | ||
| -- !query 15 output | ||
| org.apache.spark.sql.AnalysisException | ||
| cannot evaluate expression count(1) in inline table definition; line 1 pos 42 | ||
| | ||
| | ||
| -- !query 16 | ||
| select udf(a), b from values (timestamp('1991-12-06 00:00:00.0'), array(timestamp('1991-12-06 01:00:00.0'), timestamp('1991-12-06 12:00:00.0'))) as data(a, b) | ||
| -- !query 16 schema | ||
| struct<CAST(udf(cast(a as string)) AS TIMESTAMP):timestamp,b:array<timestamp>> | ||
| -- !query 16 output | ||
| 1991-12-06 00:00:00 [1991-12-06 01:00:00.0,1991-12-06 12:00:00.0] | ||
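
Judging from the schema lines above, every `udf(x)` call appears in the generated column name as a cast of `x` to string, the `udf` call itself, and a cast back to the original type. The Python sketch below rebuilds those column names so the nested cases (queries 1 and 7) are easier to read. It is a string-formatting illustration inferred from the expected output only; `wrap_udf` is a hypothetical helper, and the upper/lower-case difference between the outermost and inner casts is copied from the golden file rather than explained.

```python
def wrap_udf(expr: str, sql_type: str, outermost: bool = True) -> str:
    """Build the column name shown in the golden file for udf(expr).

    Hypothetical helper: the outermost cast is printed in upper case,
    nested casts in lower case, matching the expected schemas in this PR.
    """
    call = f"udf(cast({expr} as string))"
    if outermost:
        return f"CAST({call} AS {sql_type.upper()})"
    return f"cast({call} as {sql_type.lower()})"


# udf(col1) on a string column (query 0)
print(wrap_udf("col1", "string"))

# udf(udf(col2)) on an int column (query 1): the inner call is wrapped first
inner = wrap_udf("col2", "int", outermost=False)
print(wrap_udf(inner, "int"))
```

Columns selected without `udf`, such as `b` in query 2, keep their plain name and type (`b:int`) in the schema.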
@imback82, can we add a TODO saying that we should add UDFs in the VALUES clause once SPARK-28291 is resolved? It's a bit sad, because I actually intended to test such cases here.
e.g.: