Commit 1e5766c
[SPARK-29462][SQL] The data type of "array()" should be array<null>
### What changes were proposed in this pull request?
This brings #26324 back. It was reverted mainly because of Hive compatibility concerns and because the behavior of other DBMSes and the ANSI standard had not been investigated.
- PostgreSQL seems to coerce a NULL literal to the TEXT type.
- Presto seems to coerce `array() + array(1)` to an array of int.
- Hive seems to coerce `array() + array(1)` to an array of strings.
So these systems made different design choices, each for its own reasons. If we have to pick between the Presto and Hive behaviors, coercing to an array of int seems to make much more sense.
Another investigation was made offline internally. ANSI SQL 2011, section 6.5 "<contextually typed value specification>", seems to state:
> If ES is specified, then let ET be the element type determined by the context in which ES appears. The declared type DT of ES is Case:
>
> a) If ES simply contains ARRAY, then ET ARRAY[0].
>
> b) If ES simply contains MULTISET, then ET MULTISET.
>
> ES is effectively replaced by CAST ( ES AS DT )
From reading the related context, this seems to correspond to using `NullType` as the element type. Given this investigation, choosing `null` seems correct, and we now have Presto as a reference. Therefore, this PR proposes to bring the change back.
### Why are the changes needed?
When an empty array is created, it should be declared as `array<null>`.
### Does this PR introduce any user-facing change?
Yes, `array()` now creates `array<null>`. As a result, `array(1) + array()` correctly produces `array(1)` instead of `array("1")`.
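
The effect of the empty array's element type on `array(1) + array()` can be illustrated with a small standalone model. This is only a sketch in plain Python, not Spark code: the type names, `TypedArray`, `make_array`, `wider_type`, and `concat` are all hypothetical helpers invented for illustration, and the widening rule is a simplified assumption (null yields to any type, string absorbs int).

```python
from dataclasses import dataclass
from typing import Any

# Toy type names; illustrative strings only, not Spark's DataType classes.
NULL, INT, STRING = "null", "int", "string"


@dataclass(frozen=True)
class TypedArray:
    element_type: str
    values: tuple


def make_array(*values: Any, empty_default: str = NULL) -> TypedArray:
    """Build an array; an empty array takes `empty_default` as its element type."""
    if not values:
        return TypedArray(empty_default, ())
    # Assumption: non-empty arrays in this sketch only hold int literals.
    return TypedArray(INT, tuple(values))


def wider_type(a: str, b: str) -> str:
    """Pick a common element type: null yields to anything, string absorbs int."""
    if a == b:
        return a
    if a == NULL:
        return b
    if b == NULL:
        return a
    if STRING in (a, b):
        return STRING
    raise TypeError(f"no common type for {a} and {b}")


def concat(left: TypedArray, right: TypedArray) -> TypedArray:
    """Concatenate two arrays after casting elements to the common type."""
    target = wider_type(left.element_type, right.element_type)
    cast = str if target == STRING else (lambda v: v)
    return TypedArray(target, tuple(cast(v) for v in left.values + right.values))


# Empty array typed as array<null>: the int element type is preserved.
new_behavior = concat(make_array(1), make_array())
# Empty array typed as array<string>: the int is cast to a string.
old_behavior = concat(make_array(1), make_array(empty_default=STRING))
```

In this model, `new_behavior` keeps the element type `int` with the value `1`, while `old_behavior` ends up with element type `string` and the value `"1"`, which mirrors the difference described above.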
### How was this patch tested?
Tested manually
Closes #27521 from HyukjinKwon/SPARK-29462.
Lead-authored-by: HyukjinKwon <gurwls223@apache.org>
Co-authored-by: Aman Omer <amanomer1996@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(cherry picked from commit 0045be7)
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
1 parent 8efe367 commit 1e5766c
File tree: 4 files changed, +34 -5 lines
- docs
- sql
  - catalyst/src/main/scala/org/apache/spark/sql
    - catalyst/expressions
    - internal
  - core/src/test/scala/org/apache/spark/sql

Per-file line changes, in page order (diff bodies not shown):
- 2 additions & 0 deletions
- 10 additions & 1 deletion
- 9 additions & 0 deletions
- 13 additions & 4 deletions