Commit 55bfff4
[SPARK-40819][SQL][3.2] Timestamp nanos behaviour regression
Backport of the fix to 3.2, as requested by HyukjinKwon on apache#38312.
### What changes were proposed in this pull request?
Handle `TimeUnit.NANOS` for Parquet timestamps, addressing a regression in behaviour since 3.2.
### Why are the changes needed?
Since version 3.2, reading Parquet files that contain attributes of type `TIMESTAMP(NANOS,true)` is not possible, because ParquetSchemaConverter fails with:
```
Caused by: org.apache.spark.sql.AnalysisException: Illegal Parquet type: INT64 (TIMESTAMP(NANOS,true))
```
https://issues.apache.org/jira/browse/SPARK-34661 introduced a change that matches on the `LogicalTypeAnnotation`. That match only covers the timestamp cases `TimeUnit.MILLIS` and `TimeUnit.MICROS`, so `TimeUnit.NANOS` falls through to `illegalType()`.
Prior to 3.2, the matching used the `originalType`, which is `null` for `TIMESTAMP(NANOS,true)` and therefore resulted in a `LongType`. The proposed change is to handle `TimeUnit.NANOS` and return `LongType`, making the behaviour the same as before.
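The difference between the two behaviours can be illustrated with a minimal, self-contained sketch. This is not Spark's actual `ParquetSchemaConverter` code: the class `TimestampMappingSketch`, the enums `TimeUnit` and `SparkType`, and both method names are hypothetical stand-ins, and `IllegalArgumentException` stands in for the `AnalysisException` raised via `illegalType()`.

```java
/**
 * Simplified model of the INT64 timestamp branch described above.
 * All names here are illustrative stand-ins, not Spark or Parquet APIs.
 */
public class TimestampMappingSketch {

    /** Stand-in for the time unit carried by Parquet's timestamp annotation. */
    public enum TimeUnit { MILLIS, MICROS, NANOS }

    /** Stand-in for the Spark SQL types involved. */
    public enum SparkType { TIMESTAMP_TYPE, LONG_TYPE }

    /** Models 3.2 before the fix: only MILLIS and MICROS are matched. */
    public static SparkType convertBeforeFix(TimeUnit unit) {
        switch (unit) {
            case MILLIS:
            case MICROS:
                return SparkType.TIMESTAMP_TYPE;
            default:
                // Models illegalType(): NANOS ends up here.
                throw new IllegalArgumentException(
                    "Illegal Parquet type: INT64 (TIMESTAMP(" + unit + "))");
        }
    }

    /** Models the fix: NANOS is matched explicitly and read as a long. */
    public static SparkType convertAfterFix(TimeUnit unit) {
        switch (unit) {
            case MILLIS:
            case MICROS:
                return SparkType.TIMESTAMP_TYPE;
            case NANOS:
                return SparkType.LONG_TYPE;
            default:
                throw new IllegalArgumentException(
                    "Illegal Parquet type: INT64 (TIMESTAMP(" + unit + "))");
        }
    }

    public static void main(String[] args) {
        System.out.println(convertAfterFix(TimeUnit.NANOS)); // prints LONG_TYPE
    }
}
```

In this model, `convertBeforeFix(TimeUnit.NANOS)` throws, mirroring the reported error, while `convertAfterFix(TimeUnit.NANOS)` yields the long-typed result that the pre-3.2 `originalType` path produced.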
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Added unit test covering this scenario.
Also deployed internally to read Parquet files that contain `TIMESTAMP(NANOS,true)`.
Closes apache#39905 from awdavidson/ts-nanos-fix-3.2.
Authored-by: alfreddavidson <alfie.davidson9@gmail.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
1 parent ed12efa commit 55bfff4
File tree
7 files changed, +82 -11 lines changed
- sql
  - catalyst/src/main/scala/org/apache/spark/sql/internal
  - core/src
    - main
      - java/org/apache/spark/sql/execution/datasources/parquet
      - scala/org/apache/spark/sql/execution/datasources
        - parquet
        - v2/parquet
    - test
      - resources/test-data
      - scala/org/apache/spark/sql/execution/datasources/parquet
Lines changed: 9 additions & 0 deletions
Lines changed: 1 addition & 0 deletions
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
Lines changed: 12 additions & 2 deletions
Lines changed: 12 additions & 3 deletions
Lines changed: 4 additions & 0 deletions
Binary file not shown.
Lines changed: 44 additions & 6 deletions