Commit a988aaf
[SPARK-29454][SQL] Reduce unsafeProjection times when read Parquet file use non-vectorized mode
### What changes were proposed in this pull request?
There will be 2 times unsafeProjection convert operation When we read a Parquet data file use non-vectorized mode:
1. `ParquetGroupConverter` call unsafeProjection function to covert `SpecificInternalRow` to `UnsafeRow` every times when read Parquet data file use `ParquetRecordReader`.
2. `ParquetFileFormat` will call unsafeProjection function to covert this `UnsafeRow` to another `UnsafeRow` again when partitionSchema is not empty in DataSourceV1 branch, and `PartitionReaderWithPartitionValues` will always do this convert operation in DataSourceV2 branch.
In this pr, remove `unsafeProjection` convert operation in `ParquetGroupConverter` and change `ParquetRecordReader` to produce `SpecificInternalRow` instead of `UnsafeRow`.
### Why are the changes needed?
The first time convert in `ParquetGroupConverter` is redundant and `ParquetRecordReader` return a `InternalRow(SpecificInternalRow)` is enough.
### Does this PR introduce any user-facing change?
No.
### How was this patch tested?
Unit Test
Closes #26106 from LuciferYang/spark-parquet-unsafe-projection.
Authored-by: yangjie01 <yangjie01@baidu.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>1 parent 857f109 commit a988aaf
File tree
5 files changed
+23
-30
lines changed- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources
- parquet
- v2/parquet
5 files changed
+23
-30
lines changedsql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
Lines changed: 8 additions & 12 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
328 | 328 | | |
329 | 329 | | |
330 | 330 | | |
331 | | - | |
| 331 | + | |
332 | 332 | | |
333 | 333 | | |
334 | 334 | | |
335 | | - | |
| 335 | + | |
336 | 336 | | |
337 | | - | |
| 337 | + | |
338 | 338 | | |
339 | | - | |
| 339 | + | |
340 | 340 | | |
341 | 341 | | |
342 | 342 | | |
343 | 343 | | |
344 | 344 | | |
345 | | - | |
346 | | - | |
| 345 | + | |
347 | 346 | | |
348 | | - | |
349 | | - | |
350 | | - | |
351 | 347 | | |
352 | 348 | | |
353 | | - | |
| 349 | + | |
354 | 350 | | |
355 | | - | |
356 | | - | |
| 351 | + | |
| 352 | + | |
357 | 353 | | |
358 | 354 | | |
359 | 355 | | |
| |||
Lines changed: 5 additions & 5 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
29 | 29 | | |
30 | 30 | | |
31 | 31 | | |
32 | | - | |
| 32 | + | |
33 | 33 | | |
34 | 34 | | |
35 | 35 | | |
36 | 36 | | |
37 | 37 | | |
38 | | - | |
| 38 | + | |
39 | 39 | | |
40 | 40 | | |
41 | 41 | | |
| |||
51 | 51 | | |
52 | 52 | | |
53 | 53 | | |
54 | | - | |
| 54 | + | |
55 | 55 | | |
56 | 56 | | |
57 | 57 | | |
| |||
114 | 114 | | |
115 | 115 | | |
116 | 116 | | |
117 | | - | |
| 117 | + | |
118 | 118 | | |
119 | 119 | | |
120 | 120 | | |
121 | 121 | | |
122 | 122 | | |
123 | | - | |
| 123 | + | |
124 | 124 | | |
125 | 125 | | |
126 | 126 | | |
| |||
Lines changed: 3 additions & 3 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
22 | 22 | | |
23 | 23 | | |
24 | 24 | | |
25 | | - | |
| 25 | + | |
26 | 26 | | |
27 | 27 | | |
28 | 28 | | |
| |||
37 | 37 | | |
38 | 38 | | |
39 | 39 | | |
40 | | - | |
| 40 | + | |
41 | 41 | | |
42 | 42 | | |
43 | 43 | | |
44 | 44 | | |
45 | | - | |
| 45 | + | |
46 | 46 | | |
47 | 47 | | |
48 | 48 | | |
Lines changed: 2 additions & 4 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
173 | 173 | | |
174 | 174 | | |
175 | 175 | | |
176 | | - | |
177 | | - | |
178 | 176 | | |
179 | | - | |
| 177 | + | |
180 | 178 | | |
181 | | - | |
| 179 | + | |
182 | 180 | | |
183 | 181 | | |
184 | 182 | | |
| |||
Lines changed: 5 additions & 6 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
31 | 31 | | |
32 | 32 | | |
33 | 33 | | |
34 | | - | |
35 | 34 | | |
36 | 35 | | |
37 | 36 | | |
| |||
176 | 175 | | |
177 | 176 | | |
178 | 177 | | |
179 | | - | |
| 178 | + | |
180 | 179 | | |
181 | 180 | | |
182 | 181 | | |
| |||
185 | 184 | | |
186 | 185 | | |
187 | 186 | | |
188 | | - | |
| 187 | + | |
189 | 188 | | |
190 | 189 | | |
191 | | - | |
| 190 | + | |
192 | 191 | | |
193 | 192 | | |
194 | 193 | | |
195 | | - | |
| 194 | + | |
196 | 195 | | |
197 | | - | |
| 196 | + | |
198 | 197 | | |
199 | 198 | | |
200 | 199 | | |
| |||
0 commit comments