37 changes: 37 additions & 0 deletions docs/sql-data-sources-orc.md
@@ -153,6 +153,24 @@ When reading from Hive metastore ORC tables and inserting to Hive metastore ORC
</td>
<td>2.3.0</td>
</tr>
<tr>
<td><code>spark.sql.orc.columnarReaderBatchSize</code></td>
<td><code>4096</code></td>
<td>
    The number of rows to include in an ORC vectorized reader batch. The number should
    be chosen carefully to minimize overhead and avoid OOMs when reading data.
</td>
<td>2.4.0</td>
</tr>
<tr>
<td><code>spark.sql.orc.columnarWriterBatchSize</code></td>
<td><code>1024</code></td>
<td>
    The number of rows to include in an ORC vectorized writer batch. The number should
    be chosen carefully to minimize overhead and avoid OOMs when writing data.
</td>
<td>3.4.0</td>
</tr>
<tr>
<td><code>spark.sql.orc.enableNestedColumnVectorizedReader</code></td>
<td><code>false</code></td>
@@ -163,6 +181,25 @@ When reading from Hive metastore ORC tables and inserting to Hive metastore ORC
</td>
<td>3.2.0</td>
</tr>
<tr>
<td><code>spark.sql.orc.filterPushdown</code></td>
<td><code>true</code></td>
<td>
    When true, enables filter pushdown for ORC files.
</td>
<td>1.4.0</td>
</tr>
<tr>
<td><code>spark.sql.orc.aggregatePushdown</code></td>
<td><code>false</code></td>
<td>
    If true, aggregates are pushed down to ORC for optimization. MIN, MAX, and COUNT are
    supported as aggregate expressions. For MIN/MAX, boolean, integer, float, and date
    types are supported; for COUNT, all data types are supported. If statistics are
    missing from any ORC file footer, an exception is thrown.
</td>
<td>3.3.0</td>
</tr>
<tr>
<td><code>spark.sql.orc.mergeSchema</code></td>
    <td><code>false</code></td>
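
For reference, a minimal sketch of how the options added in this diff can be set at runtime via `spark.conf.set`. The session name, the input path `/tmp/people.orc`, and the `age` column are illustrative assumptions, not part of the documented configuration.

```python
from pyspark.sql import SparkSession

# Assumed app name and data layout are for illustration only.
spark = SparkSession.builder.appName("orc-config-example").getOrCreate()

# Enable predicate pushdown into ORC files (spark.sql.orc.filterPushdown, default true).
spark.conf.set("spark.sql.orc.filterPushdown", "true")

# Allow MIN/MAX/COUNT aggregates to be answered from ORC file-footer statistics
# (spark.sql.orc.aggregatePushdown, default false).
spark.conf.set("spark.sql.orc.aggregatePushdown", "true")

# Tune vectorized batch sizes to balance per-batch overhead against memory use.
spark.conf.set("spark.sql.orc.columnarReaderBatchSize", "4096")
spark.conf.set("spark.sql.orc.columnarWriterBatchSize", "1024")

df = spark.read.orc("/tmp/people.orc")  # hypothetical ORC dataset
df.filter(df["age"] > 21).agg({"age": "max"}).show()
```

Note that with `spark.sql.orc.aggregatePushdown` enabled, the query fails if any ORC file footer lacks the required statistics, as described in the table above.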