Commit 3a5207e

PARQUET-372: Add comment to explain not truncating values.
1 parent 2a47d2b commit 3a5207e

1 file changed: +4 -0 lines changed

parquet-hadoop/src/main/java/org/apache/parquet/format/converter/ParquetMetadataConverter.java

Lines changed: 4 additions & 0 deletions
@@ -237,6 +237,10 @@ public Encoding getEncoding(org.apache.parquet.column.Encoding encoding) {
   public static Statistics toParquetStatistics(
       org.apache.parquet.column.statistics.Statistics statistics) {
     Statistics stats = new Statistics();
+    // Don't write stats larger than the max size rather than truncating. The
+    // rationale is that some engines may use the minimum value in the page as
+    // the true minimum for aggregations and there is no way to mark that a
+    // value has been truncated and is a lower bound and not in the page.
     if (!statistics.isEmpty() && statistics.isSmallerThan(MAX_STATS_SIZE)) {
       stats.setNull_count(statistics.getNumNulls());
       if (statistics.hasNonNullValue()) {
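
The rationale in the added comment can be illustrated with a small standalone sketch. This is not the Parquet implementation: the class `StatsTruncationSketch`, its methods, and the limit of 8 are hypothetical, and the size check is simplified to the length of a single ASCII string rather than whatever `isSmallerThan(MAX_STATS_SIZE)` actually measures. It only shows why a truncated minimum is a valid lower bound yet unsafe to treat as a value present in the page.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class StatsTruncationSketch {
    // Hypothetical limit for illustration only; the real MAX_STATS_SIZE
    // value does not appear in the diff.
    static final int MAX_STATS_SIZE = 8;

    /** Lexicographically smallest value actually stored in the page. */
    static String trueMin(List<String> page) {
        return page.stream().min(String::compareTo).get();
    }

    /** Rejected approach: cut the minimum down so it fits the limit. */
    static String truncatedMin(List<String> page) {
        String min = trueMin(page);
        return min.length() <= MAX_STATS_SIZE
                ? min
                : min.substring(0, MAX_STATS_SIZE);
    }

    /** Approach described by the comment: omit the statistic if too large. */
    static Optional<String> minOrOmitted(List<String> page) {
        String min = trueMin(page);
        return min.length() <= MAX_STATS_SIZE
                ? Optional.of(min)
                : Optional.empty();
    }

    public static void main(String[] args) {
        List<String> page = Arrays.asList("pineapple-juice", "pineapple-tart");

        String truncated = truncatedMin(page);
        // The prefix still sorts at or below every value, so it is a lower bound...
        System.out.println("truncated min: " + truncated);
        // ...but an engine answering MIN(col) from statistics would return
        // a value that no row contains.
        System.out.println("in page? " + page.contains(truncated));
        // Omitting the statistic forces readers to scan the data instead.
        System.out.println("omitted: " + !minOrOmitted(page).isPresent());
    }
}
```

Running `main` prints `truncated min: pineappl`, `in page? false`, and `omitted: true`. With no flag to mark a statistic as truncated, a reader cannot tell the first case apart from an exact minimum, which is the reason the comment gives for skipping oversized statistics entirely.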
