Open
Description
After PARQUET-160 was resolved, ColumnChunkPageWriter started using ConcatenatingByteArrayCollector. There, all data is collected in a List of byte[] before the page is written, so there is no way to use direct memory for allocating buffers. A ByteBufferAllocator is present in the ColumnChunkPageWriter class, but it is never used.
Using Java heap space for these buffers can, in some cases, cause OOM exceptions or GC overhead.
ByteBufferAllocator should be used in the ConcatenatingByteArrayCollector or OutputStream classes.
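To illustrate the idea, here is a minimal sketch of a collector that takes its buffers from an allocator instead of holding byte[] instances on the heap. It is not the Parquet implementation: the class BufferCollectorSketch, its nested Allocator interface, and the method names are all hypothetical stand-ins, and only java.nio from the JDK is used.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch only; these are not Parquet's actual classes. */
final class BufferCollectorSketch implements AutoCloseable {

    /** Hypothetical stand-in for an allocator abstraction. */
    interface Allocator {
        ByteBuffer allocate(int size);
        void release(ByteBuffer buffer);
    }

    /** Allocates off-heap buffers. Release is a no-op here: the JDK frees
     *  direct memory once the buffer object itself is garbage collected. */
    static final Allocator DIRECT = new Allocator() {
        @Override
        public ByteBuffer allocate(int size) {
            return ByteBuffer.allocateDirect(size);
        }

        @Override
        public void release(ByteBuffer buffer) {
            // A pooling allocator would return the buffer to its pool here.
        }
    };

    private final Allocator allocator;
    private final List<ByteBuffer> slabs = new ArrayList<>();
    private long size;

    BufferCollectorSketch(Allocator allocator) {
        this.allocator = allocator;
    }

    /** Copies the bytes into an allocator-provided buffer, so the
     *  caller's byte[] does not need to stay reachable. */
    void collect(byte[] bytes) {
        ByteBuffer slab = allocator.allocate(bytes.length);
        slab.put(bytes);
        slab.flip(); // switch the buffer from writing to reading
        slabs.add(slab);
        size += bytes.length;
    }

    long size() {
        return size;
    }

    /** Writes all collected buffers to the channel in insertion order. */
    void writeAllTo(WritableByteChannel out) throws IOException {
        for (ByteBuffer slab : slabs) {
            ByteBuffer view = slab.duplicate(); // keep the slab's position intact
            while (view.hasRemaining()) {
                out.write(view);
            }
        }
    }

    /** Hands every buffer back to the allocator and resets the collector. */
    @Override
    public void close() {
        for (ByteBuffer slab : slabs) {
            allocator.release(slab);
        }
        slabs.clear();
        size = 0;
    }
}
```

With an allocator like DIRECT, the collected page bytes live outside the Java heap, so they do not count against the heap limit and the garbage collector does not have to trace or move them. A heap-backed allocator (for example one built on ByteBuffer.allocate) could be plugged into the same interface where direct memory is not wanted.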
Reporter: Vitalii Diravka / @vdiravka
Assignee: Vitalii Diravka / @vdiravka
Related issues:
- Error: SYSTEM ERROR: RuntimeException: Unknown logical type <LogicalType UUID:UUIDType()> (Is contained by)
- Replace ParquetColumnChunkPageWriter with original Parquet class (Is contained by)
- Out of heap running CTAS against text delimited (relates to)
- Support configurable for DirectByteBufferAllocator from Hadoop Configuration (is related to)
- Improvements in ByteBuffer read path (is related to)
- Simplify CapacityByteArrayOutputStream (is related to)
Note: This issue was originally created as PARQUET-1006. Please see the migration documentation for further details.