forked from cockroachdb/cockroach
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
changefeedccl: implement parquet writer library and tests
This change implements a `ParquetWriter` struct in the changefeedccl package with the following public APIs:

```
NewCDCParquetWriterFromRow(row cdcevent.Row, sink io.Writer) (*ParquetWriter, error)
(w *ParquetWriter) AddData(updatedRow cdcevent.Row, prevRow cdcevent.Row) error
(w *ParquetWriter) Close() error
(w *ParquetWriter) CurrentSize() int64
```

This parquet writer takes rows in the form of `cdcevent.Row` and writes them to the `io.Writer` sink using parquet version v2.6. The writer implements several features internally required to write in the parquet format:
- schema creation
- row group / column page management
- encoding/decoding of CRDB datums to parquet datums

Currently, the writer only supports types found in the TPCC workload, namely INT, DECIMAL, STRING, UUID, TIMESTAMP and BOOL.

This change also adds tests for the `ParquetWriter`. These tests write datums from CRDB tables to parquet files and read back these datums using an internal parquet reader. The tests verify that the parquet writer is correct by asserting that the datums match.

Informs: cockroachdb#99028

Epic: None

Release note: None
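The call sequence implied by the four public APIs (construct from a first row, add rows, check the size, close) can be sketched as follows. This is a toy illustration only: `Row`, `rowWriter`, `toyWriter`, `newToyWriterFromRow`, and `writeAll` are hypothetical stand-ins invented for this sketch, the output is plain text rather than parquet, and nothing here reflects how the actual `ParquetWriter` is implemented.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
)

// Row is a hypothetical stand-in for cdcevent.Row (assumption, not the real type).
type Row struct {
	Columns []string
	Values  []any
}

// rowWriter mirrors the method set listed in the commit message.
type rowWriter interface {
	AddData(updatedRow Row, prevRow Row) error
	Close() error
	CurrentSize() int64
}

// toyWriter writes one text line per row; a real writer would emit parquet pages.
type toyWriter struct {
	sink    io.Writer
	numCols int
	size    int64
	closed  bool
}

var _ rowWriter = (*toyWriter)(nil)

// newToyWriterFromRow fixes the column count from the first row,
// loosely analogous to deriving a schema from a row.
func newToyWriterFromRow(row Row, sink io.Writer) (*toyWriter, error) {
	if len(row.Columns) == 0 {
		return nil, errors.New("row has no columns")
	}
	return &toyWriter{sink: sink, numCols: len(row.Columns)}, nil
}

// AddData writes updatedRow to the sink. prevRow is ignored in this toy.
func (w *toyWriter) AddData(updatedRow Row, prevRow Row) error {
	if w.closed {
		return errors.New("writer is closed")
	}
	if len(updatedRow.Values) != w.numCols {
		return fmt.Errorf("expected %d values, got %d", w.numCols, len(updatedRow.Values))
	}
	n, err := fmt.Fprintln(w.sink, updatedRow.Values...)
	w.size += int64(n)
	return err
}

// Close marks the writer as finished; later AddData calls fail.
func (w *toyWriter) Close() error {
	w.closed = true
	return nil
}

// CurrentSize reports the number of bytes written so far.
func (w *toyWriter) CurrentSize() int64 { return w.size }

// writeAll runs the full sequence against an in-memory sink and returns
// the written bytes as a string together with the reported size.
func writeAll(rows []Row) (string, int64, error) {
	if len(rows) == 0 {
		return "", 0, errors.New("no rows")
	}
	var buf bytes.Buffer
	w, err := newToyWriterFromRow(rows[0], &buf)
	if err != nil {
		return "", 0, err
	}
	for _, r := range rows {
		if err := w.AddData(r, Row{}); err != nil {
			return "", 0, err
		}
	}
	size := w.CurrentSize()
	if err := w.Close(); err != nil {
		return "", 0, err
	}
	return buf.String(), size, nil
}
```

Writing to a `bytes.Buffer` and comparing what comes back is the same shape as the round-trip tests described above, minus the parquet encoding and the reader.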
1 parent 6d51df7, commit cffa798
Showing 5 changed files with 807 additions and 0 deletions.