
Support parquet columnar format in the aws_s3 sink #1374

@binarylogic

Description

Similar to #1373, we should support the Parquet format. Parquet is a columnar format that enables faster and more efficient data access, such as selecting only the columns a query needs and using indexes to skip irrelevant data.
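To make the "columnar" part concrete, here is a minimal sketch of the row-to-column transposition a columnar writer has to perform before encoding. It uses only the standard library and is not Parquet itself. The `Event` alias and the `to_columns` function are hypothetical names for illustration, not Vector or Parquet APIs, and treating every value as a string is a simplifying assumption.

```rust
use std::collections::BTreeMap;

/// Hypothetical row-oriented event: field name -> value (strings only, for simplicity).
type Event = BTreeMap<String, String>;

/// Transpose rows into columns. A field missing from a row becomes `None`,
/// so every column ends up with exactly one slot per row.
fn to_columns(rows: &[Event]) -> BTreeMap<String, Vec<Option<String>>> {
    let mut columns: BTreeMap<String, Vec<Option<String>>> = BTreeMap::new();
    for (i, row) in rows.iter().enumerate() {
        for (key, value) in row {
            // A column first seen at row `i` is back-filled with `i` nulls.
            let column = columns
                .entry(key.clone())
                .or_insert_with(|| vec![None; i]);
            column.push(Some(value.clone()));
        }
        // Pad every column this row did not populate.
        for column in columns.values_mut() {
            if column.len() < i + 1 {
                column.push(None);
            }
        }
    }
    columns
}
```

Once the data is laid out this way, a reader that only needs one field (say `host`) can read that single vector and skip the rest, which is the column selection benefit described above. A real implementation would also need typed columns and a schema, which this sketch leaves out.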

Implementation

Unfortunately, I do not have as much experience with this format as I do with ORC, but like everything else, we should start very simple. Fortunately, there appears to be a Rust library that supports basic writing of Parquet data.

Metadata

    Labels

    domain: codecs - Anything related to Vector's codecs (encoding/decoding)
    needs: approval - Needs review & approval before work can begin.
    needs: requirements - Needs a list of requirements before work can begin.
    sink: aws_s3 - Anything `aws_s3` sink related
    type: enhancement - A value-adding code change that enhances its existing functionality.
