
feat: read hybrid format #208

Merged: 7 commits, Aug 25, 2022

@jiacai2050 commented Aug 19, 2022

Which issue does this PR close?

part of #77

Rationale for this change

#185 implemented the write path for the hybrid format; this PR implements the read path.

What changes are included in this PR?

  • Add ParquetDecoder, which converts record batches according to the StorageFormat
    • In HybridRecordDecoder, stretch each non-collapsible column to match the length of the collapsible columns
    • Extract the nested values from each ListArray and use them in place of the list, so upper layers see the records as ordinary columnar records
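
The two decoding steps above can be sketched without Arrow. This is a minimal illustration, not the code in this PR: `ListColumn`, `stretch`, and `decode_hybrid` are hypothetical names, and plain `Vec`s stand in for Arrow arrays. It assumes a collapsible column is stored as a list per row (offsets plus flat values, the way a ListArray is laid out) and that a non-collapsible column holds one value per row.

```rust
/// Hypothetical stand-in for an Arrow ListArray: row `i` owns
/// `values[offsets[i]..offsets[i + 1]]`.
struct ListColumn {
    offsets: Vec<usize>,
    values: Vec<i64>,
}

impl ListColumn {
    /// Number of nested values held by each row.
    fn lengths(&self) -> Vec<usize> {
        self.offsets.windows(2).map(|w| w[1] - w[0]).collect()
    }

    /// The nested values of all rows, as one flat slice.
    fn flat_values(&self) -> &[i64] {
        match (self.offsets.first(), self.offsets.last()) {
            (Some(&start), Some(&end)) => &self.values[start..end],
            _ => &[],
        }
    }
}

/// Stretch a non-collapsible column: repeat the value of row `i`
/// `lengths[i]` times so it lines up with the flattened list column.
fn stretch<T: Clone>(column: &[T], lengths: &[usize]) -> Vec<T> {
    assert_eq!(column.len(), lengths.len(), "one length per row");
    let mut out = Vec::with_capacity(lengths.iter().sum());
    for (value, &n) in column.iter().zip(lengths) {
        out.extend(std::iter::repeat(value.clone()).take(n));
    }
    out
}

/// Turn one hybrid batch (one key column, one list column) into two
/// plain columns of equal length.
fn decode_hybrid(keys: &[String], collapsed: &ListColumn) -> (Vec<String>, Vec<i64>) {
    let lengths = collapsed.lengths();
    (stretch(keys, &lengths), collapsed.flat_values().to_vec())
}
```

For example, keys `["host-a", "host-b"]` with offsets `[0, 2, 5]` and values `[1, 2, 3, 4, 5]` decode to keys `["host-a", "host-a", "host-b", "host-b", "host-b"]` alongside the five flat values, so every output row pairs a key with exactly one value.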

Are there any user-facing changes?

No

How is this change tested?

Three unit tests are added in encoding.rs.

@waynexia added the A-analytic-engine (Area: Analytic Engine) label Aug 22, 2022
@jiacai2050 mentioned this pull request Aug 25, 2022
@jiacai2050 marked this pull request as ready for review August 25, 2022 10:45
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
@jiacai2050 merged commit 7d3d658 into apache:main Aug 25, 2022
@jiacai2050 added this to the Release v0.3 milestone Aug 26, 2022
chunshao90 pushed a commit to chunshao90/ceresdb that referenced this pull request May 15, 2023
* make arrow schema meta public

* convert fixed-length list col to columnar column

* support variable length decode

* add testcases

* add some comments

* Update analytic_engine/src/sst/parquet/encoding.rs

Co-authored-by: Ruihang Xia <waynestxia@gmail.com>

* fix comments

Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
Labels
A-analytic-engine Area: Analytic Engine
2 participants