This repository has been archived by the owner on Feb 18, 2024. It is now read-only.

implemented futures::Sink for parquet async writer #877

Merged
merged 30 commits into from
Mar 3, 2022

Conversation

sydduckworth
Contributor

Parquet component of this PR: #876.

Adds io::parquet::write::FileSink, which implements futures::Sink.

Removes io::parquet::write::FileStreamer.
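
For readers unfamiliar with the sink contract, here is a minimal std-only sketch of the protocol a `futures::Sink` implementor follows: `poll_ready`, then `start_send`, repeated per item, then `poll_close`. The local `Sink` trait only mirrors the four method signatures of `futures::Sink` so the example needs no dependencies; `ChunkSink`, `NoopWake`, and `write_all` are hypothetical names for illustration and are not the `FileSink` API from this PR.

```rust
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Local mirror of the method shape of `futures::Sink` (assumption: used
/// here only so the sketch compiles without the `futures` crate).
trait Sink<Item> {
    type Error;
    fn poll_ready(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>>;
    fn start_send(self: Pin<&mut Self>, item: Item) -> Result<(), Self::Error>;
    fn poll_flush(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>>;
    fn poll_close(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>>;
}

/// Hypothetical toy sink: buffers one chunk at a time and "writes" it
/// into an in-memory Vec when flushed.
struct ChunkSink {
    pending: Option<Vec<i32>>,
    written: Vec<Vec<i32>>,
    closed: bool,
}

impl ChunkSink {
    fn new() -> Self {
        Self { pending: None, written: Vec::new(), closed: false }
    }
}

impl Sink<Vec<i32>> for ChunkSink {
    type Error = String;

    // The sink is ready for a new item once the buffered one is flushed.
    fn poll_ready(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Result<(), String>> {
        self.poll_flush(cx)
    }

    fn start_send(self: Pin<&mut Self>, item: Vec<i32>) -> Result<(), String> {
        let this = self.get_mut();
        if this.closed {
            return Err("sink is closed".to_string());
        }
        if this.pending.is_some() {
            return Err("start_send called before poll_ready".to_string());
        }
        this.pending = Some(item);
        Ok(())
    }

    fn poll_flush(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Result<(), String>> {
        let this = self.get_mut();
        if let Some(chunk) = this.pending.take() {
            this.written.push(chunk);
        }
        Poll::Ready(Ok(()))
    }

    // Closing flushes whatever is buffered, then rejects further items.
    fn poll_close(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Result<(), String>> {
        let this = self.get_mut();
        if let Some(chunk) = this.pending.take() {
            this.written.push(chunk);
        }
        this.closed = true;
        Poll::Ready(Ok(()))
    }
}

/// Waker that does nothing; enough to poll a sink that never returns Pending.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Drives the protocol by hand: poll_ready, start_send for each chunk, then poll_close.
fn write_all<S>(sink: &mut S, chunks: Vec<Vec<i32>>) -> Result<(), String>
where
    S: Sink<Vec<i32>, Error = String> + Unpin,
{
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut sink = Pin::new(sink);
    for chunk in chunks {
        match sink.as_mut().poll_ready(&mut cx) {
            Poll::Ready(result) => result?,
            Poll::Pending => return Err("toy sink is not expected to pend".to_string()),
        };
        sink.as_mut().start_send(chunk)?;
    }
    match sink.as_mut().poll_close(&mut cx) {
        Poll::Ready(result) => result,
        Poll::Pending => Err("toy sink is not expected to pend".to_string()),
    }
}
```

Calling `write_all(&mut sink, vec![vec![1, 2], vec![3]])` on a fresh `ChunkSink` leaves both chunks in `sink.written` and marks the sink closed. In real code the `futures::SinkExt` combinators (for example `send` and `close`) would drive these methods instead of a hand-written loop.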

Dexter Duckworth and others added 30 commits February 17, 2022 16:34
When pre-calculating the null_count on a Bitmap we need to start from
the correct place in the underlying byte-array (i.e. take into account
that we may already be looking at a slice). Currently, when we slice off
a small part (so that we enter the first branch of the null_count
choice), the null_count assumes that the current offset is 0, but it
should not.

This adds a test for this situation and fixes the issue.
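
The offset issue described in this commit message can be illustrated with a small, self-contained sketch. It is not the arrow2 implementation: `BitSlice` and `count_unset_bits` are hypothetical names, and the only assumption carried over from Arrow is that bits are stored least-significant-bit first within each byte. The point is that slicing a slice must add the new offset to the existing one, and the null count must start from that accumulated offset.

```rust
/// Counts unset (zero) bits in `len` bits of `bytes`, starting at bit `offset`.
/// Bits are read least-significant-bit first within each byte.
fn count_unset_bits(bytes: &[u8], offset: usize, len: usize) -> usize {
    assert!(offset + len <= bytes.len() * 8, "range exceeds the buffer");
    (offset..offset + len)
        .filter(|&i| (bytes[i / 8] >> (i % 8)) & 1 == 0)
        .count()
}

/// Hypothetical stand-in for a sliceable validity bitmap.
struct BitSlice<'a> {
    bytes: &'a [u8],
    offset: usize, // bit offset into `bytes`, accumulated across slices
    len: usize,    // number of bits visible through this slice
}

impl<'a> BitSlice<'a> {
    fn new(bytes: &'a [u8]) -> Self {
        Self { bytes, offset: 0, len: bytes.len() * 8 }
    }

    /// Slices relative to the current view, so the new offset is added to
    /// the existing one rather than replacing it.
    fn slice(&self, offset: usize, len: usize) -> Self {
        assert!(offset + len <= self.len, "slice out of bounds");
        Self { bytes: self.bytes, offset: self.offset + offset, len }
    }

    /// Counts nulls starting at the accumulated offset, never at bit 0.
    fn null_count(&self) -> usize {
        count_unset_bits(self.bytes, self.offset, self.len)
    }
}
```

With the bytes `[0b0000_1111, 0b1111_0000]`, bits 4 through 11 are unset. Slicing `(4, 8)` and then `(2, 3)` views bits 6, 7, and 8, so the null count is 3; a count that wrongly restarted from offset 0 would inspect bits 2, 3, and 4 and report 1.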
@codecov

codecov bot commented Mar 2, 2022

Codecov Report

Merging #877 (1f0fcbf) into main (eb4bc5d) will increase coverage by 0.17%.
The diff coverage is 69.86%.


@@            Coverage Diff             @@
##             main     #877      +/-   ##
==========================================
+ Coverage   71.50%   71.67%   +0.17%     
==========================================
  Files         335      335              
  Lines       18147    18206      +59     
==========================================
+ Hits        12976    13050      +74     
+ Misses       5171     5156      -15     
| Impacted Files | Coverage Δ |
|---|---|
| src/io/parquet/write/mod.rs | 61.08% <ø> (ø) |
| src/io/parquet/write/sink.rs | 69.86% <69.86%> (ø) |
| src/compute/arithmetics/time.rs | 25.68% <0.00%> (-0.92%) ⬇️ |
| src/bitmap/utils/slice_iterator.rs | 87.93% <0.00%> (+1.72%) ⬆️ |
| src/io/parquet/write/file.rs | 76.00% <0.00%> (+8.00%) ⬆️ |
| src/io/parquet/read/row_group.rs | 98.55% <0.00%> (+24.30%) ⬆️ |
| src/io/parquet/read/mod.rs | 100.00% <0.00%> (+33.33%) ⬆️ |


@jorgecarleitao jorgecarleitao added the enhancement An improvement to an existing feature label Mar 3, 2022
@jorgecarleitao jorgecarleitao changed the title Parquet sink interface implemented futures::Sink for parquet async writer Mar 3, 2022
@jorgecarleitao jorgecarleitao merged commit b9eae79 into jorgecarleitao:main Mar 3, 2022
@jorgecarleitao jorgecarleitao added feature A new feature and removed enhancement An improvement to an existing feature labels Mar 6, 2022
6 participants