Skip to content
This repository has been archived by the owner on Feb 18, 2024. It is now read-only.

Added support to read and write Decimal128 to Avro #837

Merged
merged 5 commits into from
Feb 15, 2022
Merged

Added support to read and write Decimal128 to Avro #837

merged 5 commits into from
Feb 15, 2022

Conversation

potter420
Copy link
Contributor

Hi, this is to partially address #733 .

Added read of Decimal128 from both fixed and bytes schema.

Support writing to bytes schema to avro.

@codecov
Copy link

codecov bot commented Feb 15, 2022

Codecov Report

Merging #837 (ce2a5a7) into main (9e6924b) will decrease coverage by 0.12%.
The diff coverage is 29.62%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #837      +/-   ##
==========================================
- Coverage   71.55%   71.43%   -0.13%     
==========================================
  Files         327      327              
  Lines       17727    17782      +55     
==========================================
+ Hits        12684    12702      +18     
- Misses       5043     5080      +37     
Impacted Files Coverage Δ
src/io/avro/write/schema.rs 66.66% <0.00%> (-1.91%) ⬇️
src/io/avro/write/serialize.rs 74.28% <0.00%> (-10.13%) ⬇️
src/io/avro/read/deserialize.rs 69.91% <50.00%> (-3.13%) ⬇️
src/bitmap/utils/slice_iterator.rs 86.20% <0.00%> (-1.73%) ⬇️
src/io/parquet/read/dictionary.rs 60.46% <0.00%> (-0.72%) ⬇️
src/bitmap/immutable.rs 87.80% <0.00%> (ø)
src/buffer/immutable.rs 96.42% <0.00%> (ø)
src/array/utf8/mutable.rs 81.03% <0.00%> (ø)
src/io/avro/read/schema.rs 73.41% <0.00%> (+3.79%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9e6924b...ce2a5a7. Read the comment docs.

Copy link
Owner

@jorgecarleitao jorgecarleitao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot, @potter420 . Would it be possible to add 2 tests, one in the read where we show that we can read from avro-rs, and one in write, to show that we can roundtrip decimals?

@potter420
Copy link
Contributor Author

potter420 commented Feb 15, 2022

@jorgecarleitao For some reason I can't pass the ipc and parquet test locally, despite running write_parquet.py.
I used this for the write part using the following:

let len = (x.leading_zeros() / 8) as usize;
util::zigzag_encode((16 - len) as i64, buf).unwrap();
buf.extend_from_slice(&x.to_be_bytes()[len..]);

which is wrong. I fixed it by

let len = ((x.leading_zeros() / 8) - ((x.leading_zeros() / 8) % 2)) as usize;

Did some round trip with spark, the result look fine. The problem is, I can't find a way to stabilize the file name. So for the CI, we have to find the file name.

@jorgecarleitao jorgecarleitao merged commit 618a650 into jorgecarleitao:main Feb 15, 2022
@jorgecarleitao jorgecarleitao changed the title Add Decimal128 to avro round trip (Read/Write) Added support to read and write Decimal128 to Avro Feb 15, 2022
@jorgecarleitao jorgecarleitao added the feature A new feature label Feb 15, 2022
@jorgecarleitao
Copy link
Owner

Flawless. Thanks a lot, @potter420 !

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
feature A new feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants