
JSON decoder data corruption for large i64/u64 #653

Closed
xianwill opened this issue Aug 2, 2021 · 0 comments · Fixed by #652

xianwill commented Aug 2, 2021

Describe the bug
Large i64 and u64 values are corrupted in the JSON decoder's build_primitive_array method, which casts them to f64 and back to i64/u64.

To Reproduce
Pass a large i64 value through the decoder as demonstrated in this commit. The converted value comes out slightly smaller. To create a failing test, I passed 1627668684594000000, and the resulting value was 1627668684593999872, a difference of 128.
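The precision loss can be reproduced outside the decoder entirely; a minimal sketch of the round-trip cast (not the decoder code itself, just the same conversion):

```rust
fn main() {
    let original: i64 = 1627668684594000000;

    // f64 has a 52-bit mantissa, so integers with magnitude above 2^53
    // cannot all be represented exactly; this value needs ~61 bits.
    // Casting to f64 rounds it to the nearest representable value,
    // and casting back yields a different integer.
    let round_tripped = (original as f64) as i64;

    assert_eq!(round_tripped, 1627668684593999872);
    assert_eq!(original - round_tripped, 128);
    println!("original:      {original}");
    println!("round-tripped: {round_tripped}");
}
```

At this magnitude the spacing between adjacent f64 values is 256, which is why the observed error is a multiple of 128 rather than 1.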

Expected behavior
The converted value should match the value passed to the decoder. In this case, the value in the created record batch should be 1627668684594000000.

Additional context
I found this bug while implementing timestamp support in kafka-delta-ingest and delta-rs, where valid nanosecond timestamps are on the critical path. An arrow-rs PR with a fix is already open.
