fix: avoid writing statistics for binary columns to fix JSON error #1498
ACTION NEEDED: delta-rs follows the Conventional Commits specification for release automation. The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.
Force-pushed from 6cdcb67 to 070c49b
This file is a bit messy, but your changes make sense. :/
I think you could accomplish the same thing in a more targeted way by removing any binary columns using `collect_stats_conversion` and `apply_stats_conversion`. Use `collect_stats_conversion` to get which columns are binary, and then apply to remove them from the value (can use `serde_json::Value::remove()`). LMK if that works instead.
Unfortunately you can't do this with stats conversion: the error occurs when making the ReaderBuilder, not when applying the values.
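To illustrate the idea behind the PR title (skip binary columns before any statistics are written, rather than stripping them afterwards), here is a minimal sketch. `ColumnType`, `Column`, and `stats_columns` are hypothetical names made up for this example; they are not the delta-rs or Arrow types, and the real change in `rust/src/writer/stats.rs` may look quite different.

```rust
// Hypothetical, simplified column types; NOT the Arrow or delta-rs schema types.
#[derive(Debug, Clone, PartialEq)]
enum ColumnType {
    Int64,
    Utf8,
    Binary,
}

// Hypothetical column descriptor used only for this sketch.
struct Column {
    name: String,
    data_type: ColumnType,
}

/// Return the names of the columns that should get min/max statistics.
/// Binary columns are filtered out up front, so no binary value ever
/// reaches the JSON stats that are parsed later.
fn stats_columns(schema: &[Column]) -> Vec<&str> {
    schema
        .iter()
        .filter(|c| c.data_type != ColumnType::Binary)
        .map(|c| c.name.as_str())
        .collect()
}
```

With a schema of `id: Int64`, `payload: Binary`, `label: Utf8`, this returns only `id` and `label`, so the binary column never contributes a stats entry.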
rust/src/writer/stats.rs
Outdated
let str_val = val.to_string();
let decimal_string = if str_val.len() > *scale as usize {
    let (integer_part, fractional_part) =
        str_val.split_at(str_val.len() - *scale as usize);
    format!("{}.{}", integer_part, fractional_part)
} else {
    format!("0.{}", str_val)
};
I think we also want to have negative scales. Also, could we unit test this directly in this file?
Added support for negative scales. Not going to be able to add tests for a bit. I added you to my fork so you can commit if you want to take a crack at it.
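For reference, a rough sketch of what decimal formatting with negative scales could look like, written so it can be unit tested on its own. `format_decimal` is a hypothetical helper, not the code that landed in this PR; it assumes the value arrives as an unscaled integer and that a negative scale means multiplying by a power of ten. It also pads with leading zeros when the value has fewer digits than the scale, and handles the sign separately.

```rust
/// Hypothetical helper: render an unscaled decimal integer as a string.
/// `scale > 0` places that many digits after the decimal point;
/// `scale < 0` appends `-scale` zeros (value * 10^(-scale)).
fn format_decimal(unscaled: i128, scale: i32) -> String {
    let negative = unscaled < 0;
    // Work on the digits of the absolute value so the sign cannot
    // end up inside the fractional part.
    let digits = unscaled.unsigned_abs().to_string();

    let body = if scale <= 0 {
        if digits == "0" {
            // Zero stays "0" regardless of scale.
            digits
        } else {
            format!("{}{}", digits, "0".repeat(scale.unsigned_abs() as usize))
        }
    } else {
        let scale = scale as usize;
        if digits.len() > scale {
            let (integer_part, fractional_part) = digits.split_at(digits.len() - scale);
            format!("{}.{}", integer_part, fractional_part)
        } else {
            // Left-pad with zeros so that e.g. (5, 3) becomes "0.005".
            format!("0.{:0>width$}", digits, width = scale)
        }
    };

    if negative {
        format!("-{}", body)
    } else {
        body
    }
}
```

Under these assumptions, `format_decimal(12345, 2)` gives `"123.45"`, `format_decimal(5, 3)` gives `"0.005"`, and `format_decimal(123, -2)` gives `"12300"`.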
Force-pushed from 7fa16f5 to 3877770
We have a new failure for decimal, but I'm a little stumped on it for now.
I think for decimal, we might be able to start using the
Looks promising to me, I'm looking forward to this being merged
Co-authored-by: R. Tyler Croy <rtyler@brokenco.de>
@rtyler - I fear it may have been a bit too early to merge this, since the Rust tests are failing for it. Since 0.13 was just released, we should look into releasing a fix ASAP. Not sure if @wjones127 already knows what's going on? Otherwise I might look into it once I'm available again.
@roeap Yes, I'm looking at this now. I must have been moving too quickly through some PRs. I now see that the test was failing but I'm not sure why I merged this 🤦 I must have mistaken this pull request for another one which I had open at the time. I am going to land a revert into
…error (delta-io#1498)" This reverts commit 312d1c2. I should not have merged this, I must have mistaken the red ❌ pull request delta-io#1498 for something else when I merged. This backs out that commit
Description
Avoid writing statistics for binary columns to fix JSON error thrown by Arrow
Related Issue(s)