Change ArrayDataBuilder::null_bit_buffer to accept Option<Buffer> rather than Buffer #1739
Codecov Report
@@ Coverage Diff @@
## master #1739 +/- ##
==========================================
- Coverage 83.36% 83.35% -0.01%
==========================================
Files 196 196
Lines 56147 56122 -25
==========================================
- Hits 46805 46782 -23
+ Misses 9342 9340 -2
I'm fine with this, but as this is a breaking change should probably get feedback from others.
Left a suggestion on how to make the null_count > 0 construction that appears in various places less verbose.
arrow/src/array/builder.rs
builder = builder.null_bit_buffer(null_bit_buffer);
}
.add_buffer(self.values_builder.finish())
.null_bit_buffer(if null_count > 0 {
This could be written as (null_count > 0).then(|| null_bit_buffer)
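A minimal sketch of the suggestion above, assuming a `Vec<u8>` alias as a stand-in for arrow's `Buffer` so the snippet has no dependencies. The alias and both function names are hypothetical and not part of the PR; only `bool::then` is taken from the comment.

```rust
/// Stand-in for arrow's `Buffer` (assumption: a plain byte vector is enough here).
type Buffer = Vec<u8>;

/// Verbose form: the explicit `if`/`else` that the diff spells out.
fn bitmap_if_else(null_count: usize, null_bit_buffer: Buffer) -> Option<Buffer> {
    if null_count > 0 {
        Some(null_bit_buffer)
    } else {
        None
    }
}

/// Compact form: `bool::then` returns `Some(closure result)` when the
/// condition is true and `None` otherwise.
fn bitmap_then(null_count: usize, null_bit_buffer: Buffer) -> Option<Buffer> {
    (null_count > 0).then(|| null_bit_buffer)
}
```

Both functions should return the same value for any input; the second just drops the boilerplate branches.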
Updated!
Signed-off-by: remzi <13716567376yh@gmail.com>
I think this looks like a very nice change. While it will result in some code churn, I think the API is low-level enough that it should not be super widely used, and the churn will result in cleaner code downstream 👍
Thank you @HaoYang670
cc @jhorstmann (whose project I think uses these lower level APIs)
arrow/src/compute/kernels/string.rs
@@ -74,15 +74,11 @@ pub fn string_concat<Offset: OffsetSizeTrait>(
output_offsets.append(Offset::from_usize(output_values.len()).unwrap());
}

let mut builder =
Since this code is moved, this PR now has a conflict sadly
Updated!
@@ -287,24 +287,22 @@ fn create_primitive_array(
let array_data = match data_type {
Utf8 | Binary | LargeBinary | LargeUtf8 => {
// read 3 buffers
let mut builder = ArrayData::builder(data_type.clone())
ArrayData::builder(data_type.clone())
👍 the new pattern certainly looks nicer in my opinion
Fully agree, this looks much more fluent than before.
builder = builder.null_bit_buffer(null_bit_buffer.unwrap());
}
.add_buffer(self.values_builder.finish())
.null_bit_buffer(if null_count > 0 {
Missed a .then opportunity here FWIW. Same for line below
.then is not suitable here because the type of null_bit_buffer is Option<Buffer>. Using .then would produce an Option<Option<Buffer>>.
Previously this called .unwrap(), which is probably safe given the null count, although I can't remember what NullArrays do. Either way you could then just call .flatten(). Not a big deal 😁
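To make the exchange above concrete, here is a small sketch of the nesting problem and the `.flatten()` fix. It assumes a `Vec<u8>` alias in place of arrow's `Buffer`, and all three function names are hypothetical. The `filter` variant is an alternative that was not proposed in the thread, shown only for comparison.

```rust
/// Stand-in for arrow's `Buffer` (assumption for this sketch).
type Buffer = Vec<u8>;

/// `.then` on its own wraps the already-optional buffer a second time.
fn nested(null_count: usize, null_bit_buffer: Option<Buffer>) -> Option<Option<Buffer>> {
    (null_count > 0).then(|| null_bit_buffer)
}

/// `Option::flatten` collapses the two layers back into one.
fn flattened(null_count: usize, null_bit_buffer: Option<Buffer>) -> Option<Buffer> {
    (null_count > 0).then(|| null_bit_buffer).flatten()
}

/// Alternative (not from the thread): keep the buffer only if the condition holds.
fn filtered(null_count: usize, null_bit_buffer: Option<Buffer>) -> Option<Buffer> {
    null_bit_buffer.filter(|_| null_count > 0)
}
```

Unlike the earlier `.unwrap()`, neither `flattened` nor `filtered` panics when the buffer is `None` while `null_count > 0`; they simply return `None`.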
Signed-off-by: remzi <13716567376yh@gmail.com>
Which issue does this PR close?
Closes #1737.
Rationale for this change
[...] null_bit_buffer, so that users know null_bit_buffer is an optional field.
Instead, we can directly write like this now:
[...] muts are safer. Less code is better.
What changes are included in this PR?
Change the type of buf from Buffer to Option<Buffer>.
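As a rough illustration of what the signature change means for callers, the sketch below uses a toy builder rather than the real arrow types. `ToyData`, `ToyBuilder`, `build_values` and the `Vec<u8>` alias are all hypothetical stand-ins; only the method names `add_buffer` and `null_bit_buffer` and the `Option<Buffer>` parameter are taken from the PR.

```rust
/// Stand-in for arrow's `Buffer` (assumption for this sketch).
type Buffer = Vec<u8>;

/// Toy counterpart of the built array data: value buffers plus an optional validity bitmap.
#[derive(Debug, Default, PartialEq)]
struct ToyData {
    buffers: Vec<Buffer>,
    null_bit_buffer: Option<Buffer>,
}

/// Toy by-value builder, loosely modelled on the fluent style seen in the diffs.
#[derive(Default)]
struct ToyBuilder {
    data: ToyData,
}

impl ToyBuilder {
    fn add_buffer(mut self, buffer: Buffer) -> Self {
        self.data.buffers.push(buffer);
        self
    }

    /// Takes `Option<Buffer>`, so callers can pass an optional bitmap straight through.
    fn null_bit_buffer(mut self, buf: Option<Buffer>) -> Self {
        self.data.null_bit_buffer = buf;
        self
    }

    fn build(self) -> ToyData {
        self.data
    }
}

/// One fluent chain: no `let mut builder` and no `if let Some(..)` reassignment.
fn build_values(values: Buffer, validity: Option<Buffer>) -> ToyData {
    ToyBuilder::default()
        .add_buffer(values)
        .null_bit_buffer(validity)
        .build()
}
```

With a `Buffer` parameter, the caller would need a mutable builder and a conditional call to skip the bitmap; with `Option<Buffer>`, the absence of a bitmap is just `None`.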
Are there any user-facing changes?
Yes!