-
Notifications
You must be signed in to change notification settings - Fork 847
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add FixedLengthByteArrayReader Remove ComplexObjectArrayReader #2528
Add FixedLengthByteArrayReader Remove ComplexObjectArrayReader #2528
Conversation
@tustvold Thanks for your refactor, I will review the pr and dependent pr tomorrow, it's too later for me today. |
b810ec4
to
9d8692d
Compare
9d8692d
to
f70cde3
Compare
Marking ready for review, still need to add the interval conversion, but I've run out of time today, and the meat of the PR is ready for review |
run_single_column_reader_tests::<FixedLenByteArrayType, _, RandFixedLenGen>( | ||
20, | ||
ConvertedType::NONE, | ||
None, | ||
|vals| Arc::new(converter.convert(vals.to_vec()).unwrap()), | ||
|vals| { | ||
let mut builder = FixedSizeBinaryBuilder::with_capacity(vals.len(), 20); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to use the builder so that we can specify the length, otherwise empty pages result in errors
https://github.com/apache/arrow-rs/runs/7969684590?check_suite_focus=true
@Ted-Jiang @thinkharderdev perhaps one of you might be able to give this a review? |
…ject-array-reader
cool! @tustvold thanks, this seems a huge pr, may take some times to review 😄 I will try finish today. |
@@ -47,7 +47,6 @@ mod test_util; | |||
pub use builder::build_array_reader; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's better to public the make_fixed_len_byte_array_reader
like build_array_reader
for byte_array.
Benchmark and other code will use this method to create reader.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually a historical quirk, I intend to make this module private soon
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But how to create reader in benchmark?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean ideally you'd use the normal public API, but perhaps I'm getting ahead of myself...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LTGM
Just one comment for the public API
It's ready to use the refactor code to re-run the benchmark with the #2529
V: ValuesBuffer, | ||
CV: ColumnValueDecoder<Slice = V::Slice>, | ||
{ | ||
pub fn new_with_records(desc: ColumnDescPtr, records: V) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should use pub (crate)
here 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RecordReader itself is crate private
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, got it.
} | ||
} | ||
|
||
struct ValueDecoder { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Benchmark runs are scheduled for baseline = 903d892 and contender = 8e4a455. 8e4a455 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
Draft as is still missing some minor functionality
Which issue does this PR close?
Closes #1661
Closes #1698
Closes #2253
Closes #2318
Rationale for this change
See tickets, in short ComplexObjectArrayReader doesn't understand nesting, is slow, and can finally be removed
What changes are included in this PR?
Adds a FixedLengthByteArrayReader, and uses it to remove the last usages of ComplexObjectArrayReader
Are there any user-facing changes?