fix RowWriter index out of bounds error #2968
Conversation
@yjshen please check this PR draft, as I noticed the row_writer logic was mostly done by you.
Codecov Report
@@ Coverage Diff @@
## master #2968 +/- ##
=======================================
Coverage 85.75% 85.76%
=======================================
Files 281 281
Lines 51494 51511 +17
=======================================
+ Hits 44161 44179 +18
+ Misses 7333 7332 -1
Hi, I'm the one who reported the bug. If it is necessary for the resizing logic to decide on the basis of the length, not the capacity, of the … I wonder if …
Per my note on #2963, this PR fixes that issue.
@yjshen PTAL
I'm aware of this PR but ran out of time today. Will come back tomorrow morning. |
datafusion/core/src/dataframe.rs (outdated)

// let df = ctx.sql("SELECT * FROM test").await.unwrap();
// df.show_limit(10).await.unwrap();
// dbg!(df.schema());
Should we leave it in?
We'd better remove this, I suppose.
Thanks @comphead!
I propose we fix the following:
- The new_width > to.data.len() change makes sense to me, but on the resize part:
if new_width > to.data.len() {
to.data.resize(max(to.data.capacity(), new_width), 0);
}
I think we could just resize to the capacity to avoid reallocating, instead of the previous double-the-capacity approach.
- write_field_binary should be adapted accordingly as well.
- The capacity() bug also exists in the end_padding logic; it should be:
/// End each row at 8-byte word boundary.
pub(crate) fn end_padding(&mut self) {
let payload_width = self.current_width();
self.row_width = round_upto_power_of_2(payload_width, 8);
if self.data.len() < self.row_width {
self.data.resize(self.row_width, 0);
}
}
@liukun4515 I cannot quite get your +8 logic for calculating size in the comments above, and I didn't get a panic with your 110 empty strings case either. Please correct me if I have missed something important.
@@ -269,29 +269,29 @@ impl RowWriter {

 /// Stitch attributes of tuple in `batch` at `row_idx` and returns the tuple width
 pub fn write_row(
-    row: &mut RowWriter,
+    row_writer: &mut RowWriter,
This renaming is unrelated, I think?
I thought it's better naming: when reading the code I supposed row was either the incoming row or a row buffer, and neither is correct. row_writer better reflects the object, imho.
Sorry for the erroneous comments; I had a wrong understanding of the row layout.
datafusion/row/src/writer.rs (outdated)

@@ -349,9 +349,9 @@ pub(crate) fn write_field_utf8(
     let from = from.as_any().downcast_ref::<StringArray>().unwrap();
     let s = from.value(row_idx);
     let new_width = to.current_width() + s.as_bytes().len();
-    if new_width > to.data.capacity() {
+    if new_width > to.data.len() {
         // double the capacity to avoid repeated resize
outdated code comment
datafusion/row/src/writer.rs (outdated)

@@ -365,9 +365,9 @@ pub(crate) fn write_field_binary(
     let from = from.as_any().downcast_ref::<BinaryArray>().unwrap();
     let s = from.value(row_idx);
     let new_width = to.current_width() + s.len();
-    if new_width > to.data.capacity() {
+    if new_width > to.data.len() {
         // double the capacity to avoid repeated resize
same here
@alamb want to take another look?
looks great -- thank you all for your help and attention to this matter
Benchmark runs are scheduled for baseline = 176f432 and contender = 811bad4. 811bad4 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Which issue does this PR close?
Closes #2910, #2963.
Rationale for this change
A weird bug was reported; it can be reproduced locally in row_writer_resize_test. When a column is not nullable, 1 extra byte is added to the initial offset, which breaks the buffer resize logic and causes an out-of-range panic on self.data[varlena_offset..varlena_offset + size].copy_from_slice(bytes);
What changes are included in this PR?
Are there any user-facing changes?