Implement RecordBatch::concat
#537
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #537 +/- ##
==========================================
+ Coverage 82.60% 82.62% +0.01%
==========================================
Files 167 167
Lines 45984 46042 +58
==========================================
+ Hits 37984 38041 +57
- Misses 8000 8001 +1
View full report in Codecov by Sentry.
Thank you @silathdiir!
I think the error handling should be stricter when the schemas don't match, but otherwise this looks great.
It looks like @jorgecarleitao (#461 (comment)) has a suggestion to move this function to arrow::compute::concat::concat_batches rather than keeping it in record_batch.rs. What do you think?
Although @nevi-me seems to like
Hi @alamb, referring to @jorgecarleitao's latest comment in #461 (comment), it seems that
I think this PR looks good to merge from my perspective, but I am not sure whether we should move the concat function to arrow::compute. @jorgecarleitao and @nevi-me, what do you think we should do here?
Thanks @silathdiir!
Unless I hear different, I plan to merge this PR tomorrow (and include it in 5.0.0)
Thanks again @silathdiir!
Which issue does this PR close?
Closes #461.
Rationale for this change
As described in the issue, this PR tries to implement RecordBatch::concat according to https://github.com/apache/arrow-datafusion/blob/master/datafusion/src/physical_plan/coalesce_batches.rs#L232 .
What changes are included in this PR?
Adds a new function concat to struct RecordBatch, along with test cases.
Are there any user-facing changes?
With this change, a new RecordBatch can be created by concatenating multiple RecordBatches.
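For readers who want a feel for what concatenating batches involves, here is a minimal sketch using toy stand-in types. It is not the code from this PR and does not use the arrow crate: Field, DataType, Column, ToyBatch and concat_batches are all hypothetical names made up for illustration. It assumes the strict behaviour suggested in the review above, where any batch whose schema differs from the expected one is rejected with an error.

```rust
// Toy stand-ins for Arrow's schema and array types (hypothetical, NOT the arrow crate API).
#[derive(Debug, Clone, Copy, PartialEq)]
enum DataType {
    Int32,
    Utf8,
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
}

#[derive(Debug, Clone, PartialEq)]
enum Column {
    Int32(Vec<i32>),
    Utf8(Vec<String>),
}

#[derive(Debug, Clone, PartialEq)]
struct ToyBatch {
    schema: Vec<Field>,
    columns: Vec<Column>,
}

/// Concatenate `batches` row-wise into one batch with the given `schema`.
/// Strict by assumption: a batch whose schema differs is an error.
fn concat_batches(schema: &[Field], batches: &[ToyBatch]) -> Result<ToyBatch, String> {
    // Validate every input up front so no partial result is ever built.
    for (i, batch) in batches.iter().enumerate() {
        if batch.schema != schema || batch.columns.len() != schema.len() {
            return Err(format!("batch {} does not match the expected schema", i));
        }
    }

    // Start with one empty output column per field, typed by the schema.
    let mut columns: Vec<Column> = schema
        .iter()
        .map(|field| match field.data_type {
            DataType::Int32 => Column::Int32(Vec::new()),
            DataType::Utf8 => Column::Utf8(Vec::new()),
        })
        .collect();

    // Append each batch's column values to the matching output column.
    for batch in batches {
        for (dst, src) in columns.iter_mut().zip(&batch.columns) {
            match (dst, src) {
                (Column::Int32(d), Column::Int32(s)) => d.extend_from_slice(s),
                (Column::Utf8(d), Column::Utf8(s)) => d.extend(s.iter().cloned()),
                _ => return Err("column data does not match its declared type".to_string()),
            }
        }
    }

    Ok(ToyBatch {
        schema: schema.to_vec(),
        columns,
    })
}
```

Passing the expected schema explicitly, rather than taking it from the first batch, means an empty input still yields a well-typed empty batch. Whether the real function should work that way is a design choice this sketch does not settle.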