[Data] Fix `OutputBlockBuffer` to avoid repeatedly copying remainder block (ray-project#48266)

## Why are these changes needed?

Currently, inside `OutputBlockBuffer` we are:

1. Repeatedly copying the remainder of the original block, bringing the total number of bytes copied to O(N^2) (where N is the size of the original block).
2. Creating potentially very large blocks (as in ray-project#48236) that could overflow the underlying Arrow data types.

This change addresses both issues by establishing the following protocol:

1. Finalized target blocks *are* copied, while
2. The remainder block is NOT copied (it therefore continues to reference the original block).

## Related issue number

Addresses ray-project#48236

## Checks

- [ ] I've signed off every commit (by using the -s flag, i.e., `git commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
  - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
  - [ ] Unit tests
  - [ ] Release tests
  - [ ] This PR is not tested :(

---------

Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
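
## Illustrative sketch

The copy protocol described above can be illustrated with a minimal, standard-library-only sketch. This is not the Ray implementation: the function names `split_copying_remainder` and `split_viewing_remainder` are hypothetical, and a plain `bytes` buffer with a `memoryview` stands in for an Arrow block and its zero-copy slices. It only shows why copying the remainder on every step is quadratic, while copying just the finalized blocks stays linear.

```python
from typing import List, Tuple


def split_copying_remainder(block: bytes, target_size: int) -> Tuple[List[bytes], int]:
    """Old behavior (sketch): the remainder is re-copied after every emitted block."""
    chunks: List[bytes] = []
    copied = 0
    remainder = block
    while len(remainder) > target_size:
        chunks.append(remainder[:target_size])  # copy of the finalized block
        copied += target_size
        remainder = remainder[target_size:]     # slicing bytes copies the rest
        copied += len(remainder)
    if remainder:
        chunks.append(remainder)
    return chunks, copied


def split_viewing_remainder(block: bytes, target_size: int) -> Tuple[List[bytes], int]:
    """New protocol (sketch): finalized blocks are copied, the remainder is a view."""
    chunks: List[bytes] = []
    copied = 0
    remainder = memoryview(block)               # zero-copy view of the original
    while len(remainder) > target_size:
        chunks.append(bytes(remainder[:target_size]))  # finalized block: copied
        copied += target_size
        remainder = remainder[target_size:]     # still references the original
    if len(remainder) > 0:
        chunks.append(bytes(remainder))         # last block is finalized too
        copied += len(remainder)
    return chunks, copied


if __name__ == "__main__":
    data = bytes(range(256)) * 4
    _, old_cost = split_copying_remainder(data, 128)
    _, new_cost = split_viewing_remainder(data, 128)
    print(f"bytes copied: old={old_cost}, new={new_cost}, block size={len(data)}")
```

In this sketch both functions produce the same output blocks; only the number of bytes copied differs. Each finalized block is an independent copy, so it does not keep the original block alive once the remainder has been consumed.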