@iamjustinhsu iamjustinhsu commented Nov 26, 2025

Description

Context: #58694 (comment)
PR that changed this behavior: https://github.com/ray-project/ray/pull/52806/files

Before this PR, BlockRefBundler could silently drop bundles on finalize, although the way we currently use it* means this cannot actually happen. I still think we should fix it to make the behavior explicit, so I added a break statement that exits the loop once the bundle target size is met.

*Context: TaskPoolMapOperator and ActorPoolMapOperator currently queue each bundle and eagerly try to launch a task with the bundled input. As a result, the bundled input always contains every bundle currently held in BlockRefBundler (you can think of it as BlockRefBundler never having time to build up a backlog of bundles, because a ready bundle is launched as soon as it exists). So there are never any remainders (see code), and execution never reaches the else statement at map_operator.py:750 in BlockRefBundler.
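To make the failure mode concrete, here is a minimal sketch of the loop shape described above. It is not the actual Ray code: `split_buffer`, `min_rows`, and `use_break` are hypothetical names, bundles are represented only by their row counts, and the `if` condition is an assumption about how the target size is checked.

```python
from typing import List, Tuple


def split_buffer(
    bundle_sizes: List[int], min_rows: int, use_break: bool
) -> Tuple[List[int], List[int]]:
    """Split buffered bundle sizes into (output, remainder).

    Hypothetical, simplified stand-in for the loop in
    BlockRefBundler.get_next_bundle; not the real implementation.
    """
    output: List[int] = []
    output_size = 0
    remainder: List[int] = []
    for idx, size in enumerate(bundle_sizes):
        # Assumed condition: keep filling until the target size is met.
        if output_size < min_rows or output_size == 0:
            output.append(size)
            output_size += size
        else:
            # Everything from idx onward should be kept for later.
            remainder = bundle_sizes[idx:]
            if use_break:
                break
            # Without the break, the next iteration overwrites
            # `remainder` with a shorter slice and loses a bundle.
    return output, remainder


# Four 5-row bundles with a 5-row target.
without_break = split_buffer([5, 5, 5, 5], min_rows=5, use_break=False)
with_break = split_buffer([5, 5, 5, 5], min_rows=5, use_break=True)
```

Without the break, the output holds one bundle and the remainder ends up holding only the last bundle, so two of the four bundles disappear. With the break, the remainder keeps all three leftover bundles.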

Related issues

Additional information

Signed-off-by: iamjustinhsu <jhsu@anyscale.com>
@iamjustinhsu iamjustinhsu requested a review from a team as a code owner November 26, 2025 23:00
        output_buffer_size += bundle_size
    else:
        remainder = self._bundle_buffer[idx:]
        break
Contributor Author

If I remove the break statement, the test below fails.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request addresses a bug in BlockRefBundler where bundles could be silently dropped during finalization. The fix involves adding a break statement in get_next_bundle to correctly handle bundle remainders. A new test case, test_block_ref_bundler_finalize_drains_all, has been added to verify that all buffered data is drained correctly upon finalization, preventing regressions. The changes are correct and the added test is comprehensive for the scenario it covers. The pull request is well-structured and effectively resolves the identified issue.
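As a rough illustration of what a "finalize drains all" check verifies, the sketch below uses a toy bundler and asserts that every buffered row comes back out after finalization. It is not the PR's test_block_ref_bundler_finalize_drains_all: `ToyBundler`, `finalize`, and `drain_all` are hypothetical names, and bundles are modeled as plain row counts.

```python
from typing import List


class ToyBundler:
    """Hypothetical, simplified stand-in for BlockRefBundler."""

    def __init__(self, min_rows: int) -> None:
        self._min_rows = min_rows
        self._buffer: List[int] = []
        self._finalized = False

    def add_bundle(self, rows: int) -> None:
        self._buffer.append(rows)

    def finalize(self) -> None:
        # After this, undersized leftovers may be emitted too.
        self._finalized = True

    def has_bundle(self) -> bool:
        if not self._buffer:
            return False
        return self._finalized or sum(self._buffer) >= self._min_rows

    def get_next_bundle(self) -> List[int]:
        output: List[int] = []
        output_size = 0
        remainder: List[int] = []
        for idx, rows in enumerate(self._buffer):
            if output_size < self._min_rows or output_size == 0:
                output.append(rows)
                output_size += rows
            else:
                remainder = self._buffer[idx:]
                break  # stop once the target size is met
        self._buffer = remainder
        return output


def drain_all(bundler: ToyBundler) -> List[List[int]]:
    """Pull bundles until the bundler reports nothing left."""
    drained: List[List[int]] = []
    while bundler.has_bundle():
        drained.append(bundler.get_next_bundle())
    return drained


bundler = ToyBundler(min_rows=4)
for rows in [2, 2, 2, 2, 1]:
    bundler.add_bundle(rows)
bundler.finalize()
drained = drain_all(bundler)
total_rows = sum(sum(bundle) for bundle in drained)
```

The key assertion in a test of this shape is that the total row count drained equals the total row count added, which is exactly what fails when the loop overwrites its remainder.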

Signed-off-by: iamjustinhsu <jhsu@anyscale.com>
Signed-off-by: iamjustinhsu <jhsu@anyscale.com>
@ray-gardener ray-gardener bot added the data Ray Data-related issues label Nov 27, 2025
self._block_ref_bundler.add_bundle(refs)
self._metrics.on_input_queued(refs)

if self._block_ref_bundler.has_bundle():
Member

Should we also use while here?

Contributor Author

I don't think we need to change it (in this PR at least). With while, we would launch one task per available bundle, so it could change execution behavior.
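The difference between the two forms can be sketched as follows, assuming a backlog of ready bundles already exists (which, per the PR description, does not happen with current usage). `dispatch` and `drain_with_while` are hypothetical names, and "launching a task" is reduced to incrementing a counter.

```python
from collections import deque
from typing import Deque, List


def dispatch(ready: Deque[List[int]], drain_with_while: bool) -> int:
    """Return how many tasks would be launched for the ready bundles.

    Hypothetical model: each popped bundle stands for one launched task.
    """
    launched = 0
    if drain_with_while:
        # `while`: one task per available bundle.
        while ready:
            ready.popleft()
            launched += 1
    elif ready:
        # `if`: at most one task per call, the rest stays queued.
        ready.popleft()
        launched += 1
    return launched


launched_with_if = dispatch(deque([[1], [2], [3]]), drain_with_while=False)
launched_with_while = dispatch(deque([[1], [2], [3]]), drain_with_while=True)
```

With a three-bundle backlog, the `if` form launches one task and leaves two bundles queued, while the `while` form launches three tasks in the same call.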

Contributor Author

@owenowenisme I'm going to solve this issue in this big PR: #59093

@iamjustinhsu iamjustinhsu deleted the jhsu/drain-all-inputs-of-block-ref-bundlers branch December 2, 2025 23:59