Skip to content

Conversation

@goutamvenkat-anyscale
Copy link
Contributor

Why are these changes needed?

Limit operator optimization. When the limit operator is set to K, ensure to limit the number of rows read per block and bound it by K.

Related issue number

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Goutam V <goutam@anyscale.com>
@goutamvenkat-anyscale goutamvenkat-anyscale added the data Ray Data-related issues label Aug 5, 2025
@goutamvenkat-anyscale goutamvenkat-anyscale requested a review from a team as a code owner August 5, 2025 04:53
@gemini-code-assist
Copy link
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@goutamvenkat-anyscale goutamvenkat-anyscale added the go add ONLY when ready to merge, run all tests label Aug 5, 2025
Signed-off-by: Goutam V <goutam@anyscale.com>
Signed-off-by: Goutam V <goutam@anyscale.com>
Signed-off-by: Goutam V <goutam@anyscale.com>
Signed-off-by: Goutam V <goutam@anyscale.com>
Signed-off-by: Goutam V <goutam@anyscale.com>
Signed-off-by: Goutam V <goutam@anyscale.com>
Signed-off-by: Goutam V <goutam@anyscale.com>
Signed-off-by: Goutam V <goutam@anyscale.com>
Signed-off-by: Goutam V <goutam@anyscale.com>
Signed-off-by: Goutam V <goutam@anyscale.com>
Signed-off-by: Goutam V <goutam@anyscale.com>
Signed-off-by: Goutam V <goutam@anyscale.com>
@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had
any activity for 14 days. It will be closed in another 14 days if no further activity occurs.
Thank you for your contributions.

You can always ask for help on our discussion forum or Ray's public slack channel.

If you'd like to keep this open, just leave any comment, and the stale label will be removed.

@github-actions github-actions bot added the stale The issue is stale. It will be closed within 7 days unless there are further conversation label Sep 12, 2025
@goutamvenkat-anyscale goutamvenkat-anyscale removed the stale The issue is stale. It will be closed within 7 days unless there are further conversation label Sep 15, 2025
Signed-off-by: Goutam V. <goutam@anyscale.com>
Signed-off-by: Goutam V. <goutam@anyscale.com>
Signed-off-by: Goutam V. <goutam@anyscale.com>
Comment on lines 832 to 840
existing_transform_fns = map_transformer.get_transform_fns()
new_transform_fns = existing_transform_fns + [limit_transform_fn]

# Create new transformer with the limit added
new_transformer = MapTransformer(
new_transform_fns,
init_fn=map_transformer._init_fn,
output_block_size_option_override=map_transformer._output_block_size_option_override,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Follow-up) Let's modify add_transform_fns to actually do what you're doing here instead of modifying in-place

Signed-off-by: Goutam V. <goutam@anyscale.com>
map_task_kwargs: Optional[Dict[str, Any]] = None,
ray_remote_args_fn: Optional[Callable[[], Dict[str, Any]]] = None,
ray_remote_args: Optional[Dict[str, Any]] = None,
per_block_limit: Optional[int] = None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
per_block_limit: Optional[int] = None,
total_num_rows_limit: Optional[int] = None,

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, for the back and forth, but on a second thought now per-block limit is a misnomer (since we're not limiting per block, but for the whole output)

Please update everywhere

@alexeykudinkin alexeykudinkin merged commit 7c35b2e into ray-project:master Sep 26, 2025
6 checks passed
elliot-barn pushed a commit that referenced this pull request Sep 27, 2025
<!-- Thank you for your contribution! Please review
https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before
opening a pull request. -->

<!-- Please add a reviewer to the assignee section when you create a PR.
If you don't have the access to it, we will shortly find a reviewer and
assign them to your PR. -->

## Why are these changes needed?

Limit operator optimization. When the limit operator is set to K, ensure
to limit the number of rows read per block and bound it by K.

## Related issue number

<!-- For example: "Closes #1234" -->

## Checks

- [x] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [x] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Goutam V <goutam@anyscale.com>
Signed-off-by: Goutam V. <goutam@anyscale.com>
Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>
@goutamvenkat-anyscale goutamvenkat-anyscale deleted the goutam/per_block_limit branch October 3, 2025 23:08
dstrodtman pushed a commit that referenced this pull request Oct 6, 2025
<!-- Thank you for your contribution! Please review
https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before
opening a pull request. -->

<!-- Please add a reviewer to the assignee section when you create a PR.
If you don't have the access to it, we will shortly find a reviewer and
assign them to your PR. -->

## Why are these changes needed?

Limit operator optimization. When the limit operator is set to K, ensure
to limit the number of rows read per block and bound it by K.

## Related issue number

<!-- For example: "Closes #1234" -->

## Checks

- [x] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [x] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Goutam V <goutam@anyscale.com>
Signed-off-by: Goutam V. <goutam@anyscale.com>
Signed-off-by: Douglas Strodtman <douglas@anyscale.com>
joshkodi pushed a commit to joshkodi/ray that referenced this pull request Oct 13, 2025
)

<!-- Thank you for your contribution! Please review
https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before
opening a pull request. -->

<!-- Please add a reviewer to the assignee section when you create a PR.
If you don't have the access to it, we will shortly find a reviewer and
assign them to your PR. -->

## Why are these changes needed?

Limit operator optimization. When the limit operator is set to K, ensure
to limit the number of rows read per block and bound it by K.

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [x] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [x] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Goutam V <goutam@anyscale.com>
Signed-off-by: Goutam V. <goutam@anyscale.com>
Signed-off-by: Josh Kodi <joshkodi@gmail.com>
chenghao-guo added a commit to lance-format/lance-ray that referenced this pull request Oct 17, 2025
close #50 

Fix TypeError: LanceDatasource.get_read_tasks() got an unexpected
keyword argument 'per_task_row_limit'
due to ray-project/ray#55239 PR in ray 2.50+
version
justinyeh1995 pushed a commit to justinyeh1995/ray that referenced this pull request Oct 20, 2025
)

<!-- Thank you for your contribution! Please review
https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before
opening a pull request. -->

<!-- Please add a reviewer to the assignee section when you create a PR.
If you don't have the access to it, we will shortly find a reviewer and
assign them to your PR. -->

## Why are these changes needed?

Limit operator optimization. When the limit operator is set to K, ensure
to limit the number of rows read per block and bound it by K.

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [x] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [x] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Goutam V <goutam@anyscale.com>
Signed-off-by: Goutam V. <goutam@anyscale.com>
landscapepainter pushed a commit to landscapepainter/ray that referenced this pull request Nov 17, 2025
)

<!-- Thank you for your contribution! Please review
https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before
opening a pull request. -->

<!-- Please add a reviewer to the assignee section when you create a PR.
If you don't have the access to it, we will shortly find a reviewer and
assign them to your PR. -->

## Why are these changes needed?

Limit operator optimization. When the limit operator is set to K, ensure
to limit the number of rows read per block and bound it by K.

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [x] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [x] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Goutam V <goutam@anyscale.com>
Signed-off-by: Goutam V. <goutam@anyscale.com>
Aydin-ab pushed a commit to Aydin-ab/ray-aydin that referenced this pull request Nov 19, 2025
)

<!-- Thank you for your contribution! Please review
https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before
opening a pull request. -->

<!-- Please add a reviewer to the assignee section when you create a PR.
If you don't have the access to it, we will shortly find a reviewer and
assign them to your PR. -->

## Why are these changes needed?

Limit operator optimization. When the limit operator is set to K, ensure
to limit the number of rows read per block and bound it by K.

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [x] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [x] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Goutam V <goutam@anyscale.com>
Signed-off-by: Goutam V. <goutam@anyscale.com>
Signed-off-by: Aydin Abiar <aydin@anyscale.com>
Future-Outlier pushed a commit to Future-Outlier/ray that referenced this pull request Dec 7, 2025
)

<!-- Thank you for your contribution! Please review
https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before
opening a pull request. -->

<!-- Please add a reviewer to the assignee section when you create a PR.
If you don't have the access to it, we will shortly find a reviewer and
assign them to your PR. -->

## Why are these changes needed?

Limit operator optimization. When the limit operator is set to K, ensure
to limit the number of rows read per block and bound it by K.

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [x] I've signed off every commit(by using the -s flag, i.e., `git
commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
           corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [x] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: Goutam V <goutam@anyscale.com>
Signed-off-by: Goutam V. <goutam@anyscale.com>
Signed-off-by: Future-Outlier <eric901201@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

data Ray Data-related issues go add ONLY when ready to merge, run all tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants