-
Notifications
You must be signed in to change notification settings - Fork 7k
[data] Use ranges in test_operator_fusion.py
#58000
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[data] Use ranges in test_operator_fusion.py
#58000
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request refactors two tests in test_operator_fusion.py to use ray.data.range instead of ray.data.read_parquet. This is a great improvement as it makes the tests more robust and less brittle by removing the dependency on file I/O. The changes are correct and align with the goal of improving test quality. I've added a couple of minor suggestions to improve variable naming for better readability. Additionally, as a minor follow-up, the temp_dir fixture is now unused in test_read_with_map_batches_fused_successfully and test_map_batches_with_batch_size_specified_fusion and could be removed from their signatures for better code clarity.
| n = 10 | ||
| ds = ray.data.range(n) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| n = 10 | ||
| ds = ray.data.range(n) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| n = 10 | ||
| ds = ray.data.range(n) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Minimally passing tests is a good practice because it can make the intent of the test more clear.
| n = 10 | |
| ds = ray.data.range(n) | |
| ds = ray.data.range(1) |
## Description We are using `read_parquet` in two of our tests in `test_operator_fusion.py`, this switches those to use `range` to make the tests less brittle. Signed-off-by: Matthew Owen <mowen@anyscale.com> Signed-off-by: xgui <xgui@anyscale.com>
## Description We are using `read_parquet` in two of our tests in `test_operator_fusion.py`, this switches those to use `range` to make the tests less brittle. Signed-off-by: Matthew Owen <mowen@anyscale.com>
## Description We are using `read_parquet` in two of our tests in `test_operator_fusion.py`, this switches those to use `range` to make the tests less brittle. Signed-off-by: Matthew Owen <mowen@anyscale.com> Signed-off-by: Aydin Abiar <aydin@anyscale.com>
## Description We are using `read_parquet` in two of our tests in `test_operator_fusion.py`, this switches those to use `range` to make the tests less brittle. Signed-off-by: Matthew Owen <mowen@anyscale.com> Signed-off-by: Future-Outlier <eric901201@gmail.com>
Description
We are using
read_parquetin two of our tests intest_operator_fusion.py, this switches those to userangeto make the tests less brittle.Related issues
Additional information