-
Notifications
You must be signed in to change notification settings - Fork 7k
[Data] [2/n] - Update Project operator to use Expressions #57855
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Data] [2/n] - Update Project operator to use Expressions #57855
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request is a significant and positive refactoring that updates the Project operator to use a more general and powerful expression-based API. This simplifies the logical plan and enables more robust optimizations, as shown by the rewrite of the projection pushdown rule. The addition of comprehensive tests for the new expression logic is also a great improvement. I've found one critical issue with the count() implementation and a couple of medium-severity suggestions for performance and code style.
python/ray/data/_internal/planner/plan_expression/expression_evaluator.py
Outdated
Show resolved
Hide resolved
b8bc299 to
467dcdb
Compare
python/ray/data/_internal/planner/plan_expression/expression_evaluator.py
Outdated
Show resolved
Hide resolved
467dcdb to
b9b1905
Compare
2b6da29 to
d9ef6e1
Compare
dab9ffa to
8f06393
Compare
python/ray/data/_internal/planner/plan_expression/expression_visitors.py
Outdated
Show resolved
Hide resolved
python/ray/data/_internal/planner/plan_expression/expression_visitors.py
Outdated
Show resolved
Hide resolved
1d995c9 to
4f9a2cc
Compare
4f9a2cc to
1cd247b
Compare
1cd247b to
dba8fc8
Compare
Signed-off-by: Goutam <goutam@anyscale.com>
dba8fc8 to
22ddef1
Compare
[Data] [2/n] - Update Project operator to use Expressions (ray-project#57855)
…t#57855) > Thank you for contributing to Ray! 🚀 > Please review the [Ray Contribution Guide](https://docs.ray.io/en/master/ray-contribute/getting-involved.html) before opening a pull request. ## Description 1. Add visitors for collecting column names from all expressions and renaming names across the tree. 2. Use expressions for rename_columns, with_column, select_columns and remove cols and cols_rename in Project 3. Modify Projection Pushdown to work with combinations of the above operators correctly ## Related issues Closes ray-project#56878, ray-project#57700 ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. Signed-off-by: Goutam <goutam@anyscale.com>
…t#57855) > Thank you for contributing to Ray! 🚀 > Please review the [Ray Contribution Guide](https://docs.ray.io/en/master/ray-contribute/getting-involved.html) before opening a pull request. ## Description 1. Add visitors for collecting column names from all expressions and renaming names across the tree. 2. Use expressions for rename_columns, with_column, select_columns and remove cols and cols_rename in Project 3. Modify Projection Pushdown to work with combinations of the above operators correctly ## Related issues Closes ray-project#56878, ray-project#57700 ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. Signed-off-by: Goutam <goutam@anyscale.com> Signed-off-by: Aydin Abiar <aydin@anyscale.com>
Description
Related issues
Closes #56878, #57700
Additional information