-
Notifications
You must be signed in to change notification settings - Fork 5.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Data] Remove in_blocks
parameter of ExecutionPlan
#45860
[Data] Remove in_blocks
parameter of ExecutionPlan
#45860
Conversation
Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Co-authored-by: Scott Lee <scottjlee@users.noreply.github.com> Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>
Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
# "+- MapBatches(<lambda>)\n" | ||
# "+- Union\n" | ||
# " +- Dataset(num_rows=9, schema={id: int64})" | ||
# ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will this be fixed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'll address in a follow-up PR next week.
The current logic aggregates the metadata (i.e., num_rows and schema) from all the union operators; I think I'll also be able to improve this so that it lists each input dataset separately.
datasets = [self] + list(other) | ||
has_nonlazy = any(not ds._plan.is_read_only() for ds in datasets) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice getting rid of this logic.
Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
After #45860, LazyBlockList is dead code. Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
After ray-project#45860, LazyBlockList is dead code. Signed-off-by: Balaji Veeramani <balaji@anyscale.com>
…5860) These changes are part of a larger effort to remove LazyBlockList. Currently, when you create an ExecutionPlan, you need to pass in a BlockList. A BlockList can either be a regular BlockList which represents in-memory data or a LazyBlockList which represents ReadTasks. The issue is that there's no way to represent lazy input data that doesn't use ReadTasks. To address this, this PR removes the in_blocks parameter of ExecutionPlan. It follows ray-project#45852. Signed-off-by: Balaji Veeramani <balaji@anyscale.com> Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu> Co-authored-by: Scott Lee <scottjlee@users.noreply.github.com> Signed-off-by: Saihajpreet Singh <c-saihajpreet.singh@anyscale.com>
…5860) These changes are part of a larger effort to remove LazyBlockList. Currently, when you create an ExecutionPlan, you need to pass in a BlockList. A BlockList can either be a regular BlockList which represents in-memory data or a LazyBlockList which represents ReadTasks. The issue is that there's no way to represent lazy input data that doesn't use ReadTasks. To address this, this PR removes the in_blocks parameter of ExecutionPlan. It follows ray-project#45852. Signed-off-by: Balaji Veeramani <balaji@anyscale.com> Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu> Co-authored-by: Scott Lee <scottjlee@users.noreply.github.com> Signed-off-by: 982945902 <982945902@qq.com>
bveeramani/ray@change-execute-return...bveeramani:ray:remove-blocks-parameter
Stacked on
ExecutionPlan.execute
returnRefBundle
instead ofBlockList
#45852count()
andschema()
to not rely onLazyBlockList
#45489Why are these changes needed?
These changes are part of a larger effort to remove
LazyBlockList
.Currently, when you create an
ExecutionPlan
, you need to pass in aBlockList
. ABlockList
can either be a regularBlockList
which represents in-memory data or aLazyBlockList
which representsReadTasks
. The issue is that there's no way to represent lazy input data that doesn't useReadTasks
.To address this, this PR removes the
in_blocks
parameter ofExecutionPlan
. It follows #45852.Related issue number
Checks
git commit -s
) in this PR.scripts/format.sh
to lint the changes in this PR.method in Tune, I've added it in
doc/source/tune/api/
under thecorresponding
.rst
file.