
[data] New executor backend [3/n]--- Add basic operators impl #31305

Merged: 21 commits merged into ray-project:master on Jan 4, 2023

Conversation

ericl
Contributor

@ericl ericl commented Dec 23, 2022

Why are these changes needed?

Add the initial operator implementations.

This is split out from #30903

TODO:

  • Add unit test for AllToAllOp
  • Add unit test for MapOp
  • Add unit test for InputDataBuffer

Signed-off-by: Eric Liang <ekhliang@gmail.com>
@ericl ericl changed the title [WIP] [data] New executor backend [3/n]--- Add basic operators impl [data] New executor backend [3/n]--- Add basic operators impl Dec 23, 2022
assert _take_outputs(op) == [[i] for i in range(10)]


def test_map_operator_ray_args(shutdown_only):
Contributor Author

Debating whether it's worth it to mock out the Ray API here to speed up these tests a bit. Maybe it's not that important since the bulk of the testing will be for StreamingExecutor, which we can write separate mocks for.
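A minimal sketch of what such a mock might look like (the names here are hypothetical, not Ray's actual API or test utilities): a synchronous stand-in for `ray.remote` that executes the function eagerly and returns a fake object ref, so operator unit tests could run without starting a Ray cluster.

```python
# Hypothetical sketch: a synchronous stand-in for ray.remote so that
# operator unit tests can avoid starting a real Ray cluster.
class FakeObjectRef:
    def __init__(self, value):
        self.value = value

def fake_remote(fn):
    # Mimics the ray.remote decorator shape: returns an object with a
    # .remote(...) method, but executes eagerly instead of scheduling a task.
    class Wrapper:
        def remote(self, *args, **kwargs):
            return FakeObjectRef(fn(*args, **kwargs))
    return Wrapper()

double = fake_remote(lambda block: [x * 2 for x in block])
ref = double.remote([1, 2, 3])
print(ref.value)  # [2, 4, 6]
```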


Supported strategies: {TaskPoolStrategy, ActorPoolStrategy}.
"""
return self._strategy
Contributor

Hmm, I wonder if we can keep the implementation details of the compute strategy, Ray remote args, etc. outside of the operators? It could be cleaner if we passed in the ray.remote Callable instead of the worker's Callable as the transform_fn, but I'm not sure whether this will work, so I'll leave it up to you.

Contributor Author

Yeah, I have a TODO on line 78 to clean this up in the future. I'm hoping the ComputeStrategy can turn into a simple dataclass once we migrate fully to the new backend. Right now, I avoided doing this refactoring to keep the changes self-contained.

About the callable, I think that's possible but it's probably also easier to do once we have the logical optimization layer in place (the optimizer could generate the ray.remote callable).
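The two designs under discussion can be sketched roughly as follows (class names and signatures are illustrative simplifications, not the actual Ray Data code): in (a) the operator owns the compute details and wraps the plain function itself; in (b) an upstream layer such as the logical optimizer hands the operator an already-submittable callable.

```python
# Illustrative sketch (hypothetical names, not Ray Data's API surface).
from typing import Callable, List

def transform_fn(block: List[int]) -> List[int]:
    # Plain worker-side transform: double every row in the block.
    return [row * 2 for row in block]

# (a) Operator-side wrapping: the operator applies the compute strategy
# (e.g. ray.remote wrapping) to the plain transform_fn internally.
class MapOperatorA:
    def __init__(self, fn: Callable, wrap: Callable = lambda f: f):
        self._fn = wrap(fn)  # compute details live inside the operator
    def run(self, block):
        return self._fn(block)

# (b) Pre-wrapped callable: compute details stay outside the operator,
# which just invokes whatever submittable it was handed.
class MapOperatorB:
    def __init__(self, submittable: Callable):
        self._fn = submittable
    def run(self, block):
        return self._fn(block)

print(MapOperatorA(transform_fn).run([1, 2, 3]))  # [2, 4, 6]
print(MapOperatorB(transform_fn).run([1, 2, 3]))  # [2, 4, 6]
```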

Signed-off-by: Eric Liang <ekhliang@gmail.com>
Contributor

@stephanie-wang stephanie-wang left a comment

Looks great!

@ericl
Contributor Author

ericl commented Jan 3, 2023

I'll hold this open until EOD for more comments.

input_op: Operator generating input data for this op.
name: The name of this operator.
compute_strategy: Customize the compute strategy for this op.
min_rows_per_batch: The number of rows to gather per batch passed to the
Contributor

Should we name it min_rows_per_fn_call? "Batch" is kind of confusing here, as this is neither the user-facing batch in map_batches nor the zero-copy batch execution we shall introduce later.

Contributor Author

Batch seems clearer to me: it basically is the same as the user facing batch size.

Contributor

I wonder if we should keep using "target_row_per_batch", since there is no guarantee of a "min" here. And we should clarify that it's possible the target is not met when there are not enough rows.

Contributor Author

I don't think so, the previous naming was very confusing for me. The new one is clear in intent.

Contributor

@ericl But that intent is incorrect: this is a target to get near to, not a minimum/floor. We add blocks to a bundle up to this target size, but we purposefully do not exceed it, so this is definitely not a minimum.

Contributor Author

Alright, let me rename this to min_rows_per_bundle then. I don't think it's possible to be precisely unambiguous, and would prefer we keep the "min" intent which is the big picture.
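The semantics under debate can be sketched like this (an illustrative simplification, not the actual implementation): blocks accumulate until the row count reaches the threshold, and a trailing bundle may fall short of it when the input runs out of rows.

```python
# Illustrative sketch of min_rows_per_bundle semantics (not the actual
# Ray Data code): buffer blocks until the accumulated row count reaches
# the threshold, then emit the bundle. The value acts as a dispatch
# trigger, so the final bundle can come in under the target.
from typing import List

def bundle_blocks(
    blocks: List[List[int]], min_rows_per_bundle: int
) -> List[List[List[int]]]:
    bundles, current, rows = [], [], 0
    for block in blocks:
        current.append(block)
        rows += len(block)
        if rows >= min_rows_per_bundle:
            bundles.append(current)
            current, rows = [], 0
    if current:
        # A trailing bundle may fall short of the target when the
        # input is exhausted -- the source of the naming debate above.
        bundles.append(current)
    return bundles

print(bundle_blocks([[1], [2, 3], [4], [5]], min_rows_per_bundle=2))
# [[[1], [2, 3]], [[4], [5]]]
```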

Signed-off-by: Eric Liang <ekhliang@gmail.com>

Signed-off-by: Eric Liang <ekhliang@gmail.com>
Contributor

@clarkzinzow clarkzinzow left a comment


Mostly nits, the only potential blocker in my mind is the question around the block bundling logic: it appears to be dropping empty blocks, which I don't think is the current Datasets behavior.

ericl and others added 8 commits January 3, 2023 17:25
Signed-off-by: Eric Liang <ekhliang@gmail.com>
Co-authored-by: Clark Zinzow <clarkzinzow@gmail.com>
Signed-off-by: Eric Liang <ekhliang@gmail.com>
Signed-off-by: Eric Liang <ekhliang@gmail.com>
Signed-off-by: Eric Liang <ekhliang@gmail.com>
Signed-off-by: Eric Liang <ekhliang@gmail.com>
self._obj_store_mem_peak: int = 0

def add_input(self, bundle: RefBundle) -> None:
if self._min_rows_per_bundle is None:
Contributor Author

I ended up putting this back, in order to enable empty block propagation.
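A rough sketch of the restored pass-through behavior (a hypothetical simplification of add_input, not the actual code): when no bundling target is set, each incoming bundle, including an empty one, is forwarded unchanged rather than being buffered or dropped.

```python
# Hypothetical simplification: with no bundling threshold configured,
# bundles pass straight through, so empty blocks are propagated
# downstream instead of being silently dropped.
def add_input(outputs, bundle, min_rows_per_bundle=None):
    if min_rows_per_bundle is None:
        outputs.append(bundle)  # pass through as-is, empty or not
        return
    # ...otherwise the bundling path would buffer rows up to the target.

outputs = []
add_input(outputs, [])        # empty block survives
add_input(outputs, [1, 2])
print(outputs)  # [[], [1, 2]]
```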

Contributor Author

@ericl ericl left a comment

Updated; the main change was removing the circular dependency between the operator impl and the wrapper operator.

Signed-off-by: Eric Liang <ekhliang@gmail.com>
@ericl ericl merged commit 4195de1 into ray-project:master Jan 4, 2023
AmeerHajAli pushed a commit that referenced this pull request Jan 12, 2023
Add the initial operator implementations.

This is split out from #30903
tamohannes pushed a commit to ju2ez/ray that referenced this pull request Jan 25, 2023
(ray-project#31305)

Add the initial operator implementations.

This is split out from ray-project#30903

Signed-off-by: tmynn <hovhannes.tamoyan@gmail.com>
5 participants