-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make write an operator as part of the execution plan #32015
Merged
Merged
Changes from all commits
Commits
Show all changes
51 commits
Select commit
Hold shift + click to select a range
edc51bd
Fix read_tfrecords_benchmark nightly test
jianoaix 61f4d6d
Merge branch 'master' of https://github.com/ray-project/ray
jianoaix a33a943
Merge branch 'master' of https://github.com/ray-project/ray
jianoaix 36ebe52
Merge branch 'master' of https://github.com/ray-project/ray
jianoaix ce6763e
Merge branch 'master' of https://github.com/ray-project/ray
jianoaix 0e2c29e
Merge branch 'master' of https://github.com/ray-project/ray
jianoaix f2b6ed0
Merge branch 'master' of https://github.com/ray-project/ray
jianoaix bb6c5c4
Merge branch 'master' of https://github.com/ray-project/ray
jianoaix 540fe79
Merge branch 'master' of https://github.com/ray-project/ray
jianoaix edad7d0
Merge branch 'master' of https://github.com/ray-project/ray
jianoaix 60cc079
Merge branch 'master' of https://github.com/ray-project/ray
jianoaix a3d3980
Merge branch 'master' of https://github.com/ray-project/ray
jianoaix 001579c
Merge branch 'master' of https://github.com/ray-project/ray
jianoaix 8aeed6c
Merge branch 'master' of https://github.com/ray-project/ray
jianoaix 7a9a49b
Merge branch 'master' of https://github.com/ray-project/ray
jianoaix ef97167
Merge branch 'master' of https://github.com/ray-project/ray
jianoaix 6f0563c
Merge branch 'master' of https://github.com/ray-project/ray
jianoaix bcec4d6
Merge branch 'master' of https://github.com/ray-project/ray
jianoaix ddef4e5
Merge branch 'master' of https://github.com/ray-project/ray
jianoaix fc9a175
Merge branch 'master' of https://github.com/ray-project/ray
jianoaix f0e90b7
Merge branch 'master' of https://github.com/ray-project/ray
jianoaix 0c820ea
Merge branch 'master' of https://github.com/ray-project/ray into writ…
jianoaix 253da6a
Make write an operator as part of the execution plan
jianoaix 514ec14
fix / fix do_write
jianoaix 204ea5f
Merge branch 'master' of https://github.com/ray-project/ray into writ…
jianoaix b75bf49
Fix the merge
jianoaix 6a28257
fix arg passing
jianoaix 3082e52
lint
jianoaix 5ddfdc0
Reconcile taskcontext
jianoaix 3843ff4
Reconcile taskcontext continued
jianoaix 012a413
Merge branch 'master' of https://github.com/ray-project/ray into writ…
jianoaix 84a74f0
Use task context in write op
jianoaix bb2a474
fix test
jianoaix ad5f7c7
feedback: backward compatibility
jianoaix a77053a
fix
jianoaix 554171a
test write fusion
jianoaix 1ba1b9f
Result of write operator; datasource callbacks
jianoaix 5ced246
Handle an empty list on failure
jianoaix 43eca29
execute the plan in-place in write_datasource
jianoaix f25d54b
Keep write_datasource semantics diff-neutral regarding the plan
jianoaix c5ddf07
Merge branch 'master' of https://github.com/ray-project/ray into writ…
jianoaix 1d58e13
disable the write_XX in new optimizer: it's not supported yet
jianoaix d309dbd
fix comment
jianoaix 21a50db
refactor: do_write() calls direct_write() to reduce code duplication
jianoaix a84e27b
refactor: for mongo datasource do_write
jianoaix 8879df0
backward compatible
jianoaix d6873e1
rename: direct_write -> write
jianoaix 10ef980
unnecessary test removed
jianoaix 48e9415
fix
jianoaix 87dc925
deprecation message/logging
jianoaix b77ca8d
deprecation logging
jianoaix File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't the same semantics right? on_write_complete/failed should be called on the driver only, not once per block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is changed to per block/task. I'm not sure how to properly support this with write becoming an operator that can be fused with prior operators.
Do these callbacks have to be on the driver? It looks they are not quite used yet, so unclear the use case. But as callbacks for each write task seems a fine semantics?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'd need these to implement commit semantics / rollback changes on failure, for example. They are part of the DataSource public API, so we shouldn't change them without a separate API discussion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel it's too rush for 2.3 release if we need to change
Datasource
API at this moment, and agree that separate discussion needs to happen if we changeDatasource
API here. How about we taking a non-intrusive approach to not changeDatasource
API at all?How about we are doing this:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is no issue blocked by not having write operator/fusion at the moment, so I'd fix it properly instead of having a half baked feature (2.3 is not the ultimate cut to catch anyway).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the status Dataset approach still works right? Errors can get raised as usual, so retries will work. In the write function here we can add a try catch around the entire Dataset execution operation, and call the right data source callbacks if the entire thing fails / succeeds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, do you mean it's hard to provide List[ObjectRef[WriteResult]] to the error callback? I think we can probably deprecate this part of the API for the new backend (pass empty list). Afaik it's not really useful and would be best effort in any scenario for a distributed system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, we will not know the resulting status of each individual tasks, which I think may be used for error handling (rollback etc.) in the callback function. If we just try catch the entire execution, and just produce an empty list (as the result of executing write operator), then yes it should be able to work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about we go for that then? It seems the simplest approach, and we can also fix the new API to not have the list of error refs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, I'll update the PR.