Add ready event to Tensor and TensorList. #5673
- move SharedEventLease to core
- add more complete shared_ptr interface to SharedEventLease
- add tests for SharedEventLease
- add (set_)ready_event to Tensor and TensorList
- minor refactoring in TensorList
- remove OperatorIO::event in favor of TensorList's ready_event in exec2

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
CI MESSAGE: [19219483]: BUILD STARTED
CI MESSAGE: [19219483]: BUILD FAILED
CI MESSAGE: [19220893]: BUILD STARTED
CI MESSAGE: [19220893]: BUILD PASSED
CI MESSAGE: [19248455]: BUILD STARTED
CI MESSAGE: [19248455]: BUILD PASSED
CI MESSAGE: [19327083]: BUILD STARTED
if (!ptr->shares_data()) {
if (AccessOrder consumer_order = OutputConsumerStream(o))
ptr->set_order(consumer_order, false);
if (ptr->is_pinned()) {
This fixes a bug: previously we would set the consumer order for non-pinned host buffers. This PR adds an assert that verifies that such buffers are always in host order.
// Hack: use shared_ptr<void> to store a CUDA event - shared_ptr doesn't care whether the pointer
// it manages is a real pointer or something else as long as:
// - null value is equivalent to nullptr
// - the provided deleter can free the object.
Just to make sure: is this guaranteed by the standard, or is it an implementation detail of shared_ptr?
Let's put it differently: we've already been using it this way for 6 years. We use shared_ptr to manage device memory, which is also non-dereferenceable on host, so it's effectively an opaque handle.
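For readers following this thread, the opaque-handle pattern described above can be sketched as follows. This is a minimal illustration, not the PR's actual SharedEventLease code: `FakeEventHandle`, `DestroyFakeEvent`, `MakeSharedHandle`, `GetHandle` and the `g_destroyed` counter are all hypothetical names, and an integer stands in for a CUDA event so the sketch compiles without the CUDA toolkit. It shows the two conditions from the code comment: a null handle maps to an empty shared_ptr, and the custom deleter is the only code that interprets the stored value. It does not by itself settle the question about what the standard guarantees.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical stand-in for an opaque handle type such as a CUDA event.
// In this sketch it is just an integer id and is never dereferenced.
using FakeEventHandle = std::uintptr_t;

// Counts released handles so the ownership behavior can be observed.
static int g_destroyed = 0;

// Hypothetical release function, playing the role of a real destroy call.
inline void DestroyFakeEvent(FakeEventHandle h) {
  (void)h;
  ++g_destroyed;
}

// Wraps a handle in shared_ptr<void>. The handle value is stored as the
// "pointer"; the deleter converts it back and releases it.
inline std::shared_ptr<void> MakeSharedHandle(FakeEventHandle h) {
  if (h == 0)
    return nullptr;  // a null handle maps to an empty shared_ptr
  return std::shared_ptr<void>(
      reinterpret_cast<void *>(h),
      [](void *p) { DestroyFakeEvent(reinterpret_cast<FakeEventHandle>(p)); });
}

// Recovers the handle value; the stored pointer is only converted, never
// dereferenced.
inline FakeEventHandle GetHandle(const std::shared_ptr<void> &sp) {
  return reinterpret_cast<FakeEventHandle>(sp.get());
}
```

Copies made with the usual shared_ptr copy semantics share ownership of the handle, and the deleter runs once when the last copy is destroyed or reset.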
CI MESSAGE: [19327083]: BUILD PASSED
Category:
New feature (non-breaking change which adds functionality)
Refactoring (Redesign of existing code that doesn't affect functionality)
Description:
Preliminary work for (cleaner) DLPack support.
DLPack needs to synchronize a stream so that the tensor is ready for use in that stream.
We can't use stream-to-stream synchronization because of prefetching. This PR adds ready_event to Tensor and TensorList to address that.
Additional information:
Affected modules and functionalities:
Executor2
Tensor
TensorList
Key points relevant for the review:
Tests:
Checklist
Documentation
DALI team only
Requirements
REQ IDs: N/A
JIRA TASK: N/A