Fix as_tensor not keeping the parent alive in Python #60
Conversation
Signed-off-by: ptredak <ptredak@nvidia.com>
Fix for segfault in #9
dali/pipeline/data/tensor_list.h
Outdated
@@ -266,6 +282,16 @@ class DLL_PUBLIC TensorList : public Buffer<Backend> {
    return true;
  }

  Tensor<Backend> * AsTensor() {
Please add docs to this method.
Sure.
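For example, the documentation could look roughly like this. This is only a sketch of the kind of comment being requested, not the wording that was actually added:

/**
 * @brief Returns a Tensor which shares the data of this TensorList and
 * views it as a single contiguous block of memory.
 *
 * The returned pointer is owned by the TensorList; it remains valid only
 * while the TensorList's underlying allocation is unchanged.
 */
DLL_PUBLIC Tensor<Backend> *AsTensor();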
dali/pipeline/data/tensor_list.h
Outdated
 * return 'true'.
 */
DLL_PUBLIC inline void ShareData(TensorList<Backend> *other) {
  DALI_ENFORCE(other != nullptr, "Input TensorList is nullptr");
  DALI_ENFORCE(IsValidType(other->type_), "To share data, "
      "the input TensorList must have a valid data type");

  // Tensor view of this TensorList is no longer valid
  delete tensor_view_;
  tensor_view_ = nullptr;
I don't know if Python will handle this. What if we call AsTensor and return the pointer, and then call ShareData? The pointer we just returned to Python is no longer valid, but Python can still reference it. Or am I wrong?
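A minimal sketch of that scenario, to make the hazard concrete. The calls mirror the diff above, but the setup code and exact signatures are approximations, not a test from this PR:

// Illustrative only - approximate DALI calls.
TensorList<CPUBackend> tl;
tl.Resize({{2, 2}});
tl.mutable_data<float>();                  // allocate and set a valid type

Tensor<CPUBackend> *view = tl.AsTensor();  // raw pointer handed out, e.g. to Python

TensorList<CPUBackend> other;
other.Resize({{2, 2}});
other.mutable_data<float>();

tl.ShareData(&other);                      // old code path: delete tensor_view_
// `view` now dangles, but the Python object wrapping it can still be used.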
This is a much longer discussion - in order to handle this we would need to store information about all the objects sharing data with a particular tensor or tensorlist and notify them that the data is no longer valid. This would potentially have performance implications.
Oh, I see what you mean. I was referring to the other problem with sharing data. Hmmm, instead of deleting the pointer I guess I could keep it but invoke ShareData again after the internal pointer to data_ changes.
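Roughly, that would turn the deletion above into a re-share. A sketch of the idea, assuming the Tensor view exposes a ShareData overload that can re-bind it to its parent TensorList; the exact call is an assumption, not the final patch:

DLL_PUBLIC inline void ShareData(TensorList<Backend> *other) {
  DALI_ENFORCE(other != nullptr, "Input TensorList is nullptr");
  DALI_ENFORCE(IsValidType(other->type_), "To share data, "
      "the input TensorList must have a valid data type");

  // ... take over other's buffer, shape and type as before ...

  // Keep the tensor view object alive and re-point it at the new buffer,
  // so raw pointers handed out earlier (e.g. to Python) stay valid.
  if (tensor_view_ != nullptr) {
    tensor_view_->ShareData(this);
  }
}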
That is one solution to our problem. Regarding keeping cross-references valid, I see the following approaches:
- an object that shares out its own data keeps a list of references and, when the data changes, notifies the other objects and invalidates their state (sketched below)
- an object that shares another object's data checks its validity before each use
- some kind of RCU mechanism - then all data stays valid
The first and second require a lot of synchronization, which could hurt performance.
Another option is to leave it as a design assumption - like iterator invalidation when the data we are iterating over changes.
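For illustration only, a minimal sketch of the first approach (owner notifies and invalidates dependent views on change); every name here is hypothetical, not a proposed DALI API:

#include <stdexcept>
#include <vector>

// Hypothetical view that can be invalidated by its data owner.
class SharedView {
 public:
  void Invalidate() { valid_ = false; }
  void Use() const {
    if (!valid_) throw std::logic_error("View invalidated: parent data changed");
    // ... access the shared buffer here ...
  }
 private:
  bool valid_ = true;
};

// Hypothetical owner that tracks the views sharing its data.
class DataOwner {
 public:
  void Register(SharedView *view) { views_.push_back(view); }
  void ShareData(/* new data source */) {
    // ... swap the underlying buffer ...
    for (SharedView *v : views_) v->Invalidate();  // notify all dependents
  }
 private:
  std::vector<SharedView *> views_;
};

Keeping this list consistent across threads is exactly the synchronization cost mentioned above.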
Signed-off-by: ptredak <ptredak@nvidia.com>
update its data instead. Signed-off-by: ptredak <ptredak@nvidia.com>
How about thread safety? What if the TensorList calls ShareData while another thread performs an operation on the Tensor that shares its view?
Signed-off-by: ptredak <ptredak@nvidia.com>
I thought about this, but it would be an error in the user code if that happened - it is undefined behavior either way.
LGTM then