Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix as_tensor not keeping the parent alive in Python #60

Merged
merged 4 commits into from
Jul 20, 2018

Conversation

ptrendx
Copy link
Member

@ptrendx ptrendx commented Jul 19, 2018

No description provided.

Signed-off-by: ptredak <ptredak@nvidia.com>
@ptrendx ptrendx requested a review from JanuszL July 19, 2018 17:54
@ptrendx
Copy link
Member Author

ptrendx commented Jul 19, 2018

Fix for segfault in #9

@@ -266,6 +282,16 @@ class DLL_PUBLIC TensorList : public Buffer<Backend> {
return true;
}

Tensor<Backend> * AsTensor() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add docs to this method.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

* return 'true'.
*/
DLL_PUBLIC inline void ShareData(TensorList<Backend> *other) {
DALI_ENFORCE(other != nullptr, "Input TensorList is nullptr");
DALI_ENFORCE(IsValidType(other->type_), "To share data, "
"the input TensorList must have a valid data type");

// Tensor view of this TensorList is no longer valid
delete tensor_view_;
tensor_view_ = nullptr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if python will handle this, what if we call AsTensor, return pointer, call ShareData, then pointer we just returned to python is no longer valid but python still can reference it, or I'm wrong?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a much longer discussion - in order to handle this we would need to store information about all the objects sharing data with a particular tensor or tensorlist and notify them that the data is no longer valid. This would potentially have performance implications.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see what you mean. I was referring to the other problem with sharing data. Hmmm, instead of deleting the pointer I guess I could keep it but invoke ShareData again after the internal pointer to data_ changes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is some solution to our problem.
Regarding cores reference I following approaches

  • an object which shares its own data could keep a list of reference and when change notify and invalidate the state of other objects
  • an object that shares data of other object checks its validity
  • some kind of RCU mechanism - then we will keep all data valid
    First and second requires a lot of synchronization that could hit performance.
    Another solution is to leave it as design assumption - like iterator invalidation when data we are iterating over changes.

@JanuszL JanuszL added this to the Release_0.1.1+ milestone Jul 19, 2018
ptrendx added 2 commits July 19, 2018 15:43
Signed-off-by: ptredak <ptredak@nvidia.com>
update its data instead.

Signed-off-by: ptredak <ptredak@nvidia.com>
Copy link
Contributor

@JanuszL JanuszL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about thread safety? If TensorList will call ShareData when other thread performs operation on Tensor that shares its view?

Signed-off-by: ptredak <ptredak@nvidia.com>
@ptrendx
Copy link
Member Author

ptrendx commented Jul 20, 2018

I thought about this but that would be an error in the user code if this happened - this is undefined behavior either way.

Copy link
Contributor

@JanuszL JanuszL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM then

@ptrendx ptrendx merged commit eb6c86f into NVIDIA:master Jul 20, 2018
@ptrendx ptrendx deleted the fix_as_tensor branch August 7, 2018 19:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants