
Avoid one unnecessary memory allocation in XNNPACK integration. #35350

Closed

@AshkanAliabadi (Contributor) commented Mar 25, 2020

Stack from ghstack:

Currently we call input.contiguous() on the input tensor, resulting in an unnecessary allocation and copy in cases where the input is not contiguous with regard to the requested memory format. The reason is that, in such scenarios, this call re-allocates and copies the input tensor into contiguous storage, only for this newly allocated tensor to be used as the source of another copy to the final destination. Instead, if we copy into the destination directly in such circumstances, we save an allocation and a copy.

Differential Revision: D20656798

[ghstack-poisoned]
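To make the pattern concrete, here is a minimal Python sketch of the before/after behavior described above. The actual change lives in the C++ XNNPACK integration code, and the helper names below are made up for this illustration; only the allocation/copy pattern is the point.

```python
# Illustration only: the real change is in the C++ XNNPACK integration.
# The helper names (pad_input_via_contiguous / pad_input_direct) are
# hypothetical and express the same pattern with the Python tensor API.

import torch

def pad_input_via_contiguous(src, memory_format=torch.channels_last):
    # Before: materialize a contiguous temporary first (extra allocation +
    # copy whenever `src` is not already contiguous in `memory_format`) ...
    tmp = src.contiguous(memory_format=memory_format)
    # ... then copy that temporary into the destination buffer.
    dst = torch.empty_like(tmp, memory_format=memory_format)
    dst.copy_(tmp)  # second copy
    return dst

def pad_input_direct(src, memory_format=torch.channels_last):
    # After: allocate the destination and copy straight from the original
    # input; copy_() performs the layout conversion itself, so the temporary
    # tensor (one allocation and one copy) is avoided.
    dst = torch.empty_like(src, memory_format=memory_format)
    dst.copy_(src)
    return dst
```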
AshkanAliabadi pushed a commit that referenced this pull request Mar 25, 2020

ghstack-source-id: 1c498a6e2287be38c252642a50f310676f258998
Pull Request resolved: #35350

dr-ci bot commented Mar 25, 2020

💊 CircleCI build failures summary and remediations

As of commit ac58495 (more details on the Dr. CI page):


None of the build failures appear to be your fault 💚


  • 1/1 broken upstream at merge base 2a4ca70 since Apr 02

    Please rebase on the viable/strict branch:

    If your commit is newer than viable/strict, you can try basing on an older, stable commit:

    git fetch https://github.com/pytorch/pytorch viable/strict
    git rebase --onto FETCH_HEAD $(git merge-base origin/master HEAD)
    

    If your commit is older than viable/strict:

    git fetch https://github.com/pytorch/pytorch viable/strict
    git rebase FETCH_HEAD
    

    Check out the recency history of this "viable master" tracking branch.


🚧 1 upstream failure:

These were probably caused by upstream breakages:


This comment was automatically generated by Dr. CI. Follow this link to opt out of these comments for your Pull Requests.

Please report bugs/suggestions on the GitHub issue tracker.

See how this bot performed.

This comment has been revised 30 times.

@AshkanAliabadi (Contributor, Author) commented:

Do we have any tests that I can use to validate that the results are correct? I'm not sure this code is exercised through CI.

@kimishpatel (Contributor) commented:

Do we have any tests that I can use to validate that the results are correct? I'm not sure this code is exercised through CI.

You can try python test/test_xnnpack_integration.py. I suggest adding, to one or two tests, different memory formats for the inputs, including channels-first and channels-last.

@kimishpatel (Contributor) left a comment:

LG for the most part. Can you add the test I mentioned in the comment or try the existing tests?
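A minimal sketch of the kind of check kimishpatel suggests, assuming a PyTorch build with XNNPACK enabled and the prepacked::conv2d_clamp_prepack / conv2d_clamp_run ops it registers; the shapes and tolerances are illustrative, and the real tests in test/test_xnnpack_integration.py are structured differently.

```python
import torch
import torch.nn.functional as F

def check_conv2d_memory_formats():
    # Small NCHW input with a matching conv weight/bias (illustrative shapes).
    x = torch.rand(1, 8, 16, 16)
    weight = torch.rand(16, 8, 3, 3)
    bias = torch.rand(16)
    stride, padding, dilation, groups = [1, 1], [1, 1], [1, 1], 1

    # Reference result from the regular eager-mode convolution.
    reference = F.conv2d(x, weight, bias, stride, padding, dilation, groups)

    # Prepack once, then run the XNNPACK-backed op on inputs in different
    # memory formats; every output should match the reference.
    packed = torch.ops.prepacked.conv2d_clamp_prepack(
        weight, bias, stride, padding, dilation, groups)
    for memory_format in (torch.contiguous_format, torch.channels_last):
        formatted_input = x.contiguous(memory_format=memory_format)
        output = torch.ops.prepacked.conv2d_clamp_run(formatted_input, packed)
        torch.testing.assert_allclose(output, reference, rtol=1e-3, atol=1e-4)

check_conv2d_memory_formats()
```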

Ashkan Aliabadi added 5 commits March 25, 2020 12:48
Ashkan Aliabadi added 2 commits April 1, 2020 13:51
Ashkan Aliabadi added 2 commits April 2, 2020 15:31
@facebook-github-bot (Contributor) commented:

@AshkanAliabadi merged this pull request in d0ce94d.

@facebook-github-bot deleted the gh/AshkanAliabadi/11/head branch April 6, 2020 14:17