
Enable stateless XNNPACK linear. #35791

Closed

Conversation


@AshkanAliabadi commented Apr 1, 2020

Stack from ghstack:

Optimal usage of the linear operator requires weight prepacking. If we have done our job properly, the JIT will have caught and replaced linear operations on mobile with their corresponding prepacked versions, enabling said optimal usage. Still, if we end up in at::native::linear for whatever reason, it is more efficient to go through XNNPACK than through the alternatives, at::addmm and at::matmul.

Differential Revision: D20821863

The optimal way to use XNNPACK is to separate operator creation
from execution, also called prepacking the weights. If we have done
our job properly, the JIT will have caught and replaced nn.Linear on mobile
with the prepacked versions. Still, if we end up in
at::native::linear for whatever reason, it is more efficient to go
through XNNPACK than the alternatives of at::addmm or at::matmul.

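For context, the stateless path boils down to a runtime eligibility check inside at::native::linear that routes supported inputs through XNNPACK and leaves everything else on the existing code path. Below is a minimal sketch of that shape, assuming hypothetical helpers xnnpack::use_linear and xnnpack::linear for the eligibility check and the one-shot (pack-on-every-call) execution; the actual names and signatures in the patch may differ:

```cpp
#include <ATen/ATen.h>

// Sketch of the dispatch inside at::native::linear (not the actual patch).
at::Tensor linear_sketch(
    const at::Tensor& input,
    const at::Tensor& weight,
    const at::Tensor& bias) {
  // Hypothetical: use_linear() checks XNNPACK's constraints
  // (float32 CPU tensors, supported shapes, and so on).
  if (at::native::xnnpack::use_linear(input, weight, bias)) {
    // Stateless call: the weights are packed on every invocation, which
    // is why the JIT's prepacked path remains the preferred route.
    return at::native::xnnpack::linear(input, weight, bias);
  }
  // Pre-existing fallbacks.
  if (input.dim() == 2 && bias.defined()) {
    // Fused matrix multiply plus bias for the common 2-D case.
    return at::addmm(bias, input, weight.t());
  }
  auto output = at::matmul(input, weight.t());
  if (bias.defined()) {
    output.add_(bias);
  }
  return output;
}
```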
AshkanAliabadi pushed a commit that referenced this pull request Apr 1, 2020
dr-ci bot commented Apr 1, 2020

💊 Build failures summary and remediations

As of commit 9ec51d7 (more details on the Dr. CI page):


  • 1/1 failures introduced in this PR

XLA failure

Job pytorch_xla_linux_bionic_py3_6_clang9_build is failing. Please create an issue with a title prefixed by [PT_BREAK] in pytorch/xla and link to this PR. If you have questions, please reach out to @ailzhang / @dlibenzi / @JackCaoG.



@AshkanAliabadi changed the title from "Enable eager XNNPACK linear." to "Enable stateless XNNPACK linear." on Apr 1, 2020

dreiss commented Apr 1, 2020

Need to see a test plan for this. Are we sure this is a win on desktop/server, where mkldnn is available?

@AshkanAliabadi
Contributor Author

@dreiss Thanks, will limit this to mobile only. The test plan would be passing CI, but the thing is that this would not get much mobile coverage once I limit it to mobile only. How do you suggest I go about testing it more thoroughly?
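(For reference, gating ATen code to mobile builds is commonly done with the C10_MOBILE preprocessor macro; a minimal sketch of such a guard, reusing the hypothetical helpers from the sketch above — the exact condition in the merged patch may differ.)

```cpp
at::Tensor linear_mobile_gated(
    const at::Tensor& input,
    const at::Tensor& weight,
    const at::Tensor& bias) {
#ifdef C10_MOBILE
  // Only mobile builds consider the stateless XNNPACK path; desktop and
  // server builds keep going through addmm/matmul (and mkldnn where
  // applicable), which addresses the concern above.
  if (at::native::xnnpack::use_linear(input, weight, bias)) {
    return at::native::xnnpack::linear(input, weight, bias);
  }
#endif
  // Existing implementation (abbreviated to the 2-D case; see the
  // earlier sketch for the full fallback logic).
  return at::addmm(bias, input, weight.t());
}
```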


dreiss commented Apr 1, 2020

As long as it's gated to mobile-only, maybe running a convolutional model on the speed benchmark and verifying that it's faster?
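(A quick desktop-side sanity check, separate from the on-device speed benchmark suggested above, is to time the two code paths directly against libtorch. A hypothetical micro-benchmark; shapes and iteration counts are arbitrary:)

```cpp
#include <ATen/ATen.h>
#include <chrono>
#include <iostream>

int main() {
  at::Tensor input  = at::rand({64, 512});
  at::Tensor weight = at::rand({1024, 512});
  at::Tensor bias   = at::rand({1024});

  // Time a callable over repeated runs, reporting mean ms per iteration.
  auto time_it = [&](const char* name, auto&& fn) {
    for (int i = 0; i < 10; ++i) fn();  // warm-up
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < 100; ++i) fn();
    auto stop = std::chrono::steady_clock::now();
    std::cout << name << ": "
              << std::chrono::duration<double, std::milli>(stop - start).count() / 100
              << " ms/iter\n";
  };

  // at::linear is the entry point this PR touches.
  time_it("linear", [&] { at::linear(input, weight, bias); });
  // The addmm fallback it would otherwise take for 2-D inputs.
  time_it("addmm ", [&] { at::addmm(bias, input, weight.t()); });
  return 0;
}
```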

AshkanAliabadi pushed a commit that referenced this pull request Apr 1, 2020
@kimishpatel left a comment


LGTM.

Ashkan Aliabadi added 9 commits April 2, 2020 15:31
@AshkanAliabadi reopened this Apr 7, 2020
Ashkan Aliabadi added 4 commits April 7, 2020 12:13
Ashkan Aliabadi added 15 commits April 7, 2020 21:06
@facebook-github-bot
Contributor

@AshkanAliabadi merged this pull request in ba3f8d3.

@facebook-github-bot deleted the gh/AshkanAliabadi/15/head branch on April 27, 2020 14:19