Enable stateless XNNPACK linear. #35791
Conversation
The optimal way to use XNNPACK is to separate operator creation from execution - also called prepacking the weights. If we have done our job properly, the JIT will have caught and replaced nn.Linear on mobile with the prepacked versions. Still, if we somehow end up in at::native::linear for whatever reason, it is more efficient to go through XNNPACK than the alternatives of at::addmm or at::matmul.
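For illustration, a minimal sketch of what the stateless fallback inside at::native::linear could look like. The mobile guard and the xnnpack::use_linear / xnnpack::linear helper names are assumptions inferred from the description, not a verbatim excerpt of this PR:

```cpp
#include <ATen/ATen.h>

namespace at { namespace native {

Tensor linear(const Tensor& input, const Tensor& weight, const Tensor& bias) {
#if defined(C10_MOBILE)
  // Stateless XNNPACK path: no prepacked weights are available here, but
  // XNNPACK is still expected to beat the generic BLAS-backed kernels.
  if (xnnpack::use_linear(input, weight, bias)) {
    return xnnpack::linear(input, weight, bias);
  }
#endif
  if (input.dim() == 2 && bias.defined()) {
    // Fused addmm handles the common 2-D case in one kernel.
    return at::addmm(bias, input, weight.t());
  }
  // General N-D case: matmul, then add the bias if one was given.
  auto output = at::matmul(input, weight.t());
  if (bias.defined()) {
    output.add_(bias);
  }
  return output;
}

}} // namespace at::native
```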
💊 Build failures summary and remediations (Dr. CI, as of commit 9ec51d7):
XLA failure: job pytorch_xla_linux_bionic_py3_6_clang9_build is failing.
Need to see a test plan for this. Are we sure this is a win on desktop/server where mkldnn is available?
@dreiss Thanks, will limit to mobile only. The test plan would be passing CI, but the thing is that this would not get much mobile coverage after I limit it to mobile only. How do you suggest I go about testing it more thoroughly?
As long as it's gated to mobile-only, maybe running a convolutional model on the speed benchmark and verifying that it's faster?
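If a full model run is too heavy, a micro-benchmark along these lines could serve as a sanity check. This is a sketch against libtorch's C++ API; the shapes and iteration count are arbitrary, and the timings are only meaningful on a mobile build where the XNNPACK path is compiled in:

```cpp
#include <torch/torch.h>
#include <chrono>
#include <iostream>

int main() {
  torch::NoGradGuard no_grad;
  auto input  = torch::randn({1, 512});
  auto weight = torch::randn({1024, 512});
  auto bias   = torch::randn({1024});

  // Warm up once so lazy initialization does not skew the measurement.
  torch::linear(input, weight, bias);

  constexpr int kIters = 1000;

  // Path under test: at::linear, which may route through XNNPACK.
  auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < kIters; ++i) {
    torch::linear(input, weight, bias);
  }
  auto end = std::chrono::steady_clock::now();
  std::cout << "linear:     "
            << std::chrono::duration<double, std::milli>(end - start).count() / kIters
            << " ms/iter\n";

  // Baseline: the matmul + bias-add alternative mentioned above.
  start = std::chrono::steady_clock::now();
  for (int i = 0; i < kIters; ++i) {
    torch::matmul(input, weight.t()).add_(bias);
  }
  end = std::chrono::steady_clock::now();
  std::cout << "matmul+add: "
            << std::chrono::duration<double, std::milli>(end - start).count() / kIters
            << " ms/iter\n";
}
```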
LGTM.
@AshkanAliabadi merged this pull request in ba3f8d3.
Stack from ghstack:
Optimal usage of the linear operator would require weight prepacking. If we have done our job properly, the JIT will have caught and replaced linear operations on mobile with their corresponding prepacked versions, hence enabling said optimal usage. Still, if we somehow end up in at::native::linear for whatever reason, it is more efficient to go through XNNPACK than the alternatives of at::addmm and at::matmul.
Differential Revision: D20821863
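For context on why the prepacked path the JIT targets is preferred, here is a hypothetical sketch of the pack-once, run-many pattern. The PackedLinear class below is illustrative only; it stands in for the real prepacked operator context and does not reproduce the actual packing:

```cpp
#include <torch/torch.h>
#include <utility>

// Hypothetical stand-in for a prepacked linear context: pay the weight
// packing cost once, then reuse it across many inferences.
class PackedLinear {
 public:
  // Pack once, at module load / graph-rewrite time. A real implementation
  // would convert weight_ into XNNPACK's blocked layout here and cache the
  // operator handle.
  PackedLinear(at::Tensor weight, at::Tensor bias)
      : weight_(std::move(weight)), bias_(std::move(bias)) {}

  // Run many times with no per-call packing overhead.
  at::Tensor run(const at::Tensor& input) const {
    return at::linear(input, weight_, bias_);
  }

 private:
  at::Tensor weight_;
  at::Tensor bias_;
};

int main() {
  PackedLinear op(torch::randn({1024, 512}), torch::randn({1024}));
  auto y = op.run(torch::randn({1, 512}));  // reuses the packed weight
}
```

The stateless path in this PR is the fallback for when no such context exists; it still routes the single call through XNNPACK, just without amortizing the packing cost.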