Enable stateless XNNPACK linear. #35791
Conversation
The optimal way to use XNNPACK is to separate operator creation from execution - also called prepacking the weights. If we have done our job properly, the JIT will have caught and replaced nn.Linear on mobile with the prepacked versions. Still, if we somehow end up in at::native::linear for whatever reason, it is more efficient to go through XNNPACK than the alternatives of at::addmm or at::matmul.
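For illustration, a minimal sketch of what the stateless fallback inside at::native::linear could look like. The mobile guard and the xnnpack::use_linear / xnnpack::linear helper names are assumptions inferred from the description, not a verbatim excerpt of this PR:

```cpp
#include <ATen/ATen.h>

namespace at { namespace native {

Tensor linear(const Tensor& input, const Tensor& weight, const Tensor& bias) {
#if defined(C10_MOBILE)
  // Stateless XNNPACK path: no prepacked weights are available here, but
  // XNNPACK is still expected to beat the generic BLAS-backed kernels.
  if (xnnpack::use_linear(input, weight, bias)) {
    return xnnpack::linear(input, weight, bias);
  }
#endif
  if (input.dim() == 2 && bias.defined()) {
    // Fused addmm handles the common 2-D case in one kernel.
    return at::addmm(bias, input, weight.t());
  }
  // General N-D case: matmul, then add the bias if one was given.
  auto output = at::matmul(input, weight.t());
  if (bias.defined()) {
    output.add_(bias);
  }
  return output;
}

}} // namespace at::native
```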
💊 Build failures summary and remediations (Dr. CI, as of commit 9ec51d7):
XLA failure: job pytorch_xla_linux_bionic_py3_6_clang9_build is failing.
Need to see a test plan for this. Are we sure this is a win on desktop/server where mkldnn is available?
@dreiss Thanks, will limit to mobile only. The test plan would be passing CI, but the thing is that this would not get much mobile coverage after I limit it to mobile only. How do you suggest I go about testing it more thoroughly?
As long as it's gated to mobile-only, maybe running a convolutional model on the speed benchmark and verifying that it's faster?
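If a full model run is too heavy, a micro-benchmark along these lines could serve as a sanity check. This is a sketch against libtorch's C++ API; the shapes and iteration count are arbitrary, and the timings are only meaningful on a mobile build where the XNNPACK path is compiled in:

```cpp
#include <torch/torch.h>
#include <chrono>
#include <iostream>

int main() {
  torch::NoGradGuard no_grad;
  auto input  = torch::randn({1, 512});
  auto weight = torch::randn({1024, 512});
  auto bias   = torch::randn({1024});

  // Warm up once so lazy initialization does not skew the measurement.
  torch::linear(input, weight, bias);

  constexpr int kIters = 1000;

  // Path under test: at::linear, which may route through XNNPACK.
  auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < kIters; ++i) {
    torch::linear(input, weight, bias);
  }
  auto end = std::chrono::steady_clock::now();
  std::cout << "linear:     "
            << std::chrono::duration<double, std::milli>(end - start).count() / kIters
            << " ms/iter\n";

  // Baseline: the matmul + bias-add alternative mentioned above.
  start = std::chrono::steady_clock::now();
  for (int i = 0; i < kIters; ++i) {
    torch::matmul(input, weight.t()).add_(bias);
  }
  end = std::chrono::steady_clock::now();
  std::cout << "matmul+add: "
            << std::chrono::duration<double, std::milli>(end - start).count() / kIters
            << " ms/iter\n";
}
```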
LGTM.
@AshkanAliabadi merged this pull request in ba3f8d3.
Stack from ghstack:
Optimal usage of the linear operator would require weight prepacking. If we have done our job properly, the JIT will have caught and replaced linear operations on mobile with their corresponding prepacked versions, hence enabling said optimal usage. Still, if we somehow end up in at::native::linear for whatever reason, it is more efficient to go through XNNPACK than the alternatives of at::addmm and at::matmul.
Differential Revision: D20821863
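For context on why the prepacked path the JIT targets is preferred, here is a hypothetical sketch of the pack-once, run-many pattern. The PackedLinear class below is illustrative only; it stands in for the real prepacked operator context and does not reproduce the actual packing:

```cpp
#include <torch/torch.h>
#include <utility>

// Hypothetical stand-in for a prepacked linear context: pay the weight
// packing cost once, then reuse it across many inferences.
class PackedLinear {
 public:
  // Pack once, at module load / graph-rewrite time. A real implementation
  // would convert weight_ into XNNPACK's blocked layout here and cache the
  // operator handle.
  PackedLinear(at::Tensor weight, at::Tensor bias)
      : weight_(std::move(weight)), bias_(std::move(bias)) {}

  // Run many times with no per-call packing overhead.
  at::Tensor run(const at::Tensor& input) const {
    return at::linear(input, weight_, bias_);
  }

 private:
  at::Tensor weight_;
  at::Tensor bias_;
};

int main() {
  PackedLinear op(torch::randn({1024, 512}), torch::randn({1024}));
  auto y = op.run(torch::randn({1, 512}));  // reuses the packed weight
}
```

The stateless path in this PR is the fallback for when no such context exists; it still routes the single call through XNNPACK, just without amortizing the packing cost.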