[Tutorial] Demo showing how to run a pruned 🤗 model. #5975

jwfromm · 2020-07-01T21:48:58Z

This tutorial demonstrates how to load and run a sparse model from the popular transformers module from Hugging Face (🤗). Very recently, a 95% sparse version of BERT was made publicly available; however, 🤗 was unable to achieve speedups using existing frameworks. Using this script, TVM enables a 2-3X speedup by converting appropriate dense layers to sparse dense layers. I think this will be a useful tutorial for users interested in sparse networks and may be good PR for TVM as a small collaboration with 🤗. Thanks @antinucleon for helping get this all working!
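To give a rough idea of what "appropriate dense layers" could mean, here is a minimal sketch in plain Python that measures the fraction of zero weights per layer and picks the layers above a cutoff. The helper names `sparsity` and `layers_to_convert`, the toy weights, and the 0.8 threshold are all assumptions for illustration; this is not the selection logic used by the tutorial or by TVM.

```python
def sparsity(weight):
    """Fraction of entries in a 2-D weight (list of rows) that are exactly zero."""
    total = sum(len(row) for row in weight)
    zeros = sum(1 for row in weight for v in row if v == 0)
    return zeros / total if total else 0.0


def layers_to_convert(weights, threshold=0.8):
    """Names of layers whose zero fraction meets the threshold (assumed value)."""
    return [name for name, w in weights.items() if sparsity(w) >= threshold]


# Toy weights standing in for real dense layers (hypothetical data).
weights = {
    "dense_a": [[0.0, 0.0, 0.0, 0.5], [0.0, 0.0, 0.0, 0.0]],  # 7 of 8 entries are zero
    "dense_b": [[0.1, 0.2], [0.0, 0.4]],  # 1 of 4 entries is zero
}
print(layers_to_convert(weights))
```

Only layers that are sparse enough are worth converting, since a sparse kernel on a mostly dense matrix would likely be slower than the dense one.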

jwfromm · 2020-07-01T21:50:29Z

@masahi, @vinx13, @binarybana can you take a look and let me know what you think?

tqchen · 2020-07-01T22:16:00Z

cc @antinucleon @junrushao1994

masahi · 2020-07-01T22:59:34Z

I liked emojis in the PR:)

How about adding a sample output, with AVX2 or AVX-512?

binarybana

Looks great! Made some edits.

tutorials/frontend/deploy_sparse.py

merrymercy · 2020-07-02T05:12:01Z

How long does this tutorial take to run?
If it takes a lot of time, it is better to provide some sample outputs, so readers can know what is expected.
Otherwise, it is better to let it run on the CI server, so we can get output from the web server and make sure the tutorial is always runnable.

jwfromm · 2020-07-02T14:53:27Z

@merrymercy it's fairly quick. I commented out the run command because of dependencies rather than run time. This tutorial requires tensorflow 2.2 (our servers currently use 2.1) and transformers. If we think it's worth updating the server build, then we can run this for real.

u99127 · 2020-07-02T18:35:28Z

> @merrymercy it's fairly quick, I commented out the run command due to dependencies rather than the run time. This tutorial requires tensorflow 2.2 (our servers currently use 2.1) and transformers. If we think its worth updating the server build then we can run this for real.

A +1 for updating TF versions and keeping this running out of the box.

tqchen · 2020-07-02T21:20:06Z

Thanks everyone, this is merged. I will ping the thread again once TF 2.2 has landed in the CI.

Wheest · 2020-07-09T16:15:40Z

Note that the .ipynb version of the tutorial doesn't work when running benchmark(), since import_graphdef() uses the __file__ variable, which is not defined in most notebook environments. An alternative approach to getting the path may be needed.
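One possible workaround, as a minimal sketch: fall back to the current working directory when `__file__` is not defined. The helper name `get_script_dir` is hypothetical and not part of the tutorial, and the fallback assumes the notebook is started from the directory that holds the files the tutorial expects.

```python
import os


def get_script_dir():
    """Return the directory of the running script, or the current working
    directory when __file__ is undefined (as in most notebook kernels)."""
    try:
        return os.path.dirname(os.path.abspath(__file__))
    except NameError:
        # No __file__ in this environment, so assume files live in the cwd.
        return os.getcwd()


print(get_script_dir())
```

Code that currently builds a path from `__file__` could call this helper instead, so the same source works both as a script and as a notebook.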

There are also a lot of dependencies for the tutorial (e.g. transformers, tensorflow) which may not be in a user's environment. Should an Install dependencies section be added à la?
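Short of a full install section, a small check at the top of the tutorial could at least tell the reader what is missing. This is only a sketch: `missing_packages` is a hypothetical helper, and it only checks that the packages can be found, not their versions (the thread above mentions that tensorflow 2.2 is required).

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [n for n in names if importlib.util.find_spec(n) is None]


missing = missing_packages(["tensorflow", "transformers"])
if missing:
    print("Missing packages, try: pip install " + " ".join(missing))
```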

Added tutorial showing how to run a sparse transformer model.

0efd635

Move transformers import to avoid dependency during doc generation.

6de4886

tqchen approved these changes Jul 1, 2020

binarybana suggested changes Jul 2, 2020

binarybana reviewed Jul 2, 2020

tutorials/frontend/deploy_sparse.py

Fixed formatting bugs.

08676a8

jwfromm added 2 commits July 2, 2020 09:52

Jason edits added.

72ef919

Added sample output.

c62f8ed

binarybana approved these changes Jul 2, 2020

View reviewed changes

tqchen merged commit a519292 into apache:master Jul 2, 2020

trevor-m pushed a commit to trevor-m/tvm that referenced this pull request Jul 14, 2020

[Tutorial] Demo showing how to run a pruned 🤗 model. (apache#5975)

395313a

trevor-m pushed a commit to neo-ai/tvm that referenced this pull request Jul 14, 2020

[Tutorial] Demo showing how to run a pruned 🤗 model. (apache#5975)

3a26a43

jwfromm deleted the hf_demo_2 branch August 13, 2020 00:28

ZihengJiang mentioned this pull request Sep 25, 2020

TVM v0.7 Release Note Candidate #6486

Closed
