Add TorchScriptable transformer classifier and subword BPE tokenizer #4566

zzhangncsu · 2022-06-01T22:50:45Z

Patch description

Enable Torchscripting of a Transformer classifier agent and subword BPE tokenizer.
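
For readers unfamiliar with subword BPE, the following is a minimal sketch of greedy BPE encoding in plain Python. It is an illustration only and is not the tokenizer added in this PR: the function name `bpe_encode` and the tiny merge table `MERGES` are hypothetical, and a real merge table would be learned from data.

```python
from typing import Dict, List, Tuple


def bpe_encode(word: str, merge_ranks: Dict[Tuple[str, str], int]) -> List[str]:
    """Repeatedly apply the best-ranked merge until no known pair remains."""
    symbols: List[str] = list(word)
    while len(symbols) > 1:
        # Collect every adjacent pair and pick the one learned earliest (lowest rank).
        pairs = [(symbols[i], symbols[i + 1]) for i in range(len(symbols) - 1)]
        best = min(pairs, key=lambda p: merge_ranks.get(p, float("inf")))
        if best not in merge_ranks:
            break  # no adjacent pair is in the merge table
        # Merge every occurrence of the chosen pair, scanning left to right.
        merged: List[str] = []
        i = 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                merged.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        symbols = merged
    return symbols


# Hypothetical merge table (pair -> rank), made up for this example.
MERGES: Dict[Tuple[str, str], int] = {
    ("t", "e"): 0,
    ("s", "t"): 1,
    ("te", "st"): 2,
    ("i", "n"): 3,
    ("in", "g"): 4,
}

print(bpe_encode("testing", MERGES))  # ['test', 'ing']
```

Characters with no applicable merge stay as single-symbol tokens, so any input string can still be encoded.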

Testing steps

Run unit tests:

pytest tests/test_torchscript.py

Exporting model

Call to export model:

parlai torchscript \
--model-file ${MODEL_FILE} \
--model transformer/classifier \
--script-module parlai.torchscript.modules:TorchScriptTransformerClassifier \
--no-cuda \
--scripted-model-file ~/_test_scripted_model__bart.pt \
--input 'This is a testing sentence'
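
As a rough illustration of what an export like this produces, the sketch below scripts a toy module, saves it, and reloads it with `torch.jit.load`. The `ToyClassifier` class, its token-ID input signature, and the file name are assumptions made for the example. They are not the `TorchScriptTransformerClassifier` from this PR, whose actual `forward` signature should be checked in `parlai/torchscript/modules.py`.

```python
import os
import tempfile
from typing import List

import torch
import torch.nn as nn


class ToyClassifier(nn.Module):
    """Hypothetical stand-in for a scriptable classifier (not ParlAI's class)."""

    def __init__(self, vocab_size: int = 100, dim: int = 16, num_classes: int = 2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.out = nn.Linear(dim, num_classes)

    def forward(self, token_ids: List[int]) -> torch.Tensor:
        # Build a batch of size 1, average the token embeddings, then classify.
        ids = torch.tensor(token_ids, dtype=torch.long).unsqueeze(0)
        pooled = self.embed(ids).mean(dim=1)
        return self.out(pooled)


def export_and_reload(path: str) -> torch.jit.ScriptModule:
    """Script the toy model, save it to `path`, and load it back on CPU."""
    model = ToyClassifier().eval()
    scripted = torch.jit.script(model)
    scripted.save(path)
    return torch.jit.load(path, map_location="cpu")


with tempfile.TemporaryDirectory() as tmp:
    reloaded = export_and_reload(os.path.join(tmp, "toy_scripted.pt"))
    logits = reloaded([1, 2, 3])
    print(tuple(logits.shape))  # (1, 2)
```

Loading with `map_location="cpu"` mirrors the `--no-cuda` flag in the command above: the saved file can be opened on a machine without a GPU.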

Other information

Supports GPU models without DataParallel. See pytorch/pytorch#30635 for more information.

stephenroller

WHOAAAA!!

Will leave @jxmsML to do a deeper review and approve. On the surface I see only minor issues.

Please run black before committing.

parlai/torchscript/modules.py

stephenroller · 2022-06-02T03:33:35Z

        return text
+
+
+class TorchScriptTransformerClassifier(nn.Module):


This file is getting huge. I'd suggest we break modules.py up.

zzhangncsu · 2022-06-21

I moved the tokenizer out of modules.py. Do we want to split the modules further, e.g., into generator_module and classifier_module?

stephenroller · 2022-06-25

I'll let you decide.

zzhangncsu · 2022-06-26

I think it's good for now. If there are more models to add, we can group them further.

parlai/torchscript/modules.py

EricMichaelSmith

Minor nits

parlai/torchscript/modules.py

tests/nightly/gpu/test_torchscript.py

zzhangncsu · 2022-06-16T18:37:30Z

Sorry for keeping this PR pending 😔. I will work on the comments next week.

parlai/torchscript/tokenizer.py

zzhangncsu · 2022-06-30T04:37:50Z

@stephenroller @EricMichaelSmith Shall we merge it? 😃

EricMichaelSmith

Left a minor comment. The lint CI check is failing - I'd run autoformat.sh to resolve it.

tests/nightly/gpu/test_torchscript.py

parlai/torchscript/tokenizer.py

stephenroller · 2022-07-07T03:19:34Z

@klshuster calling the tech lead to force a decision: accept or iterate more?

EricMichaelSmith

@stephenroller Hmm it looks fine to approve to me - @zzhangncsu I added one last minor comment. Great to have this functionality and unit test coverage!

zzhangncsu · 2022-07-12T19:03:13Z

@EricMichaelSmith It seems I have a warning "1 workflow awaiting approval" since I'm a first-time contributor. Also, the long_gpu_tests check fails. Any suggestions?

EricMichaelSmith · 2022-07-12T23:34:35Z

Yeah, I think one of us has to approve running the CI checks. The long_gpu_tests check fails with a sqlite3.OperationalError: database or disk is full error, which is weird - can you try doing a git merge main on this PR to see if maybe this error is something that's been fixed on main?

klshuster · 2022-07-13T00:16:42Z

I think that's unrelated to this PR. Also I trust @EricMichaelSmith here, if approved let's get it in

zzhangncsu · 2022-07-14T03:58:44Z

I did the merge. It doesn't seem to help.

EricMichaelSmith · 2022-07-14T13:07:15Z

@stephenroller did you ever see this error when debugging CI checks? It appears to come from downloading http://parl.ai/downloads/_models/wikipedia_full/tfidf_retriever/model.tgz, which has nothing to do with this PR itself.

zzhangncsu · 2022-07-20T21:45:26Z

@EricMichaelSmith Shall we merge the PR? :-)

EricMichaelSmith · 2022-07-20T21:47:40Z

Hmm, assuming that the long_gpu_tests check is still the only one that fails, and assuming that it still fails with the error above, yes, I'd say that you're fine to merge now, given that we now see this same error in main :/

zzhangncsu · 2022-07-21T22:59:04Z

Sounds good. I don't have write permission, though. Please feel free to merge it. Thanks!

EricMichaelSmith · 2022-07-25T14:45:50Z

Just merged. Thanks again for doing this!

Zhe Zhang and others added 5 commits May 24, 2022 18:57

Add torchscript classes for Transformer classifier and subword BPE to…

1a20f7e

…kenizer

Merge branch 'facebookresearch:main' into main

ec42b85

Add unit tests

58ef1c0

Merge branch 'main' of github.com:zzhangncsu/ParlAI

0f5968a

Remove debugging code

d3ee5e5

facebook-github-bot added the CLA Signed label Jun 1, 2022

stephenroller reviewed Jun 2, 2022

jxmsML self-requested a review June 2, 2022 14:16

EricMichaelSmith reviewed Jun 2, 2022

parlai/torchscript/modules.py (outdated)

parlai/torchscript/modules.py (outdated)

parlai/torchscript/modules.py (outdated)

tests/nightly/gpu/test_torchscript.py

zzhangncsu and others added 2 commits June 20, 2022 14:09

Merge branch 'facebookresearch:main' into main

964dfff

Resolve comments

03594b9

zzhangncsu requested review from stephenroller and EricMichaelSmith June 21, 2022 02:28

EricMichaelSmith reviewed Jun 21, 2022

parlai/torchscript/tokenizer.py

EricMichaelSmith reviewed Jun 21, 2022

parlai/torchscript/tokenizer.py

Merge branch 'facebookresearch:main' into main

fc37102

EricMichaelSmith reviewed Jul 5, 2022

tests/nightly/gpu/test_torchscript.py

zzhangncsu and others added 2 commits July 6, 2022 14:05

Merge branch 'facebookresearch:main' into main

01208ee

Rerun autoformat.sh

4131d69

EricMichaelSmith reviewed Jul 6, 2022

parlai/torchscript/tokenizer.py (outdated)

Update parlai/torchscript/tokenizer.py

992cdd8

Co-authored-by: Eric Smith <EricMichaelSmith@users.noreply.github.com>

EricMichaelSmith approved these changes Jul 7, 2022

Merge branch 'facebookresearch:main' into main

7f8ea42

Merge branch 'facebookresearch:main' into main

3386283

EricMichaelSmith merged commit 73a395f into facebookresearch:main Jul 25, 2022
