This repository has been archived by the owner on Nov 3, 2023. It is now read-only.

Add TorchScriptable transformer classifier and subword BPE tokenizer #4566

Merged
merged 13 commits into from
Jul 25, 2022

Conversation

zzhangncsu
Contributor

Patch description

Enable Torchscripting of a Transformer classifier agent and subword BPE tokenizer.

Testing steps

Run unit tests:

pytest tests/test_torchscript.py

Exporting model

Call to export model:

parlai torchscript \
--model-file ${MODEL_FILE} \
--model transformer/classifier \
--script-module parlai.torchscript.modules:TorchScriptTransformerClassifier \
--no-cuda \
--scripted-model-file ~/_test_scripted_model__bart.pt \
--input 'This is a testing sentence'
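
For reference, a file written by an export like the one above can be loaded with torch.jit.load and called directly, without importing the original Python class. The sketch below is a toy illustration only: ToyTextClassifier, its vocabulary, and the file name are hypothetical stand-ins, and the actual forward signature of TorchScriptTransformerClassifier is not shown in this thread.

```python
import os
import tempfile
from typing import Dict, List

import torch
from torch import nn


class ToyTextClassifier(nn.Module):
    """Hypothetical stand-in: whitespace tokenizer plus a tiny classifier head."""

    def __init__(self, vocab: Dict[str, int], num_classes: int = 2):
        super().__init__()
        self.vocab = vocab
        self.unk_idx = len(vocab)  # one extra embedding row for unknown tokens
        self.embedding = nn.Embedding(len(vocab) + 1, 8)
        self.head = nn.Linear(8, num_classes)

    def forward(self, text: str) -> torch.Tensor:
        # Tokenization happens inside forward, so raw strings can be passed in.
        ids: List[int] = []
        for token in text.lower().split():
            if token in self.vocab:
                ids.append(self.vocab[token])
            else:
                ids.append(self.unk_idx)
        if len(ids) == 0:
            ids.append(self.unk_idx)
        tokens = torch.tensor(ids, dtype=torch.long)
        # Mean-pool the token embeddings, then project to class logits.
        return self.head(self.embedding(tokens).mean(0))


vocab = {"this": 0, "is": 1, "a": 2, "testing": 3, "sentence": 4}
model = ToyTextClassifier(vocab).eval()
scripted = torch.jit.script(model)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "toy_scripted.pt")
    torch.jit.save(scripted, path)
    # Loading needs only the file, not the ToyTextClassifier definition.
    loaded = torch.jit.load(path, map_location="cpu")

with torch.no_grad():
    expected = model("This is a testing sentence")
    logits = loaded("This is a testing sentence")
```

The loaded module should return the same logits as the eager model, which is roughly the kind of check a TorchScript unit test would make.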

Other information

Support GPU models without DataParallel. See pytorch/pytorch#30635 for more information.
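
One way to read this note, as an assumption rather than a description of the PR's code, is that a model should be scripted as a plain module rather than through an nn.DataParallel wrapper. A minimal sketch of unwrapping before scripting follows; unwrap_for_scripting is a hypothetical helper, not a ParlAI function.

```python
import torch
from torch import nn


def unwrap_for_scripting(model: nn.Module) -> nn.Module:
    """Return the underlying module if `model` is wrapped in nn.DataParallel."""
    if isinstance(model, nn.DataParallel):
        return model.module
    return model


# nn.Linear stands in for a real model here (assumption for illustration).
wrapped = nn.DataParallel(nn.Linear(4, 2))
core = unwrap_for_scripting(wrapped)
scripted = torch.jit.script(core)

# Run on whichever device the parameters ended up on.
device = next(core.parameters()).device
output = scripted(torch.zeros(1, 4, device=device))
```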

Contributor

@stephenroller stephenroller left a comment

WHOAAAA!!

Will leave @jxmsML to do a deeper review and approve. On the surface I see only minor issues.

Please run black before committing.

parlai/torchscript/modules.py (outdated, resolved)
parlai/torchscript/modules.py (outdated, resolved)
return text


class TorchScriptTransformerClassifier(nn.Module):
Contributor

This file is getting huge. I'd suggest we break modules.py up.

Contributor Author

I moved the tokenizer out of modules.py. Do we want to split the modules further, e.g., into generator_module and classifier_module?

Contributor

I'll let you decide

Contributor Author

I think it's good for now. If there are more models to add, we can group them further.

parlai/torchscript/modules.py (outdated, resolved)
@jxmsML jxmsML self-requested a review June 2, 2022 14:16
Contributor

@EricMichaelSmith EricMichaelSmith left a comment

Minor nits

parlai/torchscript/modules.py (outdated, resolved)
parlai/torchscript/modules.py (outdated, resolved)
parlai/torchscript/modules.py (outdated, resolved)
tests/nightly/gpu/test_torchscript.py (resolved)
@zzhangncsu
Contributor Author

Sorry for keeping this PR pending 😔. I will work on the comments next week.

@zzhangncsu
Contributor Author

@stephenroller @EricMichaelSmith Shall we merge it? 😃

Contributor

@EricMichaelSmith EricMichaelSmith left a comment

Left a minor comment. The lint CI check is failing; I'd run autoformat.sh to resolve it.

tests/nightly/gpu/test_torchscript.py (resolved)
Co-authored-by: Eric Smith <EricMichaelSmith@users.noreply.github.com>
@stephenroller
Contributor

@klshuster calling the tech lead to force a decision: accept or iterate more?

Contributor

@EricMichaelSmith EricMichaelSmith left a comment

@stephenroller Hmm it looks fine to approve to me - @zzhangncsu I added one last minor comment. Great to have this functionality and unit test coverage!

@zzhangncsu
Contributor Author

@EricMichaelSmith It seems I have a "1 workflow awaiting approval" warning since I'm a first-time contributor. Also, the long_gpu_tests check fails. Any suggestions?

@EricMichaelSmith
Contributor

Yeah, I think one of us has to approve running the CI checks. The long_gpu_tests check fails with a sqlite3.OperationalError: database or disk is full error, which is weird - can you try doing a git merge main on this PR to see if maybe this error is something that's been fixed on main?

@klshuster
Contributor

I think that's unrelated to this PR. Also I trust @EricMichaelSmith here, if approved let's get it in

@zzhangncsu
Contributor Author

I did the merge, but it doesn't seem to help.

@EricMichaelSmith
Contributor

@stephenroller did you ever see this error when debugging CI checks? It appears to come from downloading http://parl.ai/downloads/_models/wikipedia_full/tfidf_retriever/model.tgz, which has nothing to do with this PR itself.
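
Since the message mentions a full disk, one quick diagnostic is to check the free space on the volume the download lands on. This is a generic sketch using the standard library; the temp-directory path and the 1 GiB threshold are assumptions for illustration, not values from the CI setup.

```python
import shutil
import tempfile


def free_gib(path: str) -> float:
    """Return free space, in GiB, on the filesystem that contains `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / 1024 ** 3


def has_room(path: str, required_gib: float) -> bool:
    """True if at least `required_gib` GiB are free at `path`."""
    return free_gib(path) >= required_gib


# Hypothetical check: is there at least 1 GiB free where temp files are written?
download_dir = tempfile.gettempdir()
enough = has_room(download_dir, required_gib=1.0)
```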

@zzhangncsu
Contributor Author

@EricMichaelSmith Shall we merge the PR? :-)

@EricMichaelSmith
Contributor

EricMichaelSmith commented Jul 20, 2022

Hmm, assuming that the long_gpu_tests check is still the only one that fails, and assuming that it still fails with the error above, yes, I'd say that you're fine to merge now, given that we now see this same error in main :/

@zzhangncsu
Contributor Author

Sounds good. I don't have write permission, though. Please feel free to merge it. Thanks!

@EricMichaelSmith EricMichaelSmith merged commit 73a395f into facebookresearch:main Jul 25, 2022
@EricMichaelSmith
Contributor

Just merged. Thanks again for doing this!

5 participants