This repository has been archived by the owner on Nov 3, 2023. It is now read-only.

Add support for any GPT-2 model hosted in Huggingface #4360

Merged 3 commits on Feb 17, 2022


cahya-wirawan
Contributor

Patch description
As mentioned in #4357, the current Hugging Face agent supports only a few English GPT-2 models (small, medium, large, xl and distilgpt2). This patch therefore adds the ability to specify an arbitrary GPT-2 model hosted on Hugging Face, such as anonymous-german-nlp/german-gpt2 or indonesian-nlp/gpt2.

Testing steps
The following example fine-tunes the indonesian-nlp/gpt2 model on the convai2 dataset:

parlai train_model -m hugging_face/gpt2 --model_name indonesian-nlp/gpt2 --add-special-tokens True --add-start-token True -t convai2 -bs 2 -mf <modelfile>

@facebook-github-bot

Hi @cahya-wirawan!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g., your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks!

@facebook-github-bot

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

@klshuster klshuster self-requested a review February 15, 2022 21:03
agent.add_argument(
    "--model_name",
    type=str,
    default=None,
Contributor


Can we provide a default here that would fall back to the existing behavior?

Contributor Author

@cahya-wirawan cahya-wirawan Feb 16, 2022


Actually, that is already the case. If the "--model_name" argument is not passed at all, the agent falls back to the old behavior, because the default value of this argument is None.
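
The fallback described in this comment can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this PR: the helper names `resolve_model_key` and `build_parser`, the `--gpt2-size` flag as used here, and the size-to-identifier mapping are assumptions made for the example.

```python
import argparse
from typing import Optional

# Assumed mapping from the size names listed in the patch description
# to Hugging Face model identifiers; illustrative only.
SIZE_TO_HF_ID = {
    "small": "gpt2",
    "medium": "gpt2-medium",
    "large": "gpt2-large",
    "xl": "gpt2-xl",
    "distilgpt2": "distilgpt2",
}


def resolve_model_key(model_name: Optional[str], gpt2_size: str = "small") -> str:
    """Hypothetical helper: choose which Hugging Face model identifier to load."""
    if model_name:
        # An explicit model name takes precedence over the size option.
        return model_name
    # model_name is None (the default): keep the old size-based behavior.
    return SIZE_TO_HF_ID[gpt2_size]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser that mirrors the argument discussed in this thread."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--model_name", type=str, default=None)
    parser.add_argument(
        "--gpt2-size", type=str, default="small", choices=sorted(SIZE_TO_HF_ID)
    )
    return parser


parser = build_parser()

# No --model_name given: falls back to the size-based default.
args = parser.parse_args([])
print(resolve_model_key(args.model_name, args.gpt2_size))  # gpt2

# --model_name given: it overrides the size option.
args = parser.parse_args(["--model_name", "indonesian-nlp/gpt2"])
print(resolve_model_key(args.model_name, args.gpt2_size))  # indonesian-nlp/gpt2
```

Using None as the default keeps the new option opt-in: existing commands that never mention the argument resolve to the same model as before.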

Contributor

@stephenroller stephenroller left a comment


Cool! Seems like a nice extension. Flagging @jxmsML to also review, please.

parlai/agents/hugging_face/gpt2.py (outdated review thread, resolved)
Co-authored-by: Stephen Roller <roller@fb.com>
@cahya-wirawan
Contributor Author

cahya-wirawan commented Feb 17, 2022

Hi @stephenroller, sorry, I am not yet familiar with the PR procedure of this repo. I don't understand the error in the ci/circleci: unittests_38 check. Should I create a unit test for the additional argument? If yes, where should I put it? Thanks.

@stephenroller
Contributor

Our CircleCI is broken right now. I think this PR is fine.

@stephenroller
Contributor

Last thing: can you run "black" on the code?

@cahya-wirawan
Contributor Author

I did it before, but here again:
% black parlai/agents/hugging_face
All done! ✨ 🍰 ✨
6 files left unchanged.

@stephenroller stephenroller merged commit 31e049d into facebookresearch:main Feb 17, 2022
@stephenroller
Contributor

Thanks!
