Add support for any GPT-2 model hosted in Huggingface #4360
agent.add_argument(
    "--model_name",
    type=str,
    default=None,
Can we provide a default here that would fall back to the existing behavior?
Actually, that is already the case. If the "--model_name" argument is not passed at all, the agent falls back to the old behavior, because the default value for this argument is None.
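The fallback described above can be sketched roughly as follows. This is a minimal, standalone illustration and not the code from this PR: the `--gpt2-size` option, its choices, and the helper `resolve_model_key` are assumptions made for the example, and plain `argparse` stands in for the agent's own argument handling.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the new option next to an assumed existing size option."""
    parser = argparse.ArgumentParser()
    # Assumed pre-existing option that selects one of the built-in English models.
    parser.add_argument(
        "--gpt2-size",
        type=str,
        default="small",
        choices=["small", "medium", "large", "xl", "distilgpt2"],
    )
    # New option from this PR: default None means "not specified".
    parser.add_argument("--model_name", type=str, default=None)
    return parser


def resolve_model_key(opt: dict) -> str:
    """Hypothetical helper: choose the Hugging Face model identifier to load."""
    if opt.get("model_name"):
        # An explicit model name takes priority over the size option.
        return opt["model_name"]
    # Old behavior: map the size option to a stock GPT-2 checkpoint name.
    size = opt["gpt2_size"]
    if size == "small":
        return "gpt2"
    if size == "distilgpt2":
        return "distilgpt2"
    return f"gpt2-{size}"


if __name__ == "__main__":
    default_opt = vars(build_parser().parse_args([]))
    print(resolve_model_key(default_opt))
    custom_opt = vars(build_parser().parse_args(["--model_name", "indonesian-nlp/gpt2"]))
    print(resolve_model_key(custom_opt))
```

Because `None` is falsy, leaving out `--model_name` takes the size-based branch unchanged, which is why no other default is needed to preserve the existing behavior.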
Cool! Seems like a nice extension. Flagging for @jxmsML to also review, please.
Co-authored-by: Stephen Roller <roller@fb.com>
Hi @stephenroller, sorry, I am not yet familiar with the PR procedure of this repo. I don't understand the error in ci/circleci: unittests_38. Should I create a unit test for the additional argument? If so, where should I put it? Thanks.
Our CircleCI is broken right now. I think this PR is fine.
Last thing: can you run "black" on the code?
I did it before, but here again:
Thanks!
Patch description
As mentioned in #4357, the current Huggingface agent supports only a few English GPT-2 models (small, medium, large, xl and distilgpt2). This patch therefore adds the ability to specify an arbitrary GPT-2 model hosted on Huggingface, such as anonymous-german-nlp/german-gpt2 or indonesian-nlp/gpt2.
Testing steps
The following is an example of fine-tuning the indonesian-nlp/gpt2 model on the convai2 dataset: