StackLlama: fixed RL training and added args #400
Conversation
- added steps argument and break to respect max training epochs
- added more PPOConfig args to script args
- removed llama tokenizer hacks
- removed extra args in dataset
- changed to LlamaTokenizer from AutoTokenizer
- black + isort
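For context, a minimal sketch of what exposing these options as script arguments and bounding the training loop might look like. The field names, defaults, and the placeholder dataloader below are illustrative assumptions, not the PR's exact code:

```python
from dataclasses import dataclass, field
from transformers import HfArgumentParser

# Illustrative sketch only: field names and defaults are assumptions,
# not the PR's actual ScriptArguments.
@dataclass
class ScriptArguments:
    steps: int = field(default=20_000, metadata={"help": "max number of PPO steps"})
    learning_rate: float = field(default=1.4e-5, metadata={"help": "PPO learning rate"})
    batch_size: int = field(default=32, metadata={"help": "PPO batch size"})

script_args = HfArgumentParser(ScriptArguments).parse_args_into_dataclasses()[0]

dataloader = [{"query": "..."} for _ in range(100)]  # stand-in for the PPO dataloader

# The "break to respect max training epochs": stop once the step budget is hit.
for step, batch in enumerate(dataloader):
    if step >= script_args.steps:
        break
    # ... one PPO optimization step on `batch` would go here ...
```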
The documentation is not available anymore as the PR was closed or merged.
Thank you so much for your contribution! Could you just run the styling checks? After that we should be good for merging:
make style && make quality
Fixed style and quality and switched back to AutoTokenizer, thanks for the tip @ArthurZucker
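(For reference, a quick way to check what AutoTokenizer resolves to for a LLaMA checkpoint; the checkpoint name is illustrative:)

```python
from transformers import AutoTokenizer

# Illustrative checkpoint; any LLaMA checkpoint resolves the same way.
tok = AutoTokenizer.from_pretrained("huggyllama/llama-7b")
print(type(tok).__name__)  # LlamaTokenizerFast: AutoTokenizer prefers the fast tokenizer
```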
Thanks so much!
- added steps argument and break to respect max training epochs
- added more PPOConfig args to script args
- removed llama tokenizer hacks
- black + isort
- switched to LlamaTokenizer from AutoTokenizer; added return_token_type_ids=False to the pipeline kwargs, because LlamaTokenizerFast will output token_type_ids (see "LLaMATokenizerFast works abnormally" transformers#23818 and "🚨🚨 🚨🚨 [Tokenizer] attemp to fix add_token issues🚨🚨 🚨🚨" transformers#23909); token_type_ids cause an error in our reward model pipeline, namely TypeError: LlamaForSequenceClassification.forward() got an unexpected keyword argument 'token_type_ids'
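A minimal sketch of the workaround described above, assuming the reward model is called through a text-classification pipeline; the checkpoint path is a placeholder:

```python
from transformers import pipeline

# Placeholder path: substitute the actual LLaMA-based reward model checkpoint.
reward_pipe = pipeline("sentiment-analysis", model="path/to/llama-reward-model")

sent_kwargs = {
    "function_to_apply": "none",
    "batch_size": 16,
    # LlamaTokenizerFast emits token_type_ids, but
    # LlamaForSequenceClassification.forward() does not accept them,
    # so tell the tokenizer not to return them.
    "return_token_type_ids": False,
}

scores = reward_pipe(["sample response to score"], **sent_kwargs)
```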