Make qwen2 attention_qkv_bias optional #32893
base: main
Conversation
- compatibility w/ llama model
I have no idea why the check_repo test failed with the error messages above. Can you please give me some guidance?
Hey! We usually do this when there is actually a model that is published and has this config option set / unset. attention_bias for llama, for example.
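For context, attention_bias is the real LlamaConfig flag being cited (added in #26302, default False). A quick sanity check of that pattern; the tiny dimensions below are arbitrary, chosen only so the model instantiates fast:

```python
from transformers import LlamaConfig, LlamaModel

# attention_bias is the LlamaConfig flag added in #26302 (default False);
# the small sizes here are arbitrary, just to keep instantiation cheap.
config = LlamaConfig(
    hidden_size=64,
    intermediate_size=128,
    num_hidden_layers=2,
    num_attention_heads=4,
    attention_bias=True,
)
model = LlamaModel(config)

# With attention_bias=True the q/k/v/o projections gain bias terms.
assert model.layers[0].self_attn.q_proj.bias is not None
```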
@ArthurZucker Thank you for your response!
@ArthurZucker Hey, do you have any further comments on this?
Hey! The reason we added that param for llama is for the release of #26302: InternLM was officially using this flag, meaning there is a model related to this change!
What does this PR do?
The Qwen2 and Qwen2-MoE models are forced to add a bias to the query, key, and value linear projections. However, following the trend of other recent models (e.g. llama), I refactored this attention_qkv_bias to be optional so that it can be configured in the config file.

Fixes #32892
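Roughly, the change follows the llama pattern: the config gains a flag (attention_qkv_bias, per this PR) that defaults to True so existing Qwen2 checkpoints keep their behavior, and the attention module reads it instead of hardcoding bias=True. A minimal sketch of the idea, not the exact diff; the real Qwen2Config and Qwen2Attention carry many more fields (grouped-query attention, rotary embeddings, caching, etc.):

```python
import torch.nn as nn


class Qwen2Config:
    # Sketch: only the fields relevant to this PR; the real config has many more.
    def __init__(self, hidden_size=3584, num_attention_heads=28, attention_qkv_bias=True):
        self.hidden_size = hidden_size
        self.num_attention_heads = num_attention_heads
        # Defaults to True so existing Qwen2 checkpoints keep their qkv bias.
        self.attention_qkv_bias = attention_qkv_bias


class Qwen2Attention(nn.Module):
    # Sketch: bias on q/k/v is read from the config instead of being hardcoded.
    def __init__(self, config):
        super().__init__()
        head_dim = config.hidden_size // config.num_attention_heads
        qkv_dim = config.num_attention_heads * head_dim
        self.q_proj = nn.Linear(config.hidden_size, qkv_dim, bias=config.attention_qkv_bias)
        self.k_proj = nn.Linear(config.hidden_size, qkv_dim, bias=config.attention_qkv_bias)
        self.v_proj = nn.Linear(config.hidden_size, qkv_dim, bias=config.attention_qkv_bias)
        # The output projection stays bias-free, matching the current implementation.
        self.o_proj = nn.Linear(qkv_dim, config.hidden_size, bias=False)


# A llama-style model can now drop the qkv bias via the config:
config = Qwen2Config(attention_qkv_bias=False)
attn = Qwen2Attention(config)
assert attn.q_proj.bias is None
```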
Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
@ArthurZucker @stevhliu