[Misc] Modify CacheConfig import #23459
Conversation
Code Review
This pull request correctly resolves an ImportError for CacheConfig in vllm/attention/layers/encoder_only_attention.py. The change updates the import path from transformers to vllm.config, which is the correct source for this class as used by the Attention superclass. The fix is accurate and necessary to resolve the described runtime error.
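For reference, a minimal sketch of the change, with the surrounding code in encoder_only_attention.py simplified away:

```python
# Illustrative sketch only; the real vllm/attention/layers/encoder_only_attention.py
# contains additional imports and logic.

# Before -- fails with ImportError on recent transformers main:
# from transformers import CacheConfig

# After -- CacheConfig is vLLM's own configuration class, which is also the
# type expected by the Attention superclass's `cache_config` parameter:
from vllm.config import CacheConfig
```

Since the Attention base class annotates `cache_config` with vLLM's own CacheConfig, importing it from vllm.config keeps the subclass consistent with its superclass.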
Sorry for that... But it's quite strange that the type checker failed to catch this problem.
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com> Signed-off-by: Xiao Yu <xiao.yu@amd.com>
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com> Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>
Purpose
When using the latest main branch of transformers, the following import error is encountered. In addition, based on the parameters of Attention, CacheConfig should come from vllm.config.
Test Plan
Test Result
(Optional) Documentation Update
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.