[Bugfix] Fix mrope in Transformers Backend #26087
Code Review
This pull request fixes an mrope issue in the Transformers backend by correctly configuring image_grid_thw and video_grid_thw as batched fields. It also includes several refactorings and cleanups, such as using a public API for setting attention implementation and removing unused code. The changes are generally good, but I've identified a critical issue where unguarded dictionary access could lead to a KeyError when processing text-only inputs in a multimodal model. I've provided a suggestion to make the code more robust.
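The `KeyError` concern above can be sketched in plain Python. This is only an illustration, not the code from this PR: the helper name `collect_grid_fields` and the dictionary layout are hypothetical, and the only assumption is that a text-only input produces processor output without the grid keys.

```python
def collect_grid_fields(processed: dict) -> dict:
    """Return only the grid fields that are actually present.

    Hypothetical helper: a text-only input to a multimodal model has no
    image or video entries, so direct indexing such as
    processed["image_grid_thw"] would raise KeyError.
    """
    grid_keys = ("image_grid_thw", "video_grid_thw")
    return {key: processed[key] for key in grid_keys if key in processed}


# Assumed example inputs (not taken from the PR).
text_only = {"input_ids": [[1, 2, 3]]}
with_image = {"input_ids": [[1, 2, 3]], "image_grid_thw": [[1, 16, 16]]}

print(collect_grid_fields(text_only))
print(collect_grid_fields(with_image))
```

Checking membership before indexing keeps the text-only path working, while inputs that do carry images or videos behave exactly as before.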
Signed-off-by: raushan <raushan@huggingface.co>
This pull request has merge conflicts that must be resolved before it can be
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
hmellor
left a comment
LGTM! Let's see what CI thinks
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: raushan <raushan@huggingface.co> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: Karan Goel <3261985+karan@users.noreply.github.com>
Signed-off-by: raushan <raushan@huggingface.co> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: raushan <raushan@huggingface.co> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: xuebwang-amd <xuebwang@amd.com>
Fixes the recently skipped test by creating `image_grid_thw` as batched fields. Otherwise they get an extra dimension and fail when preparing mrope positions in vLLM's model runners.

Also, adds/rewrites some comments in processor code to keep them up to date. I tried to follow the base class `apply()` method, but transformers unfortunately cannot split the logic for `placeholder` and `the rest` into two. Doing so would force us to call transformers utilities twice, which is not very fast. So I am keeping it as is and simply adding comments.

cc @hmellor as we have been talking about it internally
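To make the "extra dimension" from the description concrete, here is a small pure-Python sketch. It does not use vLLM or transformers; the helpers `shape` and `num_patches` and the example grid values are assumptions made for illustration. It shows how a per-item `(t, h, w)` grid that is wrapped once too often no longer has the layout a position computation would expect.

```python
def shape(nested):
    """Return the shape of a regular nested list as a tuple."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0] if nested else None
    return tuple(dims)


def num_patches(grid_thw):
    """Compute t * h * w per item, expecting one (t, h, w) row per item."""
    return [t * h * w for t, h, w in grid_thw]


# Two hypothetical images, each described by (t, h, w).
batched = [[1, 16, 16], [1, 8, 12]]          # one row per item
wrapped_again = [[row] for row in batched]   # each row gains an extra axis

print(shape(batched))        # (2, 3)
print(shape(wrapped_again))  # (2, 1, 3)
print(num_patches(batched))  # [256, 96]

try:
    num_patches(wrapped_again)
except ValueError as exc:
    # Unpacking a length-1 row into (t, h, w) fails.
    print("fails with extra dimension:", exc)
```

In this sketch the batched layout has one row per item, whereas the wrapped layout breaks any consumer that unpacks `(t, h, w)` per row, which is analogous to the failure described for the mrope position preparation.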