-
-
Notifications
You must be signed in to change notification settings - Fork 11.2k
[Model] Use merge_by_field_config for MM models (O-P) #26776
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Model] Use merge_by_field_config for MM models (O-P) #26776
Conversation
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request refactors several multi-modal models (Phi3V, Phi4Multimodal, Phi4MM) to use the merge_by_field_config = True flag. This is a good simplification, as it centralizes input batching logic and allows for the removal of redundant input flattening code within each model.
However, I've identified a critical regression in the Phi3VForCausalLM model. The refactoring removed the logic for handling pre-computed image embeddings (image_embeds) in _process_image_input, but the code path to accept them still exists. This will result in a KeyError when a user provides pre-computed embeddings. I've provided a suggested fix to restore this functionality while being compatible with the new input handling mechanism.
Edit: I see the problem now. Fixing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
…6776) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: Dhruvil Bhatt <bhattdbh@amazon.com>
…6776) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: bbartels <benjamin@bartels.dev>
…6776) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
…6776) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
…6776) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: xuebwang-amd <xuebwang@amd.com>
…6776) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: xuebwang-amd <xuebwang@amd.com>
…6776) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
…6776) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
…6776) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Purpose
Part of #26149
Test Plan
I have run the example scripts on each model.
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.