[model] support qwen2audio embedding input #23625
Signed-off-by: Yuekai Zhang <zhangyuekai@foxmail.com>
Code Review
This pull request adds support for providing pre-computed audio embeddings as input to Qwen2-Audio models. The changes correctly introduce new data structures and parsing logic to handle embeddings alongside raw audio features. However, I've identified two critical issues in the implementation. The first is a KeyError that will occur when accessing audio embeddings due to using an incorrect variable. The second is a logical error in how lists of audio embeddings are processed, where they are incorrectly concatenated, losing the essential separation between embeddings from different audio sources. I have provided code suggestions to fix both of these critical bugs.
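The second issue in the review, concatenating a list of per-audio embeddings and losing the boundaries between them, can be illustrated with a minimal sketch. This is not the code from this PR or from vLLM: the helper names `concat_clips` and `keep_clips_separate` are hypothetical, and plain nested lists stand in for tensors of shape `[num_frames, hidden_size]`.

```python
# Hypothetical illustration only; not the vLLM implementation.
from typing import List

Clip = List[List[float]]  # one audio clip: [num_frames][hidden_size]


def concat_clips(clips: List[Clip]) -> Clip:
    """Flatten all clips into one frame sequence (per-clip boundaries are lost)."""
    return [frame for clip in clips for frame in clip]


def keep_clips_separate(clips: List[Clip]) -> List[Clip]:
    """Check that every frame has the same hidden size, then keep one entry per clip."""
    hidden_sizes = {len(frame) for clip in clips for frame in clip}
    if len(hidden_sizes) > 1:
        raise ValueError(f"inconsistent hidden sizes: {sorted(hidden_sizes)}")
    return [list(clip) for clip in clips]


clip_a: Clip = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # 3 frames
clip_b: Clip = [[0.7, 0.8]]                           # 1 frame

flat = concat_clips([clip_a, clip_b])
separate = keep_clips_separate([clip_a, clip_b])

# After flattening, nothing records where clip_a ends and clip_b begins.
# Kept separate, each clip's frame count can still be matched to its own
# audio placeholder in the prompt.
frame_counts = [len(clip) for clip in separate]
```

With the flattened form, a consumer sees only four frames in total; with the separated form, it can still tell that the first audio contributes three frames and the second contributes one.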
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Yuekai Zhang <zhangyuekai@foxmail.com>
DarkLight1337 left a comment
Thanks, LGTM
PTAL at the failing test.
Head branch was pushed to by a user without write access
Signed-off-by: Yuekai Zhang <zhangyuekai@foxmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: tc-mb <caitianchi@modelbest.cn>
Signed-off-by: Yuekai Zhang <zhangyuekai@foxmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Yuekai Zhang <zhangyuekai@foxmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Xiao Yu <xiao.yu@amd.com>
This PR adds support for Qwen2-Audio models to accept audio_embeddings as input.