-
Notifications
You must be signed in to change notification settings - Fork 736
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add multimodal (audio, image and video) analysis toolkits #1496
base: master
Are you sure you want to change the base?
Conversation
video_toolkit_test.py
Outdated
with exactly the name of the species without any additional text." | ||
|
||
answer = test_toolkit.ask_question_about_video(video_path, question) | ||
print(f"\nThe final answer is: {answer}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will the current toolkit pass these unit tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Mengkang, I ran the test a few more times and discovered that although it could get the correct number for the first question, it is not returning in the right format hence the answer is likely a hallucination. I am working on fixing this problem and will finish today. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed! Changed the vl model to Qwen VL Max again and now it can answer all three questions correctly. However, a major problem is that Qwen VL Max could only support passing 28 images simultaneously (from my experiment), which may cause significant information loss for long videos. I am adding a key_frame_extraction method to try to cope with this problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @harryeqs , some test and pre-commit check seems not passed, could you fix this? thanks!
Thanks @Wendong-Fan ! Sorry I am busy working on some feature on the Eigentbot at the moment, will fix the test and add unit tests as soon as possible. |
Description
Describe your changes in detail.
Motivation and Context
Why is this change required? What problem does it solve?
If it fixes an open issue, please link to the issue here.
You can use the syntax
close #15213
if this solves the issue #15213Types of changes
What types of changes does your code introduce? Put an
x
in all the boxes that apply:Implemented Tasks
Checklist
Go over all the following points, and put an
x
in all the boxes that apply.If you are unsure about any of these, don't hesitate to ask. We are here to help!