Hi!
Directly running
python -W ignore llava/eval/run_vila.py \
    --model-path Efficient-Large-Model/VILA1.5-40b \
    --conv-mode hermes-2 \
    --query "<video>\n Please describe this video." \
    --video-file "demo.mp4"
gives the following error at line 65 of run_vila.py:
ValueError: too many values to unpack (expected 2)
images, num_frames = opencv_extract_frames(video_file, args.num_video_frames)
After closer inspection, I realized that this function returns only the images, not the number of frames. Changing this line to
images = opencv_extract_frames(video_file, args.num_video_frames)
fixed the issue on my end.
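For anyone hitting the same error, here is a minimal sketch of why the unpacking fails and one way to tolerate both return shapes. Note that `extract_frames_stub` and `unpack_frames` are hypothetical names made up for illustration; the stub only imitates a function that returns a list of frames and is not the real `opencv_extract_frames`.

```python
def extract_frames_stub(video_file, num_frames):
    """Hypothetical stand-in that returns only a list of frames."""
    return [f"frame_{i}" for i in range(num_frames)]


def unpack_frames(result):
    """Accept either `images` or `(images, num_frames)` and normalize it."""
    if isinstance(result, tuple) and len(result) == 2 and isinstance(result[1], int):
        images, num_frames = result
    else:
        # Only the frames came back, so derive the count from the list.
        images = result
        num_frames = len(images)
    return list(images), num_frames


# Unpacking a list of 8 frames into two names reproduces the ValueError.
error_message = None
try:
    images, num_frames = extract_frames_stub("demo.mp4", 8)
except ValueError as exc:
    error_message = str(exc)

# The tolerant helper works whether or not a frame count is returned.
images, num_frames = unpack_frames(extract_frames_stub("demo.mp4", 8))
```

If the rest of the script never uses `num_frames`, the one-line change above is simpler; the helper is only useful if the code has to work across versions with different return values.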
Could you check whether this is the expected behavior and fix the return value if necessary? Thank you!
fix dataset name (NVlabs#127)
e7677f4
We now recommend using vila-infer to run inference:
vila-infer \
    --model-path Efficient-Large-Model/NVILA-15B \
    --conv-mode auto \
    --text "Please describe the video" \
    --media https://huggingface.co/datasets/Efficient-Large-Model/VILA-inference-demos/resolve/main/OAI-sora-tokyo-walk.mp4