Plz fix run_vila.py line 65 output variable(s) #127

ziyaosg · 2024-08-25T21:26:54Z

Hi!

Directly running

python -W ignore llava/eval/run_vila.py \
    --model-path Efficient-Large-Model/VILA1.5-40b \
    --conv-mode hermes-2 \
    --query "<video>\n Please describe this video." \
    --video-file "demo.mp4"

raises ValueError: too many values to unpack (expected 2) at line 65 of run_vila.py:

images, num_frames = opencv_extract_frames(video_file, args.num_video_frames)

On closer inspection, I realized that this function returns only the images, not the number of frames. Changing the line to

images = opencv_extract_frames(video_file, args.num_video_frames)

fixed the issue on my end.
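For anyone hitting the same error, here is a minimal sketch of why the unpacking fails. It does not use the real VILA code: `extract_frames_stub` is a hypothetical stand-in that I am assuming behaves like the function in my checkout, i.e. it returns only a list of frames. If the frame count is still needed downstream, it can be recovered with `len(images)`.

```python
# Hypothetical stand-in for opencv_extract_frames (NOT the real VILA function).
# Assumption: like the version in my checkout, it returns only a list of frames.
def extract_frames_stub(video_file, num_frames):
    return [f"frame_{i}" for i in range(num_frames)]


def unpack_error_message(num_frames=8):
    """Reproduce the failing pattern and return the error text, if any."""
    try:
        # Two-target unpacking fails when the list holds more than two frames.
        images, count = extract_frames_stub("demo.mp4", num_frames)
    except ValueError as exc:
        return str(exc)
    return None


# Working pattern: take the single return value and derive the count from it.
images = extract_frames_stub("demo.mp4", 8)
num_frames = len(images)

print(unpack_error_message())
print(num_frames)
```

Note that with exactly two frames the original line would unpack without an error and silently bind the two frames to `images` and `num_frames`, so the failure depends on the number of frames requested.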

Please help check if this is the expected behavior and fix the output if necessary. Thank you!


Lyken17 · 2025-02-25T09:38:23Z

We now recommend using vila-infer to run inference:

vila-infer \
    --model-path Efficient-Large-Model/NVILA-15B \
    --conv-mode auto \
    --text "Please describe the video" \
    --media https://huggingface.co/datasets/Efficient-Large-Model/VILA-inference-demos/resolve/main/OAI-sora-tokyo-walk.mp4

gheinrich pushed a commit to gheinrich/VILA that referenced this issue Dec 16, 2024

fix dataset name (NVlabs#127)

e7677f4

Lyken17 closed this as completed Feb 25, 2025
