add av_odyssey bench #461

Bohao-Lee · 2024-12-15T08:49:16Z

We introduce the av_odyssey benchmark, designed to evaluate a model's audio-visual capabilities.

Currently, we provide an implementation based on the Gemini model. The evaluation can be run with the following command:

accelerate launch --num_processes 8 --main_process_port 12345 -m lmms_eval --model gemini_api --tasks av_odyssey --batch_size 1 --log_samples_suffix av_odyssey --output_path ./logs/
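For anyone who wants to script this invocation, here is a minimal sketch that assembles the same command as an argument list. The helper name `build_av_odyssey_command` and its parameters are hypothetical and not part of lmms_eval; every flag and default value is copied from the command above. The helper only formats the command and does not run it.

```python
import shlex


def build_av_odyssey_command(num_processes=8, main_process_port=12345, output_path="./logs/"):
    """Return the av_odyssey evaluation command as an argument list.

    Hypothetical helper: the flags and defaults mirror the command in this PR
    description; nothing here is an lmms_eval API.
    """
    return [
        "accelerate", "launch",
        "--num_processes", str(num_processes),
        "--main_process_port", str(main_process_port),
        "-m", "lmms_eval",
        "--model", "gemini_api",
        "--tasks", "av_odyssey",
        "--batch_size", "1",
        "--log_samples_suffix", "av_odyssey",
        "--output_path", output_path,
    ]


if __name__ == "__main__":
    # Print a shell-ready string; pass the list to subprocess.run to execute it.
    print(shlex.join(build_av_odyssey_command()))
```

Keeping the command as a list avoids shell-quoting problems if the output path ever contains spaces.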

To support this evaluation, we modified the input format in the gemini_api.py file located in the models directory.

add av_odyssey bench

8bd9124

pufanyi self-requested a review December 15, 2024 08:51

pufanyi and others added 3 commits December 18, 2024 00:15

av_odyssey

6e4c7e9

lint

dddd219

Merge pull request #2 from pufanyi/feature/av_odyssey

78c4186

Modified the input format

pufanyi approved these changes Dec 22, 2024

pufanyi merged commit db60d2b into EvolvingLMMs-Lab:main Dec 22, 2024
1 check passed
