README for Evaluation

This README lists the codebase used to obtain each evaluation result reported in the InternVL 2.5 technical report.

Multimodal Reasoning and Mathematics

| Benchmark Name | Codebase |
| --- | --- |
| MMMU | VLMEvalKit |
| MMMU-Pro | This Codebase |
| MathVista | VLMEvalKit |
| MATH-Vision | VLMEvalKit |
| MathVerse | VLMEvalKit |
| OlympiadBench | VLMEvalKit |

OCR, Chart, and Document Understanding

| Benchmark Name | Codebase |
| --- | --- |
| AI2D with mask | This Codebase |
| AI2D without mask | VLMEvalKit |
| ChartQA | This Codebase |
| DocVQA | This Codebase |
| InfoVQA | This Codebase |
| OCRBench | VLMEvalKit |
| SEED-2-Plus | VLMEvalKit |
| CharXiv | CharXiv |
| VCR | VLMEvalKit |

Multi-Image Understanding

| Benchmark Name | Codebase |
| --- | --- |
| BLINK | VLMEvalKit |
| Mantis Eval | This Codebase |
| MMIU | This Codebase |
| MuirBench | VLMEvalKit |
| MMT-Bench | VLMEvalKit |
| MIRB | This Codebase |

Real-World Comprehension

| Benchmark Name | Codebase |
| --- | --- |
| RealWorldQA | VLMEvalKit |
| MME-RealWorld | VLMEvalKit |
| WildVision | VLMEvalKit |
| R-Bench | VLMEvalKit |

Comprehensive Multimodal Evaluation

| Benchmark Name | Codebase |
| --- | --- |
| MME | This Codebase |
| MMBench | This Codebase |
| MMBench v1.1 | VLMEvalKit |
| MMVet | VLMEvalKit |
| MMVet v2 | This Codebase |
| MMStar | VLMEvalKit |

Multimodal Hallucination Evaluation

| Benchmark Name | Codebase |
| --- | --- |
| HallBench | VLMEvalKit |
| MMHal-Bench | This Codebase |
| CRPE | VLMEvalKit |
| POPE | This Codebase |

Visual Grounding

| Benchmark Name | Codebase |
| --- | --- |
| RefCOCO | This Codebase |
| RefCOCO+ | This Codebase |
| RefCOCOg | This Codebase |

Multimodal Multilingual Understanding

| Benchmark Name | Codebase |
| --- | --- |
| MMMB | VLMEvalKit |
| Multilingual MMBench | VLMEvalKit |
| MTVQA | VLMEvalKit |

Video Understanding

| Benchmark Name | Codebase |
| --- | --- |
| Video-MME | VLMEvalKit |
| MVBench | This Codebase |
| MMBench-Video | VLMEvalKit |
| MLVU | VLMEvalKit |
| LongVideoBench | VLMEvalKit |
| CG-Bench | provided by authors |