The tables below list the codebase used to obtain each evaluation result in the InternVL 2.5 technical report.

Benchmark Name | Codebase |
---|---|
MMMU | VLMEvalKit |
MMMU-Pro | This Codebase |
MathVista | VLMEvalKit |
MATH-Vision | VLMEvalKit |
MathVerse | VLMEvalKit |
OlympiadBench | VLMEvalKit |

Benchmark Name | Codebase |
---|---|
AI2D with mask | This Codebase |
AI2D without mask | VLMEvalKit |
ChartQA | This Codebase |
DocVQA | This Codebase |
InfoVQA | This Codebase |
OCRBench | VLMEvalKit |
SEED-2-Plus | VLMEvalKit |
CharXiv | CharXiv |
VCR | VLMEvalKit |

Benchmark Name | Codebase |
---|---|
BLINK | VLMEvalKit |
Mantis Eval | This Codebase |
MMIU | This Codebase |
MuirBench | VLMEvalKit |
MMT-Bench | VLMEvalKit |
MIRB | This Codebase |

Benchmark Name | Codebase |
---|---|
RealWorldQA | VLMEvalKit |
MME-RealWorld | VLMEvalKit |
WildVision | VLMEvalKit |
R-Bench | VLMEvalKit |

Benchmark Name | Codebase |
---|---|
MME | This Codebase |
MMBench | This Codebase |
MMBench v1.1 | VLMEvalKit |
MMVet | VLMEvalKit |
MMVet v2 | This Codebase |
MMStar | VLMEvalKit |

Benchmark Name | Codebase |
---|---|
HallBench | VLMEvalKit |
MMHal-Bench | This Codebase |
CRPE | VLMEvalKit |
POPE | This Codebase |

Benchmark Name | Codebase |
---|---|
RefCOCO | This Codebase |
RefCOCO+ | This Codebase |
RefCOCOg | This Codebase |

Benchmark Name | Codebase |
---|---|
MMMB | VLMEvalKit |
Multilingual MMBench | VLMEvalKit |
MTVQA | VLMEvalKit |

Benchmark Name | Codebase |
---|---|
Video-MME | VLMEvalKit |
MVBench | This Codebase |
MMBench-Video | VLMEvalKit |
MLVU | VLMEvalKit |
LongVideoBench | VLMEvalKit |
CG-Bench | Provided by authors |