[Benchmark] Support MATH-Vision #292

Merged
merged 6 commits into from
Jul 19, 2024


scikkk
Contributor

scikkk commented Jul 18, 2024

Measuring Multimodal Mathematical Reasoning with the MATH-Vision🔥 Dataset

MathQA · Mathematical Reasoning · Multimodal Reasoning

ChatGPT · GPT-4 · GPT-4V · Gemini

[🌐 Homepage] [🤗 Huggingface Dataset] [📊 Leaderboard] [🔍 Visualization] [📖 ArXiv Paper]

👀 Introduction

Recent advancements in Large Multimodal Models (LMMs) have shown promising results in mathematical reasoning within visual contexts, with models approaching human-level performance on existing benchmarks such as MathVista. However, we observe significant limitations in the diversity of questions and breadth of subjects covered by these benchmarks. To address this issue, we present the MATH-Vision (MATH-V) dataset, a meticulously curated collection of 3,040 high-quality mathematical problems with visual contexts sourced from real math competitions. Spanning 16 distinct mathematical disciplines and graded across 5 levels of difficulty, our dataset provides a comprehensive and diverse set of challenges for evaluating the mathematical reasoning abilities of LMMs.

📈 Evaluation

# MATH-Vision
torchrun --nproc-per-node=1 run.py --data MATH_V --model your_model --verbose

# MATH-Vision testmini
torchrun --nproc-per-node=1 run.py --data MATH_V_MINI --model your_model --verbose
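
For running both splits from a script, the two commands above can be assembled programmatically. The following is a minimal sketch, not part of this PR: `build_eval_command` is a hypothetical helper, and it uses only the flags shown above (`--nproc-per-node`, `--data`, `--model`, `--verbose`). It only builds and prints the command lines; it does not launch anything.

```python
import shlex


def build_eval_command(data, model, nproc=1):
    """Return the torchrun argument list for one benchmark run.

    Hypothetical helper: it mirrors the commands in this PR description
    and assumes run.py is in the current working directory.
    """
    return [
        "torchrun",
        f"--nproc-per-node={nproc}",
        "run.py",
        "--data", data,
        "--model", model,
        "--verbose",
    ]


if __name__ == "__main__":
    # Dataset names are taken from the PR description; "your_model" is a placeholder.
    for split in ("MATH_V", "MATH_V_MINI"):
        print(shlex.join(build_eval_command(split, "your_model")))
```

Returning an argument list rather than a single string means the result can be passed directly to `subprocess.run(...)` without shell quoting issues, if actually launching the runs is desired.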

kennymckormick merged commit 24f7def into open-compass:main Jul 19, 2024
1 check passed
shan23chen pushed a commit to shan23chen/VLMEvalKit that referenced this pull request Oct 3, 2024
* [Benchmark] Support MATH-Vision

* update url

* Fix download_file

* update MATH_V md5

* fix MathVision

* fix lint

---------

Co-authored-by: Ke Wang <wangk.gm@gmail.com>
Co-authored-by: kennymckormick <dhd@pku.edu.cn>
3 participants