Commit f2435ae: update hf datasets info

xyang0 committed Jul 11, 2024 (1 parent: c0a92b8)

Showing 5 changed files with 52 additions and 61 deletions.

README.md: 112 changes (52 additions, 60 deletions)

# MMC: Advancing Multimodal Chart Understanding with LLM Instruction Tuning [[NAACL 2024]](https://2024.naacl.org)
<div align="center">
<a href="https://arxiv.org/abs/2311.10774"><img src="https://img.shields.io/badge/Paper-arXiv-red" alt="arXiv"></a>
<a href="https://huggingface.co/datasets/xywang1/MMC"><img src="https://img.shields.io/badge/Dataset-%F0%9F%A4%97%20Hugging_Face-yellow" alt="Hugging Face"></a>
<a href="https://aclanthology.org/2024.naacl-long.70"><img src="https://img.shields.io/badge/NAACL-2024-blue" alt="NAACL 2024"></a>
</div>

This is the official GitHub repo of the paper [MMC: Advancing Multimodal Chart Understanding with Large-scale Instruction Tuning](https://arxiv.org/abs/2311.10774).

## News

- [Jul. 9, 2024] 🔥🔥🔥 Our dataset is now released through [Hugging Face Datasets](https://huggingface.co/datasets/xywang1/MMC).
- [Mar. 13, 2024] Our paper is accepted to [NAACL 2024](https://aclanthology.org/2024.naacl-long.70).
- [Nov. 15, 2023] Our paper is available on [arXiv](https://arxiv.org/abs/2311.10774).

## Highlights

- We introduce a large-scale MultiModal Chart Instruction (**MMC-Instruction**) dataset supporting diverse tasks and chart types.
- Leveraging this data, we develop Multi-Modal Chart Assistant (**MMCA**), an LMM that achieves state-of-the-art performance on existing chart QA benchmarks.
- We also propose a Multi-Modal Chart Benchmark (**MMC-Benchmark**), a comprehensive human-annotated benchmark with nine distinct tasks evaluating reasoning capabilities over charts. Extensive experiments on MMC-Benchmark reveal the limitations of existing LMMs in correctly interpreting charts, even for the most recent GPT-4V model.

<div align="center">
<img src="./images/overview.png" width="90%">
</div>

## Data Release

The chart-text alignment data (MMC-Alignment), chart instruction-tuning data (MMC-Instruction), and benchmark data (MMC-Benchmark) introduced in our [paper](https://arxiv.org/abs/2311.10774) can be downloaded from [Hugging Face Datasets](https://huggingface.co/datasets/xywang1/MMC) using git clone:
```
git lfs install
git clone https://huggingface.co/datasets/xywang1/MMC
```
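
If a Python workflow is more convenient, the same dataset repo can also be fetched with the `huggingface_hub` client. This is an optional sketch, not part of the official instructions; it assumes `huggingface_hub` is installed and uses the repo id from the link above.
```
# Optional alternative to git clone (assumes: pip install huggingface_hub).
from huggingface_hub import snapshot_download

# Download the dataset repo and print its local path.
local_dir = snapshot_download(repo_id="xywang1/MMC", repo_type="dataset")
print(local_dir)  # should contain MMC-Alignment/, MMC-Benchmark/, MMC-Instruction/
```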

The downloaded dataset contains three sub-directories: MMC-Alignment, MMC-Benchmark, and MMC-Instruction.

### MMC-Alignment

- mmc_chart_text_alignment_arxiv_text.jsonl: 250,000 samples for chart-text alignment training.
- mmc_chart_text_alignment_arxiv_images.tar.gz: images for mmc_chart_text_alignment_arxiv_text.jsonl.
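
As a quick sanity check after downloading, the sketch below unpacks the image archive and reads the first alignment record. The paths assume the default `MMC` clone directory, and the JSONL field names are not documented here, so inspect the printed keys before relying on them.
```
# Minimal sketch for MMC-Alignment (paths and field names are assumptions).
import json
import tarfile

# Unpack the chart images that pair with the alignment JSONL.
with tarfile.open("MMC/MMC-Alignment/mmc_chart_text_alignment_arxiv_images.tar.gz") as tar:
    tar.extractall("mmc_alignment_images")

# Peek at the first of the 250,000 chart-text alignment records.
with open("MMC/MMC-Alignment/mmc_chart_text_alignment_arxiv_text.jsonl") as f:
    first = json.loads(next(f))
print(first.keys())
```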

### MMC-Benchmark

- mmc_benchmark_text.jsonl: 2,126 instances for testing and benchmarking.
- mmc_benchmark_images.tar.gz: images for mmc_benchmark_text.jsonl.
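
A hedged sketch of how the benchmark file could be scored with exact-match accuracy is shown below. The record fields (for example `answer`) and the matching rule are illustrative assumptions, not the paper's official evaluation protocol.
```
# Illustrative exact-match scoring over MMC-Benchmark (field names are assumptions).
import json

def evaluate(jsonl_path, predict):
    """predict(record) should return the model's answer string for one benchmark record."""
    total = correct = 0
    with open(jsonl_path) as f:
        for line in f:
            record = json.loads(line)
            pred = predict(record).strip().lower()
            gold = str(record.get("answer", "")).strip().lower()
            correct += int(pred == gold)
            total += 1
    return correct / max(total, 1)

# Dummy usage with a predictor that always answers "yes":
# print(evaluate("MMC/MMC-Benchmark/mmc_benchmark_text.jsonl", lambda r: "yes"))
```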

### MMC-Instruction

- mmc_instruction_arxiv_text.jsonl: 300,000 question-answer pairs synthesized with arXiv data for instruction tuning.
- mmc_instruction_arxiv_images.tar.gz: images for mmc_instruction_arxiv_text.jsonl.
- mmc_instruction_non-arxiv_text.jsonl: 110,020 extra question-answer pairs for instruction tuning.
- mmc_instruction_non-arxiv_images.tar.gz: images for mmc_instruction_non-arxiv_text.jsonl.
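
To mix the arXiv and non-arXiv splits for instruction tuning, a small sketch that concatenates the two JSONL files is given below; the file locations follow the listing above and the record schema is left untouched.
```
# Sketch: merge both MMC-Instruction splits into one training list (paths are assumptions).
import json

def read_jsonl(path):
    with open(path) as f:
        return [json.loads(line) for line in f]

splits = [
    "MMC/MMC-Instruction/mmc_instruction_arxiv_text.jsonl",      # ~300,000 QA pairs
    "MMC/MMC-Instruction/mmc_instruction_non-arxiv_text.jsonl",  # ~110,020 QA pairs
]
train_records = [rec for path in splits for rec in read_jsonl(path)]
print(len(train_records))
```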

## Existing Datasets

As mentioned in the [paper](https://arxiv.org/abs/2311.10774), chart summarization datasets from Statist, PlotQA, [VisText](https://github.com/mitvis/vistext), ChartInfo, and Unichart are used in our experiments for chart-text alignment training. Please refer to the following script for details:
```
# Existing chart-text alignment images
gdown https://drive.google.com/uc?id=1e1mx_nb5PWjPkuIsJkY8B4xSET9DOWTa
# Existing chart-text alignment text
gdown https://drive.google.com/uc?id=18SJ13V4qEt1ixOQPbRmEnZKQrjS5v14T
```

For existing Chart QA training data, please refer to the following script:
```
# Existing chart qa images
gdown https://drive.google.com/uc?id=1Y17wNYdBlPxhB5KKiux2BD8C2FlA5MC9
# Existing chart qa text
gdown https://drive.google.com/uc?id=1tUtntLRgsBJ9v5NcdTMvVI32ruLHAyFe
```

## MMCA Gradio demo

[…]

When you launch the demo on a local machine, you might find there is no space for […]

```
python -m serve.web_server --base-model 'the mplug-owl checkpoint directory' --bf16
```

## Contact

If you have any questions about this work, please email Fuxiao Liu at [fl3es@umd.edu](mailto:fl3es@umd.edu).

## Citation

```
[…]
}
```

## Disclaimer

We developed this repository for RESEARCH purposes, so it may only be used for personal, research, or other non-commercial purposes.

Binary file removed: images/WechatIMG6440.jpg

File renamed without changes

images/init: 1 change (0 additions, 1 deletion); this file was deleted

Binary file added: images/overview.png
