MMCBench: Benchmarking Large Multimodal Models against Common Corruptions 🚀

Code for the paper "Benchmarking Large Multimodal Models against Common Corruptions".

Overview

MMCBench is a comprehensive benchmarking framework for evaluating the robustness and self-consistency of Large Multimodal Models (LMMs) under common corruptions. It focuses on cross-modal interactions among text, image, and speech, covering four generative tasks: text-to-image, image-to-text, text-to-speech, and speech-to-text. The benchmark uses a novel methodology for selecting representative examples from large datasets and applies a consistent set of metrics across all four cross-modal tasks.

Benchmarking Process 📈

The selection and evaluation process for cross-modality consistency in MMCBench involves two main steps:

Selection Process 🕵️‍♂️: Similarity is measured in the text modality. Non-text inputs (images and speech) are first converted to text through model-generated captions or transcriptions, while text inputs are compared directly before and after corruption.
Evaluation Process 📝: Self-consistency is measured in two ways: by comparing each clean input with the output generated from its corrupted version, and by comparing the outputs generated from the clean and the corrupted inputs against each other.
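
The two comparisons in the evaluation step can be sketched as follows. This is a minimal illustration, not the repository's implementation: it assumes that inputs and outputs have already been mapped into a shared embedding space by some encoder, and it assumes cosine similarity as the consistency measure. The function names and the toy vectors are hypothetical.

```python
import math
from typing import Dict, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero embedding as having no similarity
    return dot / (norm_a * norm_b)


def self_consistency(
    clean_input_emb: Sequence[float],
    clean_output_emb: Sequence[float],
    corrupted_output_emb: Sequence[float],
) -> Dict[str, float]:
    """Return both comparisons for one example (assumed metric: cosine)."""
    return {
        # clean input vs. output generated from the corrupted input
        "cross_modality": cosine_similarity(clean_input_emb, corrupted_output_emb),
        # output from the clean input vs. output from the corrupted input
        "output_only": cosine_similarity(clean_output_emb, corrupted_output_emb),
    }


# Toy embeddings standing in for real encoder outputs (hypothetical values).
scores = self_consistency(
    clean_input_emb=[1.0, 0.0],
    clean_output_emb=[1.0, 1.0],
    corrupted_output_emb=[1.0, 1.0],
)
print(scores)
```

Keeping the two scores separate shows whether a corruption moved the output away from the original input, away from the clean output, or both.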

Overview of the Selection and Evaluation Process 📌

Model Resilience Analysis 🛡️

We present radar charts depicting the relative consistency scores of selected models for various corruptions across four cross-modality tasks: text-to-image 🎨, image-to-text 📜, text-to-speech 🗣️, and speech-to-text 📝. The scores are normalized with the highest scoring model set as the baseline for each type of corruption, allowing for a comparative analysis of each model's resilience.
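
The normalization behind the radar charts can be sketched as below. This is an assumption-laden illustration: it treats "set as the baseline" as dividing each model's score by the best score on the same corruption, so the top model maps to 1.0. The function name, model names, corruption names, and scores are all hypothetical.

```python
from typing import Dict

Scores = Dict[str, Dict[str, float]]  # scores[model][corruption] -> raw score


def relative_scores(scores: Scores) -> Scores:
    """Divide each score by the best score on the same corruption."""
    corruptions = {c for per_model in scores.values() for c in per_model}
    best = {
        c: max(per_model[c] for per_model in scores.values() if c in per_model)
        for c in corruptions
    }
    return {
        model: {
            # guard against a zero baseline to avoid division by zero
            c: (s / best[c] if best[c] > 0 else 0.0)
            for c, s in per_model.items()
        }
        for model, per_model in scores.items()
    }


# Made-up raw consistency scores for two placeholder models.
raw = {
    "model_a": {"blur": 0.8, "noise": 0.5},
    "model_b": {"blur": 0.4, "noise": 1.0},
}
rel = relative_scores(raw)
print(rel)
```

Because normalization is done per corruption, the resulting values compare models with each other on that corruption and say nothing about absolute quality.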

Radar Charts of Model Consistency Scores 🎯

Repository Structure 📂

MMCBench/
- image2text/: Image-to-Text generation tasks.
- speech2text/: Speech-to-Text generation tasks.
- text2image/: Text-to-Image generation tasks.
- text2speech/: Text-to-Speech generation tasks.

Environment Setup 🌐

To set up the environment for running MMCBench, we recommend using Conda to manage packages and dependencies. Follow these steps to create and activate a Conda environment:

Create a Conda Environment: Open your terminal and run the following command to create a new environment named mmcbench with Python 3.9:
```
conda create -n mmcbench python=3.9
```
Activate the Environment: Activate the newly created environment:
```
conda activate mmcbench
```
Install Required Packages: Install all necessary packages using the requirements.txt file included in the repository:
```
pip install -r requirements.txt
```

Getting Started 🚦

To begin using MMCBench, clone this repository and follow the setup instructions in each module. Detailed documentation for each step of the benchmarking process is provided. All the related corrupted data is available on Hugging Face.

Contributions 👐

MMCBench is an open-source project, and contributions are welcome. If you wish to contribute, please submit a pull request or open an issue to discuss your proposed changes.

License 📄

This project is licensed under the Apache 2.0 License - see the LICENSE file for details.

Acknowledgments 🎉

We thank all contributors and participants who have made MMCBench a comprehensive benchmark for evaluating large multimodal models.
