
MM_Robustness

Journal of Data-centric Machine Learning Research (DMLR)

More details can be found on the project webpage.

This repository contains the code for generating multimodal robustness evaluation datasets for downstream image-text applications, including image-text retrieval, visual reasoning, visual entailment, image captioning, and text-to-image generation.

Citation

If you find our code or models helpful in your research, please cite our paper:

@article{Qiu2022BenchmarkingRO,
  title={Benchmarking Robustness of Multimodal Image-Text Models under Distribution Shift},
  author={Jielin Qiu and Yi Zhu and Xingjian Shi and F. Wenzel and Zhiqiang Tang and Ding Zhao and Bo Li and Mu Li},
  journal={Journal of Data-centric Machine Learning Research (DMLR)},
  year={2024}
}

Installation

./install.sh

Datasets

Generate perturbation datasets
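
To illustrate the kind of image corruption such a perturbation dataset applies, here is a minimal sketch of a Gaussian-noise perturbation at several severity levels, assuming NumPy-array images. This is a generic example, not the repository's actual implementation; the function name and severity values are hypothetical.

```python
# Illustrative image perturbation (Gaussian noise) in the spirit of
# distribution-shift corruption benchmarks. Hypothetical helper, not the
# repository's actual code.
import numpy as np

def gaussian_noise(image: np.ndarray, severity: int = 1) -> np.ndarray:
    """Add zero-mean Gaussian noise; higher severity uses a larger std."""
    sigma = [0.04, 0.06, 0.08, 0.09, 0.10][severity - 1]  # example values
    noisy = image.astype(np.float64) / 255.0
    noisy = noisy + np.random.normal(scale=sigma, size=noisy.shape)
    return (np.clip(noisy, 0.0, 1.0) * 255.0).astype(np.uint8)

# Apply to a dummy 64x64 RGB image
img = np.zeros((64, 64, 3), dtype=np.uint8)
perturbed = gaussian_noise(img, severity=3)
print(perturbed.shape, perturbed.dtype)
```

A full pipeline would loop such a function over every image in the source dataset (e.g. COCO) at each severity level and save the perturbed copies alongside the original annotations.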

Evaluation data for text-to-image generation

For the text-to-image generation evaluation, we used captions from COCO as prompts to generate the corresponding images. We also share the generated images here.
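
On the text side, prompt-level robustness is typically probed by lightly corrupting the caption before it is fed to the generator. A minimal sketch of one common text perturbation (adjacent-character swaps) is below; the function name, rate, and seed are hypothetical, and this is not the repository's actual code.

```python
# Illustrative caption perturbation: swap adjacent alphabetic characters at a
# given rate, a common character-level text corruption. Hypothetical helper,
# not the repository's actual code.
import random

def char_swap(caption: str, rate: float = 0.1, seed: int = 0) -> str:
    """Swap adjacent alphabetic characters at roughly the given rate."""
    rng = random.Random(seed)  # fixed seed for reproducible perturbations
    chars = list(caption)
    i = 0
    while i < len(chars) - 1:
        if chars[i].isalpha() and chars[i + 1].isalpha() and rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # do not re-swap a character we just moved
        else:
            i += 1
    return "".join(chars)

prompt = "A man riding a wave on top of a surfboard."
print(char_swap(prompt, rate=0.3))
```

The perturbed caption keeps exactly the same characters (only their order changes), so any drop in generation quality can be attributed to the distribution shift in the prompt rather than to missing content words.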

Baselines

For the evaluated baselines, please see evaluated_baselines.

Security

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.

