Official Implementation of the paper: InstrAug: Automatic Instruction Augmentation for Multimodal Instruction Fine-tuning
InstrAug is a framework for instruction augmentation. It can expand an existing small instruction set to one up to 30x larger. The whole InstrAug pipeline includes (as illustrated in the figure below):
- Meta-prompt Generation
- Augmented Instruction Generation and Rule-based Filtering
  - Multi-temp sampling ($\rm MIns+_{\rm MT}$)
  - Iterative rephrasing ($\rm MIns+_{\rm Iter}$)
- Instruction-following Dataset Construction
We apply InstrAug to Multimodal Instruction Fine-tuning (MIFT) benchmarks and test on 12 downstream tasks from MultiInstruct and InstructBLIP-Bench, as well as the whole MMMU benchmark. The results show that models fine-tuned on the instruction-augmented dataset (59K) are competitive with, or even exceed, those trained on non-augmented but larger datasets (564K).
The file structure of this repository is shown below; only important folders/files are listed:
.
├── IBLIP                    # Implementation code on InstructBLIP
├── OFA                      # Implementation code on OFA
├── MultiInstruct            # Code to create MINS+
│   ├── llama                # Code to generate augmented instructions using LLaMA
│   ├── mminstr_dataset      # Folder to store the MINS and MINS+ datasets
│   └── instruction_data     # Folder to store the original and generated instruction sets
├── LICENSE
└── README.md
Please refer to the README.md under each folder for more details.
Please cite our paper if you find this work useful for your research and applications:
@misc{han2024robust,
      title={Towards Robust Instruction Tuning on Multimodal Large Language Models},
      author={Wei Han and Hui Chen and Soujanya Poria},
      year={2024},
      eprint={2402.14492},
      archivePrefix={arXiv},
}