
【NeurIPS 2024】Automated Multi-level Preference for MLLMs


News

  • Our AMP has been accepted to NeurIPS 2024 as a poster presentation!
  • [2024/05/29] We release AMP on arXiv! Our code, MRHal-Bench, and models are now open source!

Overview

We present the Automated Multi-level Preference (AMP) framework for Reinforcement Learning from Human Feedback (RLHF). AMP generates a high-quality multi-level preference dataset without any human or AI annotators and trains with a multi-level DPO (MDPO) algorithm. AMP achieves state-of-the-art performance across multiple hallucination benchmarks, including MMHal-Bench, MRHal-Bench, LLaVA-Bench, and POPE.

[Figure: Pipeline for constructing the human-free multi-level preference dataset]
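At the core of training is the multi-level DPO (MDPO) objective, which extends the pairwise DPO loss to a ranking over several preference levels. Below is a minimal sketch of that idea, assuming the loss simply sums a standard DPO term over every ordered pair of levels; the exact formulation and weighting in the paper may differ.

import torch.nn.functional as F

def mdpo_loss(policy_logps, ref_logps, beta=0.1):
    """Multi-level DPO sketch.

    policy_logps / ref_logps: (num_levels,) tensors of response log-probs
    under the policy and the frozen reference model, ordered from the best
    response (index 0) to the worst.
    """
    # Implicit per-response reward: beta * log(pi / pi_ref).
    rewards = beta * (policy_logps - ref_logps)
    num_levels = rewards.shape[0]
    loss = rewards.new_zeros(())
    for w in range(num_levels):
        for l in range(w + 1, num_levels):
            # Standard DPO term: prefer level w over the lower-ranked level l.
            loss = loss - F.logsigmoid(rewards[w] - rewards[l])
    return loss / (num_levels * (num_levels - 1) / 2)  # average over pairs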

Prepare

  1. Install the required packages.
conda create -n amp python=3.10 -y
conda activate amp
pip install --upgrade pip
pip install -r requirements.txt
  2. Download a base model (a programmatic download sketch follows this list):

    llava-7b-base

    llava-13b-base
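
If the checkpoints are hosted on the Hugging Face Hub, they can also be fetched programmatically. A sketch using huggingface_hub; the repo IDs below are placeholders, so substitute the actual IDs behind the links above.

from huggingface_hub import snapshot_download

# Placeholder repo IDs; use the actual IDs from the links above.
for repo_id in ("ORG/llava-7b-base", "ORG/llava-13b-base"):
    snapshot_download(repo_id=repo_id,
                      local_dir=f"checkpoints/{repo_id.split('/')[-1]}")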

Train

  1. Prepare data from [RLHF-V], [SILKIE], and [ShareGPT4V] (an illustrative sample record follows these steps).

  2. Download the data from this link.

  3. Run the training script:

sh scripts/13b-v1.5/train_dpo.sh    # 13B
sh scripts/7b-v1.5/train_dpo.sh     # 7B
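
Conceptually, each training sample pairs one prompt with several responses ranked by quality, which is what the multi-level objective consumes. The record below is purely illustrative: the field names are hypothetical, and the actual schema is defined by the downloaded data.

# Illustrative only: field names are hypothetical, not the dataset's schema.
sample = {
    "image": "images/0001.jpg",
    "prompt": "Describe the image in detail.",
    "responses": [  # ordered best -> worst, one entry per preference level
        "A living room with a television and a gray sofa.",  # most faithful
        "A room with a TV and some furniture.",              # less detailed
        "A kitchen with a refrigerator and an oven.",        # hallucinated
    ],
}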

Evaluation

MMHal-Bench

  1. Download data from [MMHal-Bench].
  2. Run the script
sh eval/eval_scripts/eval_mmhal.sh

MRHal-Bench

  1. Download data from [MRHal-Bench].
  2. Run the script
sh eval/eval_scripts/eval_mrhal.sh

LLaVA-Bench

  1. Download data from [LLaVA-Bench] and [COCO] images.
  2. Run the script
sh eval/eval_scripts/eval_llavab.sh

POPE

  1. Download data from [POPE] and [COCO] images.
  2. Run the script (a scoring sketch follows):
sh eval/eval_scripts/eval_pope.sh
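
POPE asks binary yes/no questions about object presence, so its headline numbers are plain classification metrics. A minimal, self-contained sketch of the usual scoring, independent of the repo's own eval code:

def pope_metrics(preds, labels):
    """preds/labels: lists of booleans, True meaning the answer 'yes'."""
    tp = sum(p and l for p, l in zip(preds, labels))
    fp = sum(p and not l for p, l in zip(preds, labels))
    fn = sum(l and not p for p, l in zip(preds, labels))
    tn = len(labels) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(labels),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "yes_ratio": (tp + fp) / len(labels),  # fraction answered 'yes'
    }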

Model Zoo

You can also use our trained models for evaluation. We provide the LoRA adapter for each version; a loading sketch follows the table.

| Size | Dataset | Link    |
|------|---------|---------|
| 7B   | MEG     | MEG-7B  |
| 7B   | IG      | IG-7B   |
| 13B  | MEG     | MEG-13B |
| 13B  | IG      | IG-13B  |
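
A sketch of loading one of the adapters with PEFT, assuming a causal-LM-compatible base checkpoint; the repo's own LLaVA loading utilities may differ, and the paths below are placeholders for the links in the table.

from peft import PeftModel
from transformers import AutoModelForCausalLM

# Placeholder paths; use the base model from Prepare and an adapter from the table.
base = AutoModelForCausalLM.from_pretrained("checkpoints/llava-7b-base")
model = PeftModel.from_pretrained(base, "checkpoints/MEG-7B-lora")
model = model.merge_and_unload()  # optionally fold the LoRA weights into the base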

Dialogue Example

We provide several dialogue examples, with additional results available in the paper.

[Figure: dialogue example]

Citation

If you find this repository useful, please consider starring 🌟 the repo and citing 🖇️ our paper.

@article{zhang2024amp,
      title={Automated Multi-level Preference for MLLMs}, 
      author={Zhang, Mengxi and Wu, Wenhao and Yu, Lu and Song, Yuxin and Rong, Kang and Yao, Huanjin and Zhang, Jianbo and Liu, Fanglong and Feng, Haocheng and Sun, Yifan and Wang, Jingdong},
      journal={Advances in Neural Information Processing Systems},
      year={2024}
}

Thanks

Our code is partly based on [LLaVA], [LLaVA-RLHF], and [TRL]. Thanks for their excellent work!
