Chip-Tuning

Chip-Tuning is part of the QQ MLLM project. This is the repository for the paper "Chip-Tuning: Classify Before Language Models Say", containing the accompanying code and scripts.

Setup

To use and evaluate chip-tuning, first install the dependencies:

pip install -r requirements.txt

Then download the related datasets and models by executing the Python scripts in ./transformer_chips/download. For example, to download the MMLU dataset:

python ./transformer_chips/download/dataset/download_MMLU.py

This downloads the MMLU dataset and saves it to ./data/datasets.

You can also manually download the resources, then put datasets under ./data/datasets and models under ./model.

Training & Evaluation

The scripts for experiments in the paper are stored in ./scripts:

  • ./scripts/benchmark: the main experiments (Section 4.2);
  • ./scripts/benchmark_mllm: the multimodal model experiments (Section 4.3);
  • ./scripts/benchmark_lora & ./scripts/benchmark_lora_chips: combination with LoRA finetuning (Section 4.4);
  • ./scripts/benchmark_validation: the chip selection strategy (Section 5.2);
  • ./scripts/benchmark/data_influence: the impact of training dataset scale (Section 5.3);
  • ./scripts/benchmark_llama3: the Llama3 experiments (Appendix E).

See the README.md files under these folders for more details.

We also provide examples of evaluating trained chips alone on a given benchmark, and of making predictions with selected chips; see ./scripts/benchmark/misc for examples.

How to Add New Custom Tasks

You can add your own custom tasks in two steps:

  • Define your classification task following the format in ./data/yaml;
  • Implement your own recipe and data processing function in recipe.py and data.py (a hedged sketch follows this list).
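
For illustration, a data processing function in data.py might look like the following minimal sketch; the function name, signature, and return format here are assumptions, so match them against the existing functions in ./transformer_chips/data.py.

# Hypothetical example: the exact interface expected by the trainer may differ.
def process_my_task(example):
    # Map one raw dataset example to the text the model reads and the
    # classification label a chip should predict.
    return {
        "text": example["question"],  # assumed field name in the raw dataset
        "label": example["answer"],   # assumed field name in the raw dataset
    }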

You can also use the ChipTrainer class in ./transformer_chips/chip_trainer.py directly by passing your own task_dict, train_dataset and eval_dataset, or use the ChipedTransformer class in ./transformer_chips/ChipedTransformer.py to build your own trainer.
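
For a rough picture of direct usage, here is a minimal sketch; the ChipTrainer constructor signature, the task_dict schema, and the train()/evaluate() method names are assumptions based on the description above, not the repository's exact API.

from datasets import Dataset
from transformer_chips.chip_trainer import ChipTrainer

# Hypothetical task definition; the real schema is documented in ./data/yaml.
task_dict = {"task_name": "my_task", "labels": ["yes", "no"]}

# Toy datasets in Hugging Face datasets format (the expected input type is an assumption).
train_dataset = Dataset.from_dict({"text": ["Is 2 + 2 = 4?", "Is 2 + 2 = 5?"], "label": ["yes", "no"]})
eval_dataset = Dataset.from_dict({"text": ["Is 1 + 1 = 2?"], "label": ["yes"]})

trainer = ChipTrainer(task_dict=task_dict, train_dataset=train_dataset, eval_dataset=eval_dataset)
trainer.train()     # assumed method name: train chips (probing classifiers) on intermediate layers
trainer.evaluate()  # assumed method name: evaluate the trained chips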

Generation Chips (Unfinished)

Unlike classification chips (ChipedTransformer), generation chips (ChipedTransformerForGeneration) still require further exploration; using them may lead to unexpected results.
