Pruning Small Pre-Trained Weights Irreversibly and Monotonically Impairs "Difficult" Downstream Tasks in LLMs [ICML 2024]
Official PyTorch implementation of Pruning Small Pre-Trained Weights Irreversibly and Monotonically Impairs "Difficult" Downstream Tasks in LLMs.
Lu Yin*, Ajay Jaiswal*, Shiwei Liu, Souvik Kundu, Zhangyang Wang
University of Texas at Austin, University of Surrey, Eindhoven University of Technology, University of Oxford, Intel Labs
For questions about the code, please contact l.yin@surrey.ac.uk.
We present the official code for our study showing that pruning small-magnitude pre-trained weights irreversibly and monotonically impairs "difficult" downstream tasks in LLMs.
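As a rough illustration of the operation this study is about (not the repository's exact pruning code), one-shot magnitude pruning simply zeros the smallest-magnitude fraction of each weight matrix. The helper below and its name are ours:

```python
import torch
from transformers import AutoModel

def prune_smallest_weights(model: torch.nn.Module, sparsity: float) -> None:
    """Illustrative one-shot, layer-wise magnitude pruning: zero the
    smallest-magnitude `sparsity` fraction of weights in every Linear layer."""
    with torch.no_grad():
        for module in model.modules():
            if isinstance(module, torch.nn.Linear):
                w = module.weight
                k = int(w.numel() * sparsity)   # number of weights to prune in this layer
                if k == 0:
                    continue
                threshold = w.abs().flatten().kthvalue(k).values
                w.mul_((w.abs() > threshold).to(w.dtype))   # keep only weights above the threshold

model = AutoModel.from_pretrained("roberta-large")
prune_smallest_weights(model, sparsity=0.5)   # drop the smallest 50% of weights per layer
```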
Please check INSTALL.md for installation instructions.
We provide a quick overview of the arguments:

- `--model_name_or_path`: the identifier of the model on the Hugging Face model hub.
- `--task_name`: the name of the task to fine-tune on.
- `--sparsity`: the fraction of weights to be pruned.
- `--sparse_init`: the type of sparsity, either `sparse_nm` or `sparse_unstructured` (see the sketch after this list).
- `--method`: a tag used to name the output directory.
- To choose the transfer setting, vary the sparse-method flag:
  - `--freeze_weights`: Sparse Transfer
  - `--freeze_weights_frompretrain`: Dense Transfer with Freezing
  - leave it empty: Sparse-to-Dense Transfer
```bash
cd ./GLUE_tasks
for seed in 41 42 43
do
    for TASK_NAME in QNLI
    do
        for sparsity in 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
        do
            for validation_split_percentage in 100
            do
                python Glue_prune_oneshot.py \
                    --method Glue_noembed_freeze_weights \
                    --validation_split_percentage $validation_split_percentage \
                    --freeze_weights \
                    --noembed \
                    --sparsity $sparsity \
                    --model_name_or_path roberta-large \
                    --task_name $TASK_NAME \
                    --max_length 512 \
                    --per_device_train_batch_size 16 \
                    --learning_rate 2e-5 \
                    --num_train_epochs 3 \
                    --seed $seed \
                    --output_dir ./roberta/Glue_noembed_freeze_weights/$TASK_NAME/$sparsity/$validation_split_percentage/$seed/
            done
        done
    done
done
```
TO BE RELEASED SOON
Task Difficulty Setting: Estimating LLM-Facing Task Difficulty by Normalized Human-LLM Performance Gap
TO BE RELEASED SOON
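Purely as an illustration of the section title, one possible form of a normalized human-LLM performance gap is sketched below; the exact definition used in the paper may differ, so treat this function as a placeholder.

```python
def normalized_gap_difficulty(human_score: float, llm_score: float) -> float:
    """Illustrative only: the gap between human and LLM performance on a task,
    normalized by human performance. Larger values = "harder" task for the LLM."""
    return (human_score - llm_score) / human_score
```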
If you find this repo helpful, please cite our work:
```bibtex
@article{yin2024junk,
  title={Pruning Small Pre-Trained Weights Irreversibly and Monotonically Impairs "Difficult" Downstream Tasks in LLMs},
  author={Yin, Lu and Jaiswal, Ajay and Liu, Shiwei and Kundu, Souvik and Wang, Zhangyang},
  journal={arXiv preprint arXiv:2310.02277v2},
  year={2024}
}
```