This repository contains the code for the paper *Random Masking Finds Winning Tickets for Parameter Efficient Fine-tuning*. The experiments on SuperGLUE are based on the code of MeZO.
The implementation of Random Masking can be found in `PEFT/random_masking.py`. It uses the spops library to perform the sparse matrix operations, so install spops before running it. We also provide a naive version of Random Masking in `PEFT/random_masking_naive.py` for illustration, which directly stores the mask and the tunable parameters in dense matrices.
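For intuition, here is a minimal sketch of the naive dense-mask idea in PyTorch. This is our own illustration, not the repo's actual API: the class name `MaskedLinear` and its interface are hypothetical. A fixed random binary mask decides which entries of a trainable delta matrix can change, while the pretrained weight stays frozen.

```python
import torch
import torch.nn as nn

class MaskedLinear(nn.Module):
    """Illustrative dense-mask version of Random Masking (hypothetical name,
    not the repo's API). A fixed random binary mask selects which entries of
    a delta matrix are trainable; the pretrained weight stays frozen."""

    def __init__(self, base: nn.Linear, masking_prob: float = 0.9999):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained weights
        # Fixed binary mask: each entry is tunable with prob (1 - masking_prob).
        mask = (torch.rand_like(base.weight) >= masking_prob).float()
        self.register_buffer("mask", mask)
        # Dense delta matrix; gradients reach only the unmasked entries.
        self.delta = nn.Parameter(torch.zeros_like(base.weight))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Effective weight = frozen pretrained weight + masked trainable delta.
        weight = self.base.weight + self.mask * self.delta
        return nn.functional.linear(x, weight, self.base.bias)
```

The efficient implementation in `PEFT/random_masking.py` presumably avoids materializing the dense mask and delta, storing only the tunable entries and updating them through spops sparse kernels.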
Run Random Masking on the RTE dataset with OPT-1.3b, a tunable parameter ratio of 0.01% (masking probability 0.9999), and learning rate 1e-2:
```bash
MODEL=facebook/opt-1.3b TASK=RTE EPOCH=5 MODE=random_masking LR=1e-2 MASKING_PROB=0.9999 LOCAL_HOST=0 SEED=0 bash run.sh
```
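Note that the masking probability and the tunable parameter ratio are complementary: a fraction 1 − MASKING_PROB of the weights remains trainable. A quick back-of-envelope check (our own illustration, not repo code; the 1.3B parameter count is approximate):

```python
# With masking probability p, a fraction (1 - p) of the weights stays trainable.
masking_prob = 0.9999
total_params = 1.3e9  # rough parameter count of OPT-1.3b (an assumption)
tunable = (1 - masking_prob) * total_params
print(f"tunable fraction: {1 - masking_prob:.2%}")    # -> 0.01%
print(f"approx. tunable parameters: {tunable:,.0f}")  # -> ~130,000
```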
Figure: average performance of PEFT methods across varying numbers of trainable parameters. Despite its simple design, Random Masking achieves competitive performance with fewer trainable parameters.