Official PyTorch code for the COLING 2022 paper "Token and Head Adaptive Transformers for Efficient Natural Language Processing".
The code is based on HuggingFace's (🤗) Transformers library and requires the following:
- Python 3.7.3
- PyTorch 1.8.1
- 🤗 Transformers
- torchprofile
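The dependencies can be installed from PyPI; a minimal sketch, assuming the package names as published there (pin versions to match your CUDA setup as needed):

```bash
# Install the dependencies listed above (PyPI package names assumed)
pip install torch==1.8.1 transformers torchprofile
```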
Prepare the GLUE datasets with download_glue_data.py.
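A minimal sketch, assuming the script follows the commonly used download_glue_data.py interface (--data_dir and --tasks), and using MRPC and bert-base-uncased as hypothetical examples for the $TASK_NAME and $MODEL_NAME variables referenced in the commands below:

```bash
# Download all GLUE tasks into the glue/ directory
# (--data_dir/--tasks assumed from the common download_glue_data.py script)
python download_glue_data.py --data_dir glue --tasks all

# Example values for the variables used in the commands below (hypothetical choices)
export TASK_NAME=MRPC
export MODEL_NAME=bert-base-uncased
```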
Starting from a checkpoint already fine-tuned on a downstream task, continue fine-tuning with Token and Head Drop:
```bash
python run_glue.py \
  --model_name_or_path glue_output/$TASK_NAME/$MODEL_NAME/standard/checkpoint-best \
  --task_name $TASK_NAME \
  --do_train \
  --do_eval \
  --data_dir glue/$TASK_NAME \
  --max_seq_length 128 \
  --per_device_train_batch_size 16 \
  --per_device_eval_batch_size 16 \
  --learning_rate 2e-5 \
  --num_train_epochs 5.0 \
  --output_dir glue_output/$TASK_NAME/$MODEL_NAME/joint_adaptive \
  --length_adaptive \
  --num_sandwich 2 \
  --length_drop_ratio_bound 0.2 \
  --layer_dropout_prob 0.2
```
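The command above expects a standard fine-tuned checkpoint at glue_output/$TASK_NAME/$MODEL_NAME/standard/checkpoint-best. A sketch of producing one, assuming run_glue.py also supports plain fine-tuning with the same arguments (the adaptive flags are simply omitted):

```bash
# Standard fine-tuning on the downstream task (sketch; flags mirror the
# adaptive-training command above, minus the Token and Head Drop options)
python run_glue.py \
  --model_name_or_path $MODEL_NAME \
  --task_name $TASK_NAME \
  --do_train \
  --do_eval \
  --data_dir glue/$TASK_NAME \
  --max_seq_length 128 \
  --per_device_train_batch_size 16 \
  --per_device_eval_batch_size 16 \
  --learning_rate 2e-5 \
  --num_train_epochs 5.0 \
  --output_dir glue_output/$TASK_NAME/$MODEL_NAME/standard
```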
After training a Token and Head Adaptive Transformer, run an evolutionary search to find configurations with optimal accuracy-efficiency trade-offs:
```bash
python run_glue.py \
  --model_name_or_path glue_output/$TASK_NAME/$MODEL_NAME/joint_adaptive/checkpoint-best \
  --task_name $TASK_NAME \
  --do_search \
  --do_eval \
  --data_dir glue/$TASK_NAME \
  --max_seq_length 128 \
  --per_device_eval_batch_size 16 \
  --output_dir glue_output/$TASK_NAME/$MODEL_NAME/evolutionary_search_joint \
  --evo_iter 30 \
  --mutation_size 30 \
  --crossover_size 30
```