GLUE tasks

Here are the corresponding GLUE scores on BiGS.

Experiments

GLUE is made up of a total of 9 different tasks. Here is how to run the script on one of them:

We finetune BiGS on TPU-v3 with 8 cores. Since the batch size per device is 2, the total number of batch size is 16.

export TASK_NAME=cola

python run_glue.py \
    --model_name_or_path JunxiongWang/BiGS_512 \
    --task_name $TASK_NAME \
    --max_seq_length 512 \
    --learning_rate 2e-5 \
    --num_train_epochs 3 \
    --per_device_train_batch_size 2 \
    --logging_steps 100 \
    --eval_steps 500 \
    --weight_decay 0.01 \
    --output_dir BiGS_$TASK_NAME/

Those give us the following result

Task	Metric	Result
CoLA	Matthews corr	67.9
SST-2	Accuracy	93.8
QQP	Accuracy/F1	91.4/88.4
MNLI	Matched acc./Mismatched acc.	86.5
QNLI	Accuracy	91.6

For MRPC, STS-B and RTE, we finetune on the MNLI model

export TASK_NAME=mrpc

python run_glue.py \
    --model_name_or_path JunxiongWang/BiGS_512_MNLI \
    --task_name $TASK_NAME \
    --max_seq_length 512 \
    --learning_rate 2e-5 \
    --num_train_epochs 3 \
    --per_device_train_batch_size 2 \
    --logging_steps 100 \
    --eval_steps 500 \
    --weight_decay 0.01 \
    --output_dir BiGS_$TASK_NAME/

Task	Metric	Result
MRPC	F1/Accuracy	88.3/83.3
STS-B	Pearson/Spearman corr.	89.9/89.7
RTE	Accuracy	79.8

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

GLUE.md

GLUE.md

GLUE tasks

Experiments

Files

GLUE.md

Latest commit

History

GLUE.md

File metadata and controls

GLUE tasks

Experiments