Under review.
<br>Please note that the Git log was intentionally omitted to keep the review process double-blind.
<br>We will redirect or update this repository after we receive the final decision.

<br>
<br>This repository provides the pre-processed dataset, source code, and pre-trained weights used in our experiments.

### The code, pre-trained weights, and full documentation will be released soon (within a few days).

## Preparation
```
/--|fairseq-semsim/
   |datasets/
   |README.md
```

<br>`/fairseq-semsim` : The code for our model, modified from the [fairseq (v 0.8.0 : 534905)](https://github.com/pytorch/fairseq/tree/5349052aae4ec1350822c894fbb6be350dff61a0) and [Rewarder](https://github.com/yg211/summary-reward-no-reference) repositories.
<br>`/datasets` (upcoming) : Our version of the pre-processed CNN/DM dataset and the pre-processing code, modified from [PGN by See et al.](https://github.com/abisee/cnn-dailymail) following the instructions in [BART (issue #1391)](https://github.com/pytorch/fairseq/issues/1391).
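
Until `/datasets` is released, the binarized data can be prepared by hand. The block below is a minimal sketch of the standard fairseq/BART recipe referenced in issue #1391, not this repository's own pre-processing script; the installation step, the `cnn_dm/` input paths, the worker counts, and the `cnn_dm-bin` output directory are assumptions chosen to line up with the fine-tuning command further down.
```
# Minimal sketch (assumed paths), following the fairseq BART summarization instructions.
# Expects cnn_dm/{train,val}.{source,target} produced by the See et al. pipeline,
# and assumes the modified fairseq installs like upstream fairseq (PyTorch already installed).
cd fairseq-semsim
pip install --editable .

# GPT-2 BPE vocabulary files used by BART
wget -N 'https://dl.fbaipublicfiles.com/fairseq/gpt2_bpe/encoder.json'
wget -N 'https://dl.fbaipublicfiles.com/fairseq/gpt2_bpe/vocab.bpe'
wget -N 'https://dl.fbaipublicfiles.com/fairseq/gpt2_bpe/dict.txt'

# BPE-encode every raw split
for SPLIT in train val; do
  for LANG in source target; do
    python -m examples.roberta.multiprocessing_bpe_encoder \
      --encoder-json encoder.json \
      --vocab-bpe vocab.bpe \
      --inputs "cnn_dm/$SPLIT.$LANG" \
      --outputs "cnn_dm/$SPLIT.bpe.$LANG" \
      --workers 16 \
      --keep-empty
  done
done

# Binarize into cnn_dm-bin, the data directory expected by train.py below
fairseq-preprocess \
  --source-lang source --target-lang target \
  --trainpref cnn_dm/train.bpe \
  --validpref cnn_dm/val.bpe \
  --destdir cnn_dm-bin \
  --srcdict dict.txt --tgtdict dict.txt \
  --workers 16
```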

You can download our pre-trained weights from [here]() (to be released soon, within a few days).
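
If you instead plan to fine-tune from the original BART checkpoint (next section), you will also need `bart.large.cnn`. The layout below is only an assumption chosen to match the `BART_PATH` used in the fine-tuning command; the archive URL is the one listed on the fairseq BART page.
```
# Assumed location for the pre-trained BART checkpoint (matches BART_PATH below)
mkdir -p /pretrained/BART
wget https://dl.fbaipublicfiles.com/fairseq/models/bart.large.cnn.tar.gz
tar -xzvf bart.large.cnn.tar.gz -C /pretrained/BART
# -> /pretrained/BART/bart.large.cnn/model.pt
```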


## Fine-tuning the model
If you wish to fine-tune the model from the [BART checkpoint](https://github.com/pytorch/fairseq/tree/master/examples/bart), please first follow the instructions in `/fairseq-semsim` and then use the following command.
```
BART_PATH=/pretrained/BART/bart.large.cnn/model.pt
TOTAL_NUM_UPDATES=50000
WARMUP_UPDATES=500
LR=3e-05
MAX_TOKENS=1792
UPDATE_FREQ=32
python train.py cnn_dm-bin \
    --restore-file $BART_PATH \
    --max-tokens $MAX_TOKENS \
    --task translation \
    --source-lang source --target-lang target \
    --layernorm-embedding \
    --share-all-embeddings \
    --share-decoder-input-output-embed \
    --reset-optimizer --reset-dataloader --reset-meters \
    --required-batch-size-multiple 1 \
    --arch bart_large \
    --criterion semantic_similarity_loss \
    --label-smoothing 0.1 \
    --dropout 0.1 --attention-dropout 0.1 \
    --weight-decay 0.01 --optimizer adam --adam-betas "(0.9, 0.999)" --adam-eps 1e-08 \
    --clip-norm 0.1 \
    --lr-scheduler polynomial_decay --lr $LR --total-num-update $TOTAL_NUM_UPDATES --warmup-updates $WARMUP_UPDATES \
    --update-freq $UPDATE_FREQ \
    --skip-invalid-size-inputs-valid-test \
    --save-dir checkpoints/semsim \
    --find-unused-parameters;
```
We followed most of the default settings of BART but removed a few options such as `--truncate-source` and `--fp16`.
The large `--update-freq` value accumulates gradients over 32 batches, which emulates a larger effective batch size on a single GPU.
We used one NVIDIA TITAN RTX GPU with 24 GB of memory; a single epoch took about 7-9 hours.
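
Once fine-tuning finishes, the checkpoint saved under `checkpoints/semsim` can be loaded through the standard fairseq BART API for generation. The snippet below is only a usage sketch based on the upstream BART summarization example, not part of this repository's documentation; `checkpoint_best.pt`, the decoding hyper-parameters, and the `cnn_dm/test.source` path are assumptions.
```
# Usage sketch (assumed): summarize one article with the fine-tuned checkpoint,
# using the fairseq BART API and the CNN/DM decoding settings from the upstream example.
python - <<'EOF'
from fairseq.models.bart import BARTModel

# Load the fine-tuned weights and the binarized data directory built above
bart = BARTModel.from_pretrained(
    'checkpoints/semsim',
    checkpoint_file='checkpoint_best.pt',
    data_name_or_path='cnn_dm-bin',
)
bart.eval()

# Read a single raw source article (assumed path from the pre-processing step)
with open('cnn_dm/test.source') as f:
    article = [f.readline().strip()]

# Beam search with the CNN/DM decoding settings from the upstream BART example
print(bart.sample(article, beam=4, lenpen=2.0,
                  max_len_b=140, min_len=55,
                  no_repeat_ngram_size=3)[0])
EOF
```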
